    This series of posts lists some good articles on how to implement a neural network. Thanks to the authors for their excellent work.
    How to implement a neural network Part 1


    This page is part of a 5 (+2) part tutorial on how to implement a simple neural network model. You can find the links to the rest of the tutorial here:


    The tutorials are generated from Python 2 IPython Notebook files,

  • Good blogs to learn machine learning and data science

    • Occam’s Razor by Avinash Kaushik, examining web analytics and Digital Marketing.
    • OpenGardens, Data Science for Internet of Things (IoT), by Ajit Jaokar.
    • O’Reilly Radar, covering a wide range of research topics and books.
    • Observational Epidemiology A college professor and a statistical consultant offer their comments, observations and thoughts on applied statistics, higher education and epidemiology.
    • Overcoming Bias, by Robin Hanson and Eliezer Yudkowsky. Presents statistical analysis in reflections on honesty, signaling, disagreement, forecasting and the far future.
    • Probability &
  • A Collection of Parameter Server Resources


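    Introduction to the parameter server
    See Mu Li's paper "Parameter Server for Distributed Machine Learning", which includes an introduction to his framework.
    I later also read the Project Adam paper from Microsoft Research. Its overall approach is similar, but the paper goes into more detail, so I will also describe some of the complementary information from it.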


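    1. The parameters are very large and exceed what a single machine can hold (for example, large-scale logistic regression and neural networks).
    2. The training data is huge and needs distributed parallelism to speed up training (big data).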

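    As a result, you need to implement the distributed parallel program yourself. In fact, before Hadoop appeared, large-scale data processing always required writing your own distributed programs (MPI). Later, Google engineers summarized and abstracted this workflow into the MapReduce framework, which unified the field.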


    Parameter Server (Mli)



    The parallel computation mainly takes place on the compute nodes. As in MapReduce, when tasks are assigned, the data is split up among the worker nodes.


    Distribute training data -> node 1
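
    The data-splitting idea can be illustrated with a minimal single-process sketch. It is not the API of Mu Li's framework or of Project Adam: the names `shard`, `ParameterServer`, `worker_gradient` and `train` are hypothetical, the model is a toy 1-D linear fit, and the "workers" run one after another rather than on separate machines.

```python
from typing import List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pair for a toy 1-D model y ~ w * x


def shard(data: Sequence, num_workers: int) -> List[list]:
    """Split the training data round-robin, one shard per worker node."""
    return [list(data[i::num_workers]) for i in range(num_workers)]


class ParameterServer:
    """Holds the shared parameter and applies gradients pushed by workers."""

    def __init__(self, w: float = 0.0, lr: float = 0.05) -> None:
        self.w = w
        self.lr = lr

    def pull(self) -> float:
        # Workers fetch the current parameter value before computing.
        return self.w

    def push(self, grad: float) -> None:
        # Plain gradient descent step on the shared parameter.
        self.w -= self.lr * grad


def worker_gradient(w: float, shard_data: Sequence[Example]) -> float:
    """Gradient of the mean squared error on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard_data) / len(shard_data)


def train(data: Sequence[Example], num_workers: int = 2, steps: int = 200) -> float:
    server = ParameterServer()
    shards = shard(data, num_workers)
    for _ in range(steps):
        # Assumption: synchronous updates. Every worker pulls the same w,
        # computes a gradient on its own shard, and the server applies
        # the average of those gradients.
        grads = [worker_gradient(server.pull(), s) for s in shards]
        server.push(sum(grads) / len(grads))
    return server.w


if __name__ == "__main__":
    data = [(float(x), 3.0 * x) for x in range(1, 5)]
    print(shard(data, 2))
    print(train(data))
```

    Real parameter servers typically also partition the parameters across several server nodes and may apply updates asynchronously; the sketch leaves both out to keep the worker/server split visible.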




  • Popular Python libraries for Data Science and Machine Learning

    Python is almost a must-have skill for data scientists: many data scientist positions require Python programming skills. This post introduces some of the most popular Python modules for data science. They are widely used to conduct projects related to data mining and machine learning, as well as general data analysis.

    1. SciPy. SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering. It provides a wide range of algorithms and mathematical tools for data scientists.

    2. NumPy. NumPy is the fundamental package for scientific computing with Python. 

  • 8 skills you should learn to be a data scientist

    8 types of data science jobs with a breakdown of the 8 skills you need to get the job

    Data Scientists get assigned different names in different organizations. Contrary to popular belief, data science is not entirely about numbers, though it is a lot about them. A statistician, an astrologer, a survey designer, a biostatistician all play a data scientist’s role at some point without being known as one.

    There are a number of programming languages and software applications that support data analysis functions and they require different levels of programming skills. The following section explores different types of data scientists and corresponding functions performed by them:

    7 Types of Data Scientist
    1) Data Scientist as Statistician

    This is data analysis in the traditional sense.

  • Data Science and Machine Learning Interview Questions

    Here are some Data Science and Machine Learning interview questions asked by big companies such as Facebook, Amazon, Microsoft, Yelp, Pinterest, Square, Google, Glassdoor and Groupon. I have also posted an article that briefly describes the popular machine learning interview questions.

    1. You are given a coin and don’t know whether it is fair. You throw it 6 times and get 1 tail and 5 heads. Determine whether it is fair. What is your confidence value?

    2. Given Amazon data, how would you predict which users are going to be top shoppers this holiday season?
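
    For question 1, one possible framing (not necessarily the answer an interviewer expects) is an exact two-sided binomial test against the hypothesis that the coin is fair. The sketch below uses only the standard library; the helper names `binom_pmf` and `two_sided_p_value` are made up for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent throws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(heads: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of every outcome that is
    no more likely than the observed one, assuming P(head) = p."""
    observed = binom_pmf(heads, n, p)
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if binom_pmf(k, n, p) <= observed + 1e-12  # tolerance for float ties
    )


if __name__ == "__main__":
    # 5 heads and 1 tail in 6 throws
    print(two_sided_p_value(5, 6))  # 14/64 = 0.21875
```

    Under a fair coin, a result at least this lopsided occurs about 22% of the time, so at the usual 5% significance level six throws are not enough to reject fairness.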


  • Popular Machine Learning Interview Questions

    We list some popular questions related to machine learning. You should prepare for them if you are looking for a job as a Machine Learning Engineer, a Data Scientist, or a Research Scientist working on machine learning.

    I put the questions into three categories: Machine Learning Theories, Machine Learning Algorithms and Machine Learning Tools. 

    Machine Learning Theories

    When we talk about machine learning theories, we often refer to machine learning models such as Support Vector Machines, Decision Trees, Logistic Regression, Topic Models, Bayesian Networks and Deep Learning. 

    Here are some books that must be read:

