Machine Learning Crash Course
Whether you know it or not, machine learning has seeped into every aspect of our lives, and now is a critical time for everyone to gain a basic understanding of it. This article is designed to give you not only a macro perspective on how machine learning is applied, but also the commonly left-out micro perspective on how machine learning works.
What is the difference between Machine Learning, Deep Learning and AI?
All of these buzzwords are thrown around all the time and are often assumed to be identical in meaning. The reality is quite different.
This Venn Diagram shows that Deep Learning is a subset of Machine Learning which in turn is a subset of Artificial Intelligence.
Artificial intelligence is the much broader concept of machines having “intelligence”, as we humans perceive it. Machine Learning is programming an algorithm that adapts with experience. Deep Learning is a subset of Machine Learning which consists of multi-layered Neural Networks. The main focus of this article is Machine Learning, with a special emphasis on Deep Learning.
Machine Learning can be split into two main categories: Supervised and Unsupervised learning.
Supervised learning is when a model is trained on labelled data. This means every piece of input data comes paired with its “solution” (the label).
Supervised learning can be split into two types: Classification and Regression.
Classification is trying to identify the class that a piece of data belongs to. Regression is drawing a best-fit line through the data that passes as close as possible to as many points as possible.
An example of supervised learning is predicting if a human is male or female based on the measurements of the body. There is a clear solution here: the gender of the human.
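Algorithms to look at:
Adam, RMSprop, Stochastic Gradient Descent. Strictly speaking, these are optimization algorithms used to train supervised models such as neural networks, rather than models in their own right.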
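To make the regression case concrete, here is a minimal sketch of fitting a best-fit line with ordinary least squares. The data points and the function name fit_line are made up for illustration, and this is just one common way to fit a line.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labelled data: each input x comes with its "solution" y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # prints 2.0 1.0, since the points lie on y = 2x + 1
```

Once the line is known, predicting the output for a new input is just slope * x + intercept.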
You can think of Unsupervised learning as more of an interpretation of data, or a re-arranging of data to gain valuable insight.
An example of unsupervised learning is clustering different music listeners together based on music that they’ve listened to. There is no clear solution for the data: it is simply categorising the data to gain insight.
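Algorithms to look at:
K-means clustering. (The Naive Bayes Classifier and Random Forest Classifier are also worth studying, but they are supervised classification algorithms.)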
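As a rough illustration of clustering, here is a minimal sketch of k-means on one-dimensional data. The numbers (imagine hours of music listened to per week), the starting centroids and the function name kmeans_1d are all assumptions made for this example. Real implementations handle many dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around centroids by repeating assign-then-average."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the average of its assigned points.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data: no labels, just raw values.
hours = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(hours, centroids=[0.0, 5.0])
print(centroids)  # prints [1.5, 10.0]
print(clusters)   # prints [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

Notice that nobody told the algorithm which listeners are "light" or "heavy"; the two groups emerge from the data alone.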
Introduction to Neural Networks and Deep Learning:
Now that you have a bird’s-eye view of Machine Learning, let’s delve deeper into Deep Learning.
There are two ways in which to understand Neural Networks:
A Layman’s Perspective:
To put it simply, neural networks are a flexible framework that can be easily altered to change results. You can think of a neural network as a function machine that takes in a value and gives out an output. All that machine learning algorithms do is adjust the internal settings of that machine to get the best results possible.
This perspective gives a macro view of what a Neural Network really does, without getting into the nitty-gritty mathematical details.
A Mathematician’s Perspective:
A neural network is a set of matrices, and to get an output value, one must perform a series of matrix multiplications on the input. Machine Learning algorithms calculate partial derivatives that describe the relationship between each matrix and the output. By using these partial derivatives to alter the matrices, the accuracy of the network can be maximised.
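Here is a minimal sketch of that "series of matrix multiplications" in plain Python. The weight values and the function names matvec and forward are made up for illustration. Note the simplifying assumption: practical networks also apply a non-linear function between layers, which is left out here to match the description above.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def forward(layers, inputs):
    """Pass the inputs through each weight matrix in turn."""
    output = inputs
    for matrix in layers:
        output = matvec(matrix, output)
    return output


# Hypothetical two-layer network: 2 inputs -> 2 hidden values -> 1 output.
layers = [
    [[1.0, 2.0], [0.0, 1.0]],  # first layer weights
    [[1.0, -1.0]],             # second layer weights
]

result = forward(layers, [3.0, 4.0])
print(result)  # prints [7.0]
```

Training a network means changing the numbers inside layers until the printed output matches what we want.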
Let’s go back to the analogy of the neural network being a function machine. First let’s simplify the neural network down to two connected neurons:
The mathematical way to describe this neural network is:
pred = w*x
where x = input_data, w = weight and pred = output
This means that the output is calculated by multiplying the input data by the variable w. The variable w is the heart of neural networks.
Introducing Derivatives:
Now let’s plot a graph of how the value of pred changes when the value of w changes (assuming x is constant):
Let’s calculate the derivative of the graph:
The derivative is just a fancy way of asking: how much does pred change when w changes by a small amount?
Since pred = w*x, the derivative of pred with respect to w is x: every time w increases by 1, pred increases by x.
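For readers who want to see where this comes from, the same result follows from the textbook limit definition of a derivative. Here h is a small change in w, a symbol introduced only for this illustration:

```latex
\frac{d\,\mathrm{pred}}{dw}
  = \lim_{h \to 0} \frac{(w + h)\,x - w\,x}{h}
  = \lim_{h \to 0} \frac{h\,x}{h}
  = x
```

For example, with x = 3, increasing w from 1 to 1.1 increases pred from 3 to 3.3, a change of 0.1 * 3.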
The mean squared error (MSE) is a measure of how accurate a line of best fit is:
Given a line of best fit and the set of true points, it is calculated as follows:
mse = 0
for every real_point: mse += (real_point-line_of_best_fit)^2
mse = mse / number_of_points
The MSE essentially squares each error and averages the results, giving a general sense of how accurate a line of best fit is. Squaring stops errors above and below the line from cancelling each other out.
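The calculation above can be sketched in a few lines of Python. The function name mean_squared_error and the sample numbers are made up for illustration.

```python
def mean_squared_error(real_points, line_of_best_fit):
    """Average of the squared gaps between true values and the line's values."""
    squared_errors = [
        (real - predicted) ** 2
        for real, predicted in zip(real_points, line_of_best_fit)
    ]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical true values and the values the line gives at the same inputs.
real_points = [2.0, 4.0, 6.0]
line_of_best_fit = [2.5, 4.0, 5.0]

error = mean_squared_error(real_points, line_of_best_fit)
print(error)  # squared errors are 0.25, 0.0 and 1.0, so this is 1.25 / 3
```

A perfect line would give an MSE of exactly 0, and the further the line strays from the points, the larger the value gets.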
Calculating Derivatives:
The goal of the machine learning algorithm is to get a numerical value of how each weight impacts the overall accuracy of the network, and then change the weight to improve the accuracy.
Mathematically speaking, we are looking for the partial derivative of the MSE with respect to the parameter w.
According to a mathematical rule called the chain rule, the derivative of a composite function (a nested function) can be calculated by multiplying together the derivatives of each function in the chain.
In this case, it means one just has to calculate the rate of change of the MSE with respect to the output value pred, then the rate of change of pred with respect to the parameter w, and multiply them together.
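Since the derivative of x^2 is 2x, the derivative of the MSE function (y-pred)^2 with respect to pred is -2(y-pred), where y is the true value. The minus sign appears because pred is subtracted inside the brackets.
We already calculated the derivative of pred with respect to w to be x.
Therefore, the derivative of the MSE function with respect to w is: -x*2(y-pred)
This means that for every small increase in w, the MSE function changes by roughly -x*2(y-pred) times that increase. Since the MSE function is a measure of error, we are trying to find the value of the parameter w that minimizes it, so w is nudged in the direction opposite to the derivative.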
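If you would like to convince yourself that the chain-rule result is right, one option is to compare it against a numerical estimate. The sketch below does this for a single data point. The function names and the step size eps are assumptions chosen for illustration.

```python
def squared_error(w, x, y):
    """Error of the prediction w*x against the true value y."""
    return (y - w * x) ** 2


def analytic_gradient(w, x, y):
    """Derivative from the chain rule: dMSE/dpred * dpred/dw."""
    dmse_dpred = -2 * (y - w * x)
    dpred_dw = x
    return dmse_dpred * dpred_dw


def numerical_gradient(w, x, y, eps=1e-6):
    """Estimate the same derivative by nudging w slightly up and down."""
    return (squared_error(w + eps, x, y) - squared_error(w - eps, x, y)) / (2 * eps)


x, y, w = 1.0, 2.0, 1.0
analytic = analytic_gradient(w, x, y)
numeric = numerical_gradient(w, x, y)
print(analytic, numeric)  # both are approximately -2.0

# Moving w against the derivative should lower the error.
w_new = w - 0.1 * analytic
print(squared_error(w, x, y), squared_error(w_new, x, y))  # error drops from 1.0 to about 0.64
```

The derivative is negative here, which says that increasing w reduces the error, and that is exactly the direction the update moves w.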
Without the mathematical standpoint, machine learning models would just be black boxes, with no rhyme or reason to what they do.
Code is obviously a large part of machine learning, but it is very difficult and time-consuming to teach a programming language to someone. Here is some annotated Python code for you to run if you are interested in delving even deeper into machine learning.
'''
This is a simple neural network with just one input and one output.
The goal of the network is to change the value of w to be 2,
as the true value is 2 and the x value is 1.
'''
x = 1
true_values = 2 * x
w = 1

for i in range(100):
    pred = w * x
    mse = (true_values - pred) ** 2
    # derivative of the mse with respect to the prediction
    dmse_dpred = -2 * (true_values - pred)
    # derivative of the prediction with respect to w
    dpred_dw = x
    # Therefore, by the chain rule:
    dmse_dw = dmse_dpred * dpred_dw
    # step against the derivative to reduce the error
    w -= dmse_dw * 0.1
    # This value should get very close to 0, as the accuracy of the network is increasing
    print(mse)
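The same idea carries over to more than one data point: average the derivative over all points and step against it. The sketch below is one way this could look. The function name train, the learning rate, the number of steps and the data are all assumptions picked for illustration.

```python
def train(xs, ys, w=0.0, learning_rate=0.05, steps=200):
    """Fit pred = w*x to the data by repeatedly stepping against the MSE derivative."""
    n = len(xs)
    for _ in range(steps):
        # Average of -2*x*(y - pred) over every data point.
        dmse_dw = sum(-2 * x * (y - w * x) for x, y in zip(xs, ys)) / n
        w -= learning_rate * dmse_dw
    return w


# Hypothetical data generated by the rule y = 2*x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

learned_w = train(xs, ys)
print(learned_w)  # ends up very close to 2
```

If the learning rate is made too large, the updates overshoot and w can move further from 2 instead of closer, so it is worth experimenting with that value.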
After a deep dive into a micro perspective of the math behind machine learning, let’s zoom out and look at the applications of machine learning.
Where Machine Learning works:
When there is plenty of labelled and curated data, machine learning performs exceptionally well and can produce highly accurate results. For example, machine learning algorithms that have learnt to classify transactions as fraudulent or legitimate, after processing billions of transactions per day, are now online banking’s first line of defense against internet fraudsters.
Machine Learning does not work well everywhere, however. Machine Learning models can be fragile: they do not work well for data that is scarce or extremely volatile. An example of this is diagnosing a rare disease: since the disease does not affect many people, there is little to no data on it, making it very difficult for a model to gain true insight into the disease.
I hope this article has given you an overview of Machine Learning, how it fits into AI, the categories of Machine Learning, a basic understanding of the math behind deep learning, and where Machine Learning can be applied.