Does an artificial neural network provide a transparent algorithm?
An artificial neural network, usually just called a neural network, is a mathematical or computational model that imitates the structure and function of a biological neural network. In fact, it is an algorithm quite similar to a Bayesian network.

A neural network is a method that can be used for supervised tasks, such as classification and visual recognition, and also for unsupervised tasks. First, let's look at a simple example. As shown in the figure below (this figure has been widely reproduced online, but I cannot find its original source; corrections are welcome), suppose we want to train an algorithm to distinguish cats from dogs. This is a very simple classification task: in this two-dimensional coordinate system we can find a straight line (a model) that makes a single clean cut separating the two sets of data points. As we know from analytic geometry, this straight line can be expressed by the following formula:

w1·x1 + w2·x2 + w0 = 0

A simple neural network

Here w1 and w2 are the coefficients on the two coordinate axes and can be called weights, while w0 can be called the intercept or bias. A new data point, that is, a pair of input values (x1, x2), is a dog if it lies on one side of this line and a cat if it lies on the other. This can be expressed by a simple neural network: as shown in Figure 2, x1 and x2 are the input values, y is the output value, and the weights on the two edges are w1 and w2 respectively. This is the simplest neural network, and it uses a neural network to define a linear classifier. Each circular node here is a neuron. An alternative design adds an intermediate node S between the input and the output, followed by an output layer with two nodes, y1 and y2, corresponding to cats and dogs respectively. Whichever output node has the larger value determines the category (cat or dog) to which the data point belongs.
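The linear classifier described above can be sketched in a few lines of Python. The weights w1, w2 and the bias w0 below are illustrative values chosen for the example, not taken from the article; in practice they would be learned from the training data.

```python
def classify(x1, x2, w1=1.0, w2=1.0, w0=-5.0):
    """Label a point by which side of the line w1*x1 + w2*x2 + w0 = 0 it falls on.

    The weights here are illustrative: points below the line are
    labeled 'dog', points above it 'cat'.
    """
    g = w1 * x1 + w2 * x2 + w0  # signed distance (up to scale) from the line
    return "dog" if g < 0 else "cat"

print(classify(1.0, 1.0))  # a point below the line -> dog
print(classify(4.0, 4.0))  # a point above the line -> cat
```

This is exactly the computation the one-neuron network in Figure 2 performs: a weighted sum of the inputs, compared against a threshold.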

Such a problem can be solved as a simple binary classification. In practice, however, many problems cannot be solved with a single straight cut. As shown in Figure 3, suppose the data for cats and dogs are distributed as in the figure below. One straight line is no longer enough, but we can make two horizontal cuts and one vertical cut and then merge the matching "blocks", thereby solving this more complicated classification problem. Still other problems require the classes to be separated by curves, and in those cases we need a more complex neural network; taking the curved boundary as an example, we can design a three-layer neural network. This is a nonlinear classifier designed with a neural network. In theory, any classifier can be represented by a neural network; that is, whatever the actual decision boundary looks like, we can design a neural network to fit it.

At this point one may ask: how should the function at each node be chosen? According to the second edition of Wu Jun's The Beauty of Mathematics, to preserve the universality of artificial neural networks, we generally stipulate that the function at each neuron may only apply a nonlinear transformation to a linear combination of its input variables. For example, if the inputs to neuron y are x1, x2, ..., xn, and the weights on their edges are w1, w2, ..., wn, then computing the value of node y takes two steps. The first step is to compute the linear combination of the input values:

G = w1·x1 + w2·x2 + ... + wn·xn

The second step is to compute y = f(G), where f(·) can be nonlinear; but because its argument G is a single specific value, f need not be very complicated. The combination of these two steps makes artificial neural networks flexible without being overly complex. The function f(·) is called the activation function; its role is to enhance the expressive power of the model, since a purely linear model is not expressive enough. An artificial neural network can connect many layers, so the main design work is choosing the structure (how many layers the network has, how many nodes are in each layer, and so on) and the activation function. Commonly used activation functions include the sigmoid function, the ReLU function, the tanh function, and so on. The figure below shows a schematic diagram of several simple activation functions.
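The two-step neuron computation can be sketched directly: step 1 forms the linear combination G, step 2 applies an activation function f. The three activation functions shown are the standard sigmoid, ReLU, and tanh; the input values and weights in the usage line are illustrative, not from the article.

```python
import math

def sigmoid(g):
    """Sigmoid activation: squashes any real G into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-g))

def relu(g):
    """ReLU activation: zero for negative G, identity for positive G."""
    return max(0.0, g)

def tanh(g):
    """Tanh activation: squashes any real G into (-1, 1)."""
    return math.tanh(g)

def neuron(xs, ws, f):
    """Step 1: G = w1*x1 + ... + wn*xn.  Step 2: y = f(G)."""
    g = sum(w * x for w, x in zip(ws, xs))
    return f(g)

# Illustrative inputs and weights: G = 0.5*1.0 + (-0.25)*2.0 = 0.0
y = neuron([1.0, 2.0], [0.5, -0.25], sigmoid)
print(y)  # sigmoid(0) = 0.5
```

Stacking many such neurons into layers, where the outputs of one layer become the inputs of the next, gives exactly the multi-layer networks discussed above.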