What does deep learning need to learn?
foundations of mathematics

If you can fluently read the mathematical formulas in deep learning papers and independently derive new methods from them, you already have the necessary mathematical foundation.

Mastering the mathematics covered in the four courses of mathematical analysis, linear algebra, probability theory, and convex optimization, and being familiar with the basic theory and methods of machine learning, are the prerequisites for getting into deep learning. Understanding the operations and gradient derivations of each layer in a deep network, formalizing a problem, or deriving a loss function is impossible without a solid foundation in mathematics and machine learning.

mathematical analysis

Calculus is the core of the advanced mathematics courses offered to engineering majors. For general deep learning research and applications, it is necessary to review basics such as functions and limits, derivatives (especially derivatives of composite functions), differentials, integrals, power series expansions, and differential equations. In the optimization process of deep learning, computing the first derivative of a function is the most basic operation. The mean value theorem, Taylor's formula, and Lagrange multipliers should be more than just familiar names.
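The derivative of a composite function mentioned above is exactly the chain rule that backpropagation is built on. A minimal sketch, using the hypothetical composition sin(x^2), compares the chain-rule derivative with a finite-difference estimate:

```python
import math

def g(x):
    # Inner function g(x) = x^2
    return x * x

def f(u):
    # Outer function f(u) = sin(u)
    return math.sin(u)

def analytic_derivative(x):
    # Chain rule: d/dx f(g(x)) = f'(g(x)) * g'(x) = cos(x^2) * 2x
    return math.cos(g(x)) * 2.0 * x

def numeric_derivative(x, h=1e-6):
    # Central finite-difference approximation of d/dx f(g(x))
    return (f(g(x + h)) - f(g(x - h))) / (2.0 * h)

x = 1.3
print(analytic_derivative(x), numeric_derivative(x))  # the two agree closely
```

Automatic differentiation in deep learning frameworks applies this same chain rule mechanically, layer by layer.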

linear algebra

The operations in deep learning are usually expressed as operations on vectors and matrices, and linear algebra is the branch of mathematics that studies them. The topics to review include vectors, linear spaces, systems of linear equations, matrices, matrix operations and their properties, and vector calculus. When the Jacobian matrix or the Hessian matrix comes up, you should know its exact mathematical form; when given a matrix-valued loss function, you should be able to derive its gradient with ease.
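As an illustration of solving the gradient of a matrix loss function, consider the classic least-squares loss L(x) = ||Ax - b||^2, whose gradient is 2 A^T (Ax - b). The sketch below (with small hand-picked A and b, not from the text) checks this formula against finite differences:

```python
# Gradient of L(x) = ||A x - b||^2 is 2 A^T (A x - b); verify numerically.

def matvec(A, x):
    # Matrix-vector product A x
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def loss(A, x, b):
    # Squared residual norm ||A x - b||^2
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return sum(ri * ri for ri in r)

def analytic_grad(A, x, b):
    # 2 * A^T (A x - b)
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return [2.0 * sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(x))]

def numeric_grad(A, x, b, h=1e-6):
    # Central finite differences in each coordinate
    g = []
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        g.append((loss(A, xp, b) - loss(A, xm, b)) / (2.0 * h))
    return g

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [1.0, 0.0, -1.0]
x = [0.5, -0.5]
print(analytic_grad(A, x, b))  # [-1.0, -4.0]
print(numeric_grad(A, x, b))   # agrees up to floating-point error
```

Such finite-difference gradient checks are also how hand-derived backward passes are commonly debugged.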

probability theory

Probability theory is the branch of mathematics that studies the quantitative laws of random phenomena. Random variables appear throughout deep learning: stochastic gradient descent, parameter initialization schemes (such as Xavier), and the Dropout regularization algorithm all rest on the theoretical support of probability theory. Besides mastering the basic concepts of random phenomena (random experiments, sample spaces, probability, conditional probability, etc.), random variables and their distributions, you need to know the law of large numbers, the central limit theorem, parameter estimation, hypothesis testing, and so on. Going further, you can study stochastic processes and Markov chains.
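The Xavier initialization mentioned above is a direct application of the variance of a uniform distribution: weights are drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), so that Var(w) = limit^2 / 3 = 2 / (fan_in + fan_out). A minimal sketch (layer sizes are made up for illustration) checks this empirically:

```python
import math
import random

random.seed(0)

def xavier_uniform(fan_in, fan_out, n):
    # Xavier/Glorot uniform initialization: U(-limit, limit) with
    # limit = sqrt(6 / (fan_in + fan_out)), which gives the target
    # variance Var(w) = 2 / (fan_in + fan_out).
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [random.uniform(-limit, limit) for _ in range(n)]

fan_in, fan_out = 256, 128          # hypothetical layer sizes
w = xavier_uniform(fan_in, fan_out, 100_000)

mean = sum(w) / len(w)
var = sum((wi - mean) ** 2 for wi in w) / len(w)
print(var, 2.0 / (fan_in + fan_out))  # empirical vs. target variance
```

Keeping the weight variance at this scale is what lets activations and gradients keep a roughly constant magnitude across layers.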

Convex optimization

Building on the three basic courses above, convex optimization can be regarded as an applied course. For deep learning, however, the commonly used optimization methods are variants of stochastic gradient descent that use only first-order gradient information, so practitioners do not need much "advanced" convex optimization. Knowing the basic concepts of convex sets, convex functions, and convex optimization, grasping the general idea of dual problems, mastering common unconstrained optimization methods such as gradient descent, stochastic gradient descent, and Newton's method, and knowing a little about equality-constrained and inequality-constrained optimization is enough to meet the theoretical demands that deep learning places on optimization methods.
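The contrast between the first-order and second-order methods named above can be seen on a toy convex quadratic f(x) = (x - 3)^2 (an example chosen for illustration, not from the text): gradient descent takes many small steps, while Newton's method, which also uses the second derivative, lands on the minimizer of a quadratic in one step.

```python
# Minimize the convex quadratic f(x) = (x - 3)^2 two ways.

def grad(x):
    # First derivative f'(x) = 2 (x - 3)
    return 2.0 * (x - 3.0)

def hess(x):
    # Second derivative f''(x) = 2 (constant for a quadratic)
    return 2.0

# Gradient descent: x <- x - lr * f'(x), many small first-order steps.
x_gd = 0.0
for _ in range(100):
    x_gd -= 0.1 * grad(x_gd)

# Newton's method: x <- x - f'(x) / f''(x), exact in one step here.
x_newton = 0.0 - grad(0.0) / hess(0.0)

print(x_gd, x_newton)  # both at (or extremely near) the minimizer x = 3
```

In deep networks the Hessian is far too large to form, which is one reason first-order stochastic methods dominate in practice.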

machine learning

In the final analysis, deep learning is just one kind of machine learning method, and statistical machine learning is the de facto methodology of the field. Taking supervised learning as an example, you need to master representative machine learning techniques, such as regression and classification with linear models, support vector machines and kernel methods, and random forests, and understand model selection, model inference, model regularization, model ensembles, the Bootstrap method, probabilistic graphical models, and so on. Going further, you should know some specialized areas such as semi-supervised learning, unsupervised learning, and reinforcement learning.
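The linear models mentioned above are the simplest supervised learners. A minimal sketch, on made-up toy data that lies exactly on y = 2x + 1, fits y = w*x + b by ordinary least squares using the closed-form one-dimensional solution:

```python
# One-dimensional ordinary least squares: w = Cov(x, y) / Var(x), b = mean(y) - w * mean(x).
# The toy data below is illustrative and lies exactly on y = 2x + 1.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

n = len(xs)
mx = sum(xs) / n                    # mean of inputs
my = sum(ys) / n                    # mean of targets
w = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
b = my - w * mx

print(w, b)  # 2.0 1.0
```

A deep network generalizes exactly this setup: the same squared-error objective, but with a nonlinear parametric model in place of w*x + b and gradient-based optimization in place of the closed-form solution.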