On the calculus side, the gradient is obtained through derivatives and differentials, the objective function is approximated locally by a (multidimensional) Taylor expansion, and convergence rates are analyzed using limits. In practical applications, the objective function is usually a multivariable function, and such functions and the relations among their variables can generally be expressed with vectors and matrices.
We therefore need to be comfortable with computations involving vectors and matrices, especially their derivatives, such as the gradient (the vector of first-order partial derivatives) and the Hessian matrix (the matrix of second-order partial derivatives).
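Concretely, the second-order Taylor expansion ties these objects together: f(x + d) ≈ f(x) + ∇f(x)ᵀd + ½ dᵀ∇²f(x) d, where ∇f(x) is the gradient and ∇²f(x) is the Hessian matrix of the objective function at x.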
On the linear algebra side, the focus is on computations with vectors and matrices, especially derivatives with respect to vectors and matrices. This is because optimization algorithms usually need the gradient, which is obtained by differentiating with respect to a vector or matrix. Linear algebra also covers matrix inversion, which is very useful for solving systems of linear equations.
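As a minimal sketch of how these pieces fit together (using NumPy, one of the libraries mentioned below, and a made-up quadratic objective rather than anything specific from the text): the gradient of f(x) = ½xᵀAx − bᵀx is Ax − b and its Hessian is A, so setting the gradient to zero amounts to solving a linear system, which is normally done with a solver rather than by forming the matrix inverse explicitly.

```python
import numpy as np

# Toy quadratic objective f(x) = 0.5 * x^T A x - b^T x (illustrative data only)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # symmetric positive definite
b = np.array([1.0, 1.0])

def gradient(x):
    # Gradient of the quadratic: A x - b
    return A @ x - b

def hessian(x):
    # Hessian of the quadratic is constant: A
    return A

# Setting the gradient to zero gives the linear system A x = b.
# Solve it directly instead of computing the inverse of A.
x_star = np.linalg.solve(A, b)
print("minimizer:", x_star, "gradient there:", gradient(x_star))
```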
Beyond this basic mathematical knowledge, optimization methods also rely on numerical computing software or libraries to carry out efficient numerical calculation and optimization. These include, but are not limited to, MATLAB and Python's NumPy and SciPy libraries. In short, the mathematical foundation of optimization methods is calculus and linear algebra, and numerical computation and optimization can be carried out more efficiently with the help of such software or libraries.
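As a small illustration of that workflow (a sketch using SciPy's optimize module on the standard Rosenbrock test function, chosen here purely as an example):

```python
import numpy as np
from scipy.optimize import minimize

# Rosenbrock function: a standard unconstrained test problem.
def rosenbrock(x):
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

x0 = np.array([-1.2, 1.0])                         # starting point
result = minimize(rosenbrock, x0, method="BFGS")   # quasi-Newton method
print(result.x, result.fun)                        # minimizer is (1, 1)
```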
Common algorithms for numerical optimization:
These include the gradient descent method, Newton's method, quasi-Newton methods, trust-region methods, and so on. These algorithms are mainly aimed at unconstrained optimization problems; for constrained optimization problems, other methods such as the penalty function method and the augmented Lagrangian method are needed. In addition, there are other commonly used optimization algorithms, such as genetic algorithms, particle swarm optimization, simulated annealing, and Bayesian optimization.
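To make the penalty-function idea concrete, here is a minimal sketch on a made-up toy problem (solved with SciPy; not a production implementation): the equality constraint is folded into the objective as a quadratic penalty whose weight is gradually increased, turning the constrained problem into a sequence of unconstrained ones.

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: minimize f(x) = x1^2 + x2^2 subject to g(x) = x1 + x2 - 1 = 0.
def f(x):
    return x[0]**2 + x[1]**2

def g(x):
    return x[0] + x[1] - 1.0

x = np.zeros(2)
for mu in [1.0, 10.0, 100.0, 1000.0]:
    # Quadratic penalty: the constrained problem becomes an unconstrained one.
    penalized = lambda x, mu=mu: f(x) + mu * g(x)**2
    x = minimize(penalized, x, method="BFGS").x

print(x)   # approaches the constrained optimum (0.5, 0.5) as mu grows
```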
The gradient descent method chooses a reasonable direction for updating the parameters, but its descent is slow and it may get stuck in local minima. Newton's method has second-order convergence and is therefore faster, but every step requires inverting (or solving a linear system with) the Hessian matrix of the objective function, which makes each iteration expensive. In general, different optimization algorithms have their own advantages and disadvantages, so the appropriate algorithm should be chosen according to the specific problem and its solution requirements.
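The trade-off can be seen in a short sketch (a hand-rolled toy comparison on a made-up quadratic, not a production solver): gradient descent takes many cheap steps, while Newton's method solves a linear system with the Hessian at each step and, on a quadratic, reaches the minimizer in a single step.

```python
import numpy as np

# Toy objective: f(x) = 0.5 * x^T A x - b^T x, with known gradient and Hessian.
A = np.array([[10.0, 0.0],
              [0.0, 1.0]])     # ill-conditioned enough to slow gradient descent
b = np.array([1.0, 1.0])

grad = lambda x: A @ x - b     # gradient
hess = lambda x: A             # Hessian (constant for a quadratic)

# Gradient descent: cheap steps, but many of them on an ill-conditioned problem.
x = np.zeros(2)
for _ in range(100):
    x = x - 0.05 * grad(x)
print("gradient descent:", x)

# Newton's method: solve H d = -grad for the step; exact in one step on a quadratic.
x = np.zeros(2)
d = np.linalg.solve(hess(x), -grad(x))
x = x + d
print("Newton:", x)
```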