Mathematical Basis of Artificial Intelligence 15: Bias and Variance
When discussing prediction models, the prediction error can be decomposed into three components:

1. Error caused by bias

2. Error caused by variance

3. Error caused by noise

So: error = bias + variance + noise. Below we focus on bias and variance.
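For squared-error loss this decomposition can be written out exactly. A minimal sketch in standard notation (y = f(x) + ε with zero-mean noise of variance σ², and f̂ the model fitted on a random training set; the symbols are the conventional ones, not taken from the original text). Note that for squared error, bias enters squared:

```latex
% Expectation over training sets and noise, at a fixed test point x
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```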

The definitions of bias and variance, and the difference between them, are illustrated in the figure below.

Error caused by bias: when the data is sufficient but the model is not expressive enough to capture the underlying pattern, the model's predictions will deviate far from the correct values and prediction accuracy will be low. This phenomenon is called underfitting. Simply put, if the model is wrong, there is bias. Bias measures how well the model can fit the pattern in the training data.

When bias is large, the model cannot fit even the training set, so we should choose a larger network, for example one with more hidden layers or more hidden units.
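A minimal runnable sketch of this situation (the quadratic data and polynomial models are hypothetical stand-ins for "data law" and "network capacity"): a straight line cannot fit data generated by a quadratic law even on the training set, while a model with enough capacity can:

```python
# High bias / underfitting: a model that is too simple cannot fit
# even its own training set.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(0, 0.5, x.size)   # the true law is quadratic

for degree in (1, 2):                   # degree 1 underfits; degree 2 matches the law
    coeffs = np.polyfit(x, y, degree)
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {train_mse:.3f}")
# Degree 1 leaves a large training error; adding capacity (degree 2) removes it,
# just as adding hidden layers/units removes underfitting in a network.
```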

Error caused by variance: variance reflects the generalization ability of the model. If a model works well on the training data but poorly on the test data, its generalization ability is poor, that is, the error caused by variance is large; this also means the model overfits the training data. The smaller the variance, the better the model generalizes, and the better it predicts on new data drawn from the same distribution.

When variance is high, the best remedy is to use more data; if more data cannot be obtained, overfitting can be reduced through regularization.
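A minimal sketch of the regularization remedy (hypothetical data; closed-form ridge regression, w = (XᵀX + λI)⁻¹Xᵀy, stands in for whichever regularizer is actually used): with a small noisy training set, an unregularized degree-9 polynomial overfits, and a moderate L2 penalty lowers the test error:

```python
# High variance / overfitting, reduced by L2 regularization (ridge regression).
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(0, 0.3, n)

def design(x, degree=9):                 # deliberately over-flexible features
    return np.vander(x, degree + 1)

x_tr, y_tr = make_data(15)               # small training set -> high variance
x_te, y_te = make_data(200)

for lam in (0.0, 1e-2, 1.0):             # lam = 0.0 is the unregularized fit
    X = design(x_tr)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y_tr)
    test_mse = np.mean((design(x_te) @ w - y_te) ** 2)
    print(f"lambda={lam:g}: test MSE = {test_mse:.3f}")
# A moderate lambda typically lowers test error relative to lambda = 0,
# trading a little bias for a large reduction in variance.
```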

The relationship among model complexity, bias, variance, and total error is shown in the following figure:

As the figure shows, the optimal model has moderate complexity, a relatively balanced variance and bias, and the minimum total error.
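The curves in such a figure can be reproduced numerically. A minimal sketch (the true function sin(2x), the noise level, and the polynomial degrees are assumptions chosen for illustration) that estimates bias² and variance at one test point by retraining on many fresh training sets:

```python
# Empirical bias-variance trade-off: retrain polynomial models of several
# degrees on many independent training sets and measure spread vs. offset.
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.sin(2 * x)                  # assumed true function
x0, n_train, n_trials = 0.5, 30, 500         # test point, sample size, repetitions

for degree in (1, 3, 9):
    preds = []
    for _ in range(n_trials):
        x = rng.uniform(-1, 1, n_train)
        y = f(x) + rng.normal(0, 0.3, n_train)
        preds.append(np.polyval(np.polyfit(x, y, degree), x0))
    preds = np.array(preds)
    bias2 = (preds.mean() - f(x0)) ** 2
    var = preds.var()
    print(f"degree {degree}: bias^2={bias2:.4f}  variance={var:.4f}  sum={bias2 + var:.4f}")
# Low degrees show high bias^2 and low variance; high degrees the reverse.
# The minimum of bias^2 + variance sits at a moderate degree.
```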

Every classifier has a minimum achievable error rate, called the Bayes error rate. When a model's error rate approaches the Bayes error rate, we consider the model's error rate acceptable, i.e., close to optimal.
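A minimal sketch of the Bayes error rate (the two Gaussian class-conditional densities are an assumed example, not from the original): with equally likely classes N(0,1) and N(2,1), the optimal rule thresholds at the midpoint x = 1, and even this best-possible classifier errs about 15.9% of the time:

```python
# Bayes error rate for two equally likely 1-D Gaussian classes.
from scipy.stats import norm

mu0, mu1, prior = 0.0, 2.0, 0.5
threshold = (mu0 + mu1) / 2                       # Bayes decision boundary
bayes_error = (prior * norm.sf(threshold, mu0, 1)     # class 0 pushed past threshold
               + prior * norm.cdf(threshold, mu1, 1)) # class 1 falling below it
print(f"Bayes error rate = {bayes_error:.4f}")    # ~0.1587; no classifier can do better
```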

References

Scott Fortmann-Roe, Understanding the Bias-Variance Tradeoff

Wikipedia, Bias–variance tradeoff