In machine learning, model parameters and model hyperparameters differ in both function and origin, yet model hyperparameters are often loosely called model parameters, which easily confuses beginners. This article defines and compares the two and points out the essential difference between them: model parameters are configuration variables internal to the model whose values can be estimated from data, whereas model hyperparameters are configuration external to the model whose values must be set manually.
When doing research, we come across many terms, and sometimes the same term appears in different research fields. For example, "model parameters" and "model hyperparameters", terms often used in statistics and economics, also appear in machine learning.
In machine learning, "model parameters" and "model hyperparameters" differ in function and origin. Beginners who lack a clear understanding of the two, especially those coming from statistics or economics, usually find the material hard to learn.
To give everyone a clear definition of "model parameters" and "model hyperparameters" when applying machine learning, this article discusses the two terms in detail.
First, let's look at what "parameters" are.
Parameters are central to machine learning algorithms: they are the part of the model that is learned from historical training data.
"Parameters" in statistics:
In statistics, you can assume a distribution for a variable, such as a Gaussian distribution. A Gaussian distribution has two parameters: the mean (μ) and the standard deviation (σ). The same idea holds in machine learning, where these parameters can be estimated from data and used as part of a predictive model.
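As a minimal sketch of estimating those two parameters from data, the snippet below uses only the Python standard library. The sample values and the variable names (heights_cm, mu_hat, sigma_hat) are made up for illustration and do not come from the original text.

```python
import statistics

# Hypothetical sample, assumed to be drawn from a Gaussian distribution.
heights_cm = [172.0, 168.5, 181.2, 175.3, 169.9, 177.8, 174.1, 171.6]

# Estimate the two parameters of the Gaussian from the data.
mu_hat = statistics.mean(heights_cm)      # estimate of the mean (mu)
sigma_hat = statistics.stdev(heights_cm)  # sample estimate of the standard deviation (sigma)

# The fitted distribution can then serve as part of a predictive model,
# for example to ask what share of values is expected to fall below 170.
fitted = statistics.NormalDist(mu=mu_hat, sigma=sigma_hat)
print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
print(f"P(height < 170) = {fitted.cdf(170.0):.3f}")
```

Nothing here is set by hand: both numbers come out of the data, which is what makes them parameters.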
"Parameters" in programming:
In programming, you can pass parameters to a function. In this case, a parameter is a function argument that can take one of a range of values. In machine learning, the specific model you are using is the function, and it needs parameters in order to make predictions on new data.
What is the relationship between "parameter" and "model"?
In the classic machine learning literature, the model can be regarded as a hypothesis, and the parameters as the tailoring of that hypothesis to a specific data set.
A model with a fixed number of parameters is called a "parametric" model; a model whose number of parameters can vary is called a "nonparametric" model.
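The contrast can be illustrated with a small sketch. It assumes two toy "models" invented for this example: a Gaussian fit, which always keeps two numbers, and a nearest-neighbour style model, which keeps every training example. The function names are hypothetical.

```python
import random
import statistics

def fit_gaussian(values):
    """Parametric: the fitted model is always two numbers (mu, sigma)."""
    return {"mu": statistics.mean(values), "sigma": statistics.stdev(values)}

def fit_nearest_neighbour(values, labels):
    """Nonparametric: the fitted model keeps every training example."""
    return list(zip(values, labels))

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 100, 1000):
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    labels = [v > 0 for v in values]
    gaussian = fit_gaussian(values)
    neighbours = fit_nearest_neighbour(values, labels)
    # Training set size, size of the parametric model, size of the nonparametric model.
    print(n, len(gaussian), len(neighbours))
```

The size of the Gaussian model stays at two however much data it sees, while the size of the nearest-neighbour model tracks the size of the training set.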
What are model parameters?
Simply put, model parameters are configuration variables internal to the model whose values can be estimated from data.
Specifically, the model parameters have the following characteristics:
The model needs its parameters in order to make predictions.
The parameter values define the function that the model computes.
Model parameters are estimated or learned from data.
Model parameters are usually not set manually by the practitioner.
Model parameters are usually saved as part of the learned model.
Model parameters are usually estimated with an optimization algorithm, which is an efficient search through the possible parameter values.
Some examples of model parameters include:
The weights in an artificial neural network.
The support vectors in a support vector machine.
The coefficients in a linear regression or logistic regression.
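To make the last example concrete, here is a minimal sketch of estimating the coefficients of a one-variable linear regression by ordinary least squares. The data points and the function names (fit_linear_regression, predict) are assumptions made for illustration.

```python
# Hypothetical training data: y is roughly 2 * x + 1 with a little noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]

def fit_linear_regression(xs, ys):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """The learned coefficients are all the model needs to predict."""
    return slope * x + intercept

# The coefficients are estimated from the data, not chosen by hand.
slope, intercept = fit_linear_regression(xs, ys)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"prediction for x = 6: {predict(6.0, slope, intercept):.3f}")
```

Saving the fitted model amounts to saving these two numbers, which matches the characteristics listed earlier.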
What is a model hyperparameter?
A model hyperparameter is a configuration external to the model whose value cannot be estimated from data.
Specifically, model hyperparameters have the following characteristics:
Model hyperparameters are often used in the process of estimating model parameters.
Model hyperparameters are usually specified directly by the practitioner.
Model hyperparameters can often be set using heuristics.
Model hyperparameters are usually tuned for the given predictive modeling problem.
How to find a good value: for a given problem, we cannot know the best value of a model hyperparameter in advance. Instead, we can use rules of thumb, copy values that worked on other problems, or search for a good value by trial and error.
Some examples of model hyperparameters include:
The learning rate for training a neural network.
The C and sigma hyperparameters of a support vector machine.
The k in k-nearest neighbors.
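The learning rate example can be sketched with a one-weight model trained by gradient descent. This is an illustrative toy, not a neural network: the data, the train function, and the candidate learning rates are all assumptions made for this example.

```python
# Hypothetical data generated by y = 3 * x, so the ideal weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def train(learning_rate, epochs=50):
    """Fit y = weight * x by gradient descent on mean squared error.

    learning_rate and epochs are hyperparameters: chosen by hand.
    weight is the model parameter: estimated from the data.
    """
    weight = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of mean((weight * x - y) ** 2) with respect to weight.
        gradient = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / n
        weight -= learning_rate * gradient
    return weight

# Trial and error over a few hand-picked learning rates.
for learning_rate in (0.001, 0.01, 0.05):
    print(f"learning_rate = {learning_rate}: weight = {train(learning_rate):.4f}")
```

On this toy data the smallest learning rate leaves the weight well short of 3 after 50 epochs, while the largest one reaches it. The learning rate is never learned from the data; it only shapes how the weight is learned.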
"Model Parameters" and "Model Hyperparameters"
The connection between the two:
When you tune a machine learning algorithm for a specific problem, for example with grid search or random search, you are tuning the hyperparameters of the model in order to find the model parameters that yield the most skillful predictions. Many models have important parameters that cannot be estimated directly from the data. For example, in the K-nearest neighbor classification model ... this type of model parameter is called a tuning parameter, because there is no analytical formula available for calculating an appropriate value.
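A grid search of the kind mentioned above can be sketched for the K-nearest neighbor case. The one-dimensional data set, the held-out validation set, and the candidate values of k below are all hypothetical, and the helper names (knn_predict, accuracy) are chosen for this example only.

```python
from collections import Counter

# Hypothetical 1-D data: (feature, label) pairs.
train_points = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.5, "a"), (5.8, "a"),
                (3.2, "b"), (6.0, "b"), (6.5, "b"), (7.0, "b"), (7.5, "b")]
validation_points = [(1.2, "a"), (2.2, "a"), (3.0, "a"),
                     (5.5, "b"), (6.2, "b"), (7.2, "b")]

def knn_predict(x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train_points, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(k):
    """Share of validation points classified correctly for a given k."""
    hits = sum(knn_predict(x, k) == label for x, label in validation_points)
    return hits / len(validation_points)

# Grid search: score every candidate k and keep the best one.
candidate_ks = [1, 3, 5, 7]  # odd values avoid tied votes with two classes
scores = {k: accuracy(k) for k in candidate_ks}
best_k = max(scores, key=scores.get)
print(scores)
print("best k:", best_k)
```

No formula yields k from the training data; the only way to pick it here is to try each candidate and compare scores on data that was held out.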
The distinction:
Model hyperparameters are often referred to as model parameters, which invites confusion. A good rule of thumb for telling them apart: if you have to specify a "model parameter" manually, it is probably a model hyperparameter.
Summary
After reading this article, you should have a clear understanding of the definitions of model parameters and model hyperparameters and of the difference between them.
In short, model parameters are estimated automatically from data, while model hyperparameters are set manually and are used in the process of estimating the model parameters.