(1) Extreme value method
You can choose one of the following three ways:
First, x' = x / (x_max - x_min): each value is divided by the full range of its variable. Provided the variable's values straddle zero (x_min ≤ 0 ≤ x_max), the standardized values are limited to [-1, 1].
Second, x' = (x - x_min) / (x_max - x_min): the difference between each value and the variable's minimum is divided by the full range, so the standardized values are limited to [0, 1].
Third, x' = x / x_max: each value is divided by the variable's maximum, so the maximum of the standardized variable is 1.
The extreme value method makes variable data dimensionless by using each variable's maximum and minimum to map the original data into a specific range, thus eliminating the influence of dimension and order of magnitude. Because the transformation depends only on the maximum and minimum of each variable and on none of the other values, the resulting weight of each variable relies too heavily on these two extreme values.
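The three variants can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up here and do not come from the lake tables.

```python
def range_scale(values):
    """Variant 1: divide each value by the full range (max - min)."""
    span = max(values) - min(values)
    if span == 0:
        raise ValueError("all values are equal, so the range is zero")
    return [v / span for v in values]


def min_max_scale(values):
    """Variant 2: subtract the minimum, then divide by the full range."""
    low = min(values)
    span = max(values) - low
    if span == 0:
        raise ValueError("all values are equal, so the range is zero")
    return [(v - low) / span for v in values]


def max_scale(values):
    """Variant 3: divide each value by the maximum (assumed positive here)."""
    high = max(values)
    if high <= 0:
        raise ValueError("this sketch assumes a positive maximum")
    return [v / high for v in values]


sample = [-2.0, 0.0, 2.0, 6.0]  # made-up values for illustration only
print(range_scale(sample))    # [-0.25, 0.0, 0.25, 0.75]
print(min_max_scale(sample))  # [0.0, 0.25, 0.5, 1.0]
print(max_scale(sample))      # the largest entry becomes 1.0
```

Replacing only the largest value in `sample` changes every scaled result, which is the dependence on the two extreme values described above.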
(2) Standardization method
Each value's difference from the variable's mean is divided by the variable's standard deviation, x' = (x - mean) / std. After this transformation the mean of each variable is 0 and the standard deviation is 1, which eliminates the influence of dimension and order of magnitude. Although this method uses all of the data, it makes not only the means but also the standard deviations of the converted variables identical; that is, it also eliminates the differences in the degree of variation among the variables.
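A minimal sketch of this transformation follows. The text does not say whether the population or the sample standard deviation is meant, so the population version (`statistics.pstdev`) is an assumption here, and the two sample variables are made up.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("all values are equal, so the standard deviation is zero")
    return [(v - mean) / std for v in values]


# Two made-up variables with the same mean but very different spread.
narrow = [9.0, 10.0, 11.0]
wide = [0.0, 10.0, 20.0]

for name, data in (("narrow", narrow), ("wide", wide)):
    scaled = standardize(data)
    print(name, statistics.fmean(scaled), statistics.pstdev(scaled))
```

Both variables end up with mean 0 and standard deviation 1, so the fact that `wide` varied ten times more than `narrow` is no longer visible after the transformation.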
(3) Mean method
Each value is divided by the variable's mean. This method not only eliminates the influence of dimension and order of magnitude but also retains the information about how strongly each variable varies.
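The formula for this method is not shown in the text. Assuming it is the mean method, in which each value is divided by the variable's mean, the following sketch shows why the variation information survives: the standard deviation of the scaled data equals the coefficient of variation of the original data. The sample values and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def mean_scale(values):
    """Divide each value by the variable's mean (assumed to be nonzero)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("the mean is zero, so the values cannot be scaled")
    return [v / mean for v in values]


# Two made-up variables with the same mean but different spread.
narrow = [9.0, 10.0, 11.0]
wide = [0.0, 10.0, 20.0]

for name, data in (("narrow", narrow), ("wide", wide)):
    scaled = mean_scale(data)
    # The scaled mean is 1; the scaled spread still differs between variables.
    print(name, statistics.fmean(scaled), statistics.pstdev(scaled))
```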
(4) Standard deviation method
Each value is divided by the variable's standard deviation. This method is a variation of the standardization method; the two differ only in the mean of each variable after the transformation. After the standardization method the mean of each variable is 0, whereas after the standard deviation method it is the ratio of the original variable's mean to its standard deviation.
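The relationship to the standardization method can be checked numerically. The sketch below assumes the standard deviation method means dividing each value by the variable's standard deviation, which is what the stated mean after the transformation implies; the sample values and the population standard deviation are illustrative assumptions.

```python
import statistics


def std_scale(values):
    """Divide each value by the variable's standard deviation."""
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("all values are equal, so the standard deviation is zero")
    return [v / std for v in values]


values = [2.0, 4.0, 6.0]  # made-up values for illustration only
scaled = std_scale(values)

print(statistics.pstdev(scaled))  # standard deviation after scaling
print(statistics.fmean(scaled))   # mean after scaling
print(statistics.fmean(values) / statistics.pstdev(values))  # original mean / std
```

The last two printed numbers agree, and the first is 1, so the result differs from the standardization method only by a shift of the mean.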
In summary, choose the dimensionless method that suits the type of data. The following is a typical example of dimensionless processing in an evaluation system.
In recent years, the eutrophication of freshwater lakes in China has become increasingly serious. How to comprehensively evaluate and control lake eutrophication is a problem we have to face. The following two tables give the measured data of five lakes in China and the evaluation criteria of lake water quality.
Table 1 Measured Data of Evaluation Parameters of Five Lakes in China
Table 2 Evaluation Criteria of Lake Water Quality
(1) Based on the above data, analyze the effects of total phosphorus, oxygen consumption, transparency and total ammonia on the eutrophication assessment of lake water quality.
(2) Comprehensively evaluate the water quality of these five lakes and determine the water quality grade.
Before the comprehensive evaluation, the evaluation indicators should be analyzed first. Evaluation indicators are usually divided into benefit, cost and fixed indicators. A benefit indicator is one whose value is the larger, the better (also called a positive indicator); a cost indicator is one whose value is the smaller, the better (also called a reverse indicator); a fixed indicator is one whose value is the closer to a given constant, the better (also called a moderate indicator). If the indicators have different attributes, the comprehensive evaluation is easily biased, so the attributes of all indicators must be unified first.
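The text does not show how the attributes are unified. One common convention, used here purely as an assumption, is to turn every indicator into a benefit indicator: take the reciprocal of a cost indicator, and map a fixed indicator through 1 / (1 + |x - target|). The function name `to_benefit` and the sample values are hypothetical.

```python
def to_benefit(values, kind, target=None):
    """Convert an indicator so that larger values are always better.

    kind is "benefit", "cost" or "fixed"; target is the ideal constant
    for a fixed indicator. The conversions are one possible convention.
    """
    if kind == "benefit":
        return list(values)
    if kind == "cost":
        if any(v <= 0 for v in values):
            raise ValueError("the reciprocal conversion assumes positive values")
        return [1.0 / v for v in values]
    if kind == "fixed":
        if target is None:
            raise ValueError("a fixed indicator needs a target value")
        return [1.0 / (1.0 + abs(v - target)) for v in values]
    raise ValueError(f"unknown indicator kind: {kind}")


print(to_benefit([2.0, 4.0], "cost"))                      # [0.5, 0.25]
print(to_benefit([8.0, 10.0, 13.0], "fixed", target=10.0))  # largest at the target
```

After such a conversion, every indicator points in the same direction, so distances and weights computed later are not distorted by mixed attributes.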
(1) Establish the dimensionless measured data matrix and the dimensionless evaluation standard matrix. The measured data matrix and the grade standard matrix are as follows:
Then establish the dimensionless measured data matrix and the dimensionless grade standard matrix:
(2) Calculate the weight of each evaluation index.
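Calculate the mean and standard deviation of each row vector of matrix B, and divide the standard deviation by the mean to obtain the coefficient of variation of each index.
Finally, the coefficients of variation are normalized so that they sum to 1, and the weight of each index is obtained as follows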
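The weighting step can be sketched as follows. The rows below are made-up numbers, not matrix B from the example, and the population standard deviation is an assumption; when all rows have the same length, using the sample standard deviation instead rescales every coefficient by the same factor and leaves the weights unchanged.

```python
import statistics


def cv_weights(rows):
    """Return one weight per row, proportional to its coefficient of variation."""
    cvs = []
    for row in rows:
        mean = statistics.fmean(row)
        if mean == 0:
            raise ValueError("a row with zero mean has no coefficient of variation")
        cvs.append(statistics.pstdev(row) / mean)
    total = sum(cvs)
    if total == 0:
        raise ValueError("every row is constant, so no weights can be derived")
    return [cv / total for cv in cvs]


# Each row is one index measured across several objects (made-up values).
rows = [
    [1.0, 2.0, 3.0],
    [10.0, 10.0, 10.0],  # constant row: carries no discriminating information
    [2.0, 2.0, 8.0],
]
print(cv_weights(rows))
```

An index that varies more across the objects receives a larger weight, and a constant index receives a weight of 0.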
(3) Establish a comprehensive evaluation model for the water quality of each lake.
Usually, the distance between two vectors can be used to measure how close they are. Matlab provides the following functions for calculating distances between vectors.
dist(w, p): Euclidean distance between each row vector of w and each column vector of p;
mandist(w, p): absolute distance between each row vector of w and each column vector of p.
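For readers without Matlab, the two functions can be imitated in plain Python. This is a rough equivalent written for illustration, not the Matlab implementation: `w` is a list of row vectors, `p` is a matrix given as a list of rows whose columns are the vectors to compare against, and the sample matrices are made up.

```python
import math


def dist(w, p):
    """Euclidean distance between each row of w and each column of p."""
    columns = list(zip(*p))  # transpose p so that each column becomes a tuple
    return [[math.dist(row, col) for col in columns] for row in w]


def mandist(w, p):
    """Absolute (Manhattan) distance between each row of w and each column of p."""
    columns = list(zip(*p))
    return [[sum(abs(a - b) for a, b in zip(row, col)) for col in columns]
            for row in w]


w = [[0.0, 0.0], [1.0, 1.0]]  # two row vectors
p = [[3.0, 0.0], [4.0, 1.0]]  # columns are the vectors (3, 4) and (0, 1)

print(dist(w, p))
print(mandist(w, p))  # [[7.0, 1.0], [5.0, 1.0]]
```

Entry [i][j] of each result is the distance between row i of `w` and column j of `p`, so a matrix with S rows compared against Q columns yields an S by Q table.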
Calculate the Euclidean distance between each lake's dimensionless data vector and each grade's dimensionless standard vector.
Each lake is assigned to the grade at the smallest distance.
This shows that Hangzhou West Lake and Wuhan East Lake are extremely eutrophic, Qinghai Lake is moderately eutrophic, and Chaohu Lake and Dianchi Lake are eutrophic.
Repeating the calculation with the absolute distance, each lake is assigned to the grade at the smallest distance.
The evaluation results are completely consistent with those obtained by Euclidean distance.
Therefore, the above calculation shows that although Euclidean distance and absolute distance are defined differently, they give the same water quality grades for the lakes, which indicates that the method is stable.
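The comparison of the two distances can be sketched as a nearest-grade classifier. It assumes that each object is assigned to the grade whose standard vector is closest, and it leaves out the index weights for brevity. The vectors are made up and are not the lake data, so the printed grades say nothing about the real lakes.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.dist(a, b)


def manhattan(a, b):
    """Absolute (Manhattan) distance between two vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(samples, standards, metric):
    """Assign each sample the 1-based number of its nearest grade standard."""
    grades = []
    for sample in samples:
        distances = [metric(sample, standard) for standard in standards]
        grades.append(distances.index(min(distances)) + 1)
    return grades


# Made-up dimensionless grade standards (grade 1 to grade 4) and samples.
standards = [[0.1, 0.1], [0.3, 0.3], [0.6, 0.6], [0.9, 0.9]]
samples = [[0.85, 0.95], [0.32, 0.25], [0.55, 0.70]]

print(classify(samples, standards, euclidean))  # [4, 2, 3]
print(classify(samples, standards, manhattan))  # [4, 2, 3]
```

The two metrics agree on this toy data, but agreement is not guaranteed in general: a sample lying almost midway between two grade standards can be assigned differently by the two distances.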