Mode: the value around which the data points of a statistical distribution are most visibly concentrated. It represents the general level of the data (a mode may not exist, or there may be more than one).
Formal definition: the value that occurs with the highest frequency in a set of data is called the mode. A set of numbers sometimes has several modes. The mode is denoted by M.
In plain terms: it is the value that appears most often in a set of data.
Representing a set of data by its mode is not very reliable, but the mode is unaffected by extreme values and is simple to find. If a few individual values in a data set vary greatly, the median is a more suitable choice for representing the "central tendency" of the data.
Application example: the mode is especially useful when the values or observations have no obvious order (as is often the case with non-numerical data), because the arithmetic mean and the median may then not be well defined. Example: the mode of {chicken, duck, fish, fish, chicken, fish} is fish.
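As a minimal sketch, the mode of the non-numerical example above can be found in Python with the standard library's `statistics.multimode`, which returns every value tied for the highest frequency. The variable names and the second data set are illustrative assumptions, not part of the original example.

```python
from collections import Counter
from statistics import multimode

animals = ["chicken", "duck", "fish", "fish", "chicken", "fish"]

# Count how often each value occurs.
counts = Counter(animals)

# multimode returns a list of all values sharing the highest frequency,
# so it also copes with data that has more than one mode.
modes = multimode(animals)
print(counts["fish"], modes)  # 3 ['fish']

# A hypothetical data set with two modes.
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)  # [1, 2]
```

No ordering or arithmetic on the values is needed, which is why the mode still works for categories such as animal names.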
Median: the value in the middle once the data are sorted. It divides the data into two parts, one with values greater than it and the other with values less than it.
Median position: when the number of samples n is odd, the median is the ((n+1)/2)-th value; when n is even, the median is the arithmetic mean of the (n/2)-th and (n/2+1)-th values.
In plain terms: arrange a set of data in order from smallest to largest; the middle number (or the average of the two middle numbers) is called the median of the data set.
The median is not distorted by extreme values and represents the middle level of the data as a whole.
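The position rule for odd and even sample sizes can be sketched in Python as follows. The function name `median_by_position` and the sample values are hypothetical, chosen only for illustration; in practice the standard library's `statistics.median` does the same job.

```python
def median_by_position(values):
    """Return the median using the sorted-position rule."""
    ordered = sorted(values)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_result = median_by_position([7, 1, 5])       # sorted: 1, 5, 7
even_result = median_by_position([7, 1, 5, 3])   # sorted: 1, 3, 5, 7
print(odd_result, even_result)  # 5 4.0
```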
Application example: when prices rise, appropriately raising the pension standard for enterprise retirees and the wages of current employees helps secure their basic living needs and gradually improve their quality of life. However, reporting only an "average" always leaves people a little uneasy, because an average can conceal many problems. Not long ago, netizens wrote this jingle: "In Zhang Village lives a man worth ten million, and his nine neighbors are penniless; average it out, and every one of them is worth a million." The fault here lies neither with the average nor with statistics. Statistics has a ready-made remedy, which is to compute the median. Taking a 51-person enterprise as an example, rank the annual incomes of all employees in order; the annual income of the middle one, that is, the 26th, is the median annual income of the enterprise. The median personal wealth in the jingle's Zhang Village is zero. What the average fails to reveal, the median makes clear.
Note: the data must be sorted (conventionally from smallest to largest) before the middle value is taken; the median cannot be read from randomly arranged data.
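The contrast between mean and median in the jingle can be checked with a short Python sketch. The wealth figures are an assumption chosen to match the stated results (an average of one million and a median of zero): one villager holding ten million and nine holding nothing.

```python
from statistics import mean, median

# Assumed figures: one villager with ten million, nine with nothing.
wealth = [10_000_000] + [0] * 9

average_wealth = mean(wealth)    # pulled up by the single extreme value
median_wealth = median(wealth)   # middle of the sorted data
print(average_wealth, median_wealth)

# For the 51-person enterprise, the median sits at position (n + 1) / 2.
middle_position = (51 + 1) // 2
print(middle_position)  # 26
```

The mean comes out at one million while the median is zero, which is the gap the jingle pokes fun at.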
Range is the difference between the largest and the smallest value in a set of data. In statistics, the range (also called the total range or extreme difference) is often used to describe the dispersion of a set of data. It reflects the span over which the variable's values are spread: the difference between the values of any two units in the population cannot exceed the range. It also reflects the amplitude of fluctuation in a set of data.
For example, for the data
12 12 13 14 16 21
the range is
21 - 12 = 9
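The same calculation as a minimal Python sketch, which also checks the stated property that no two values differ by more than the range. Variable names are illustrative.

```python
from itertools import combinations

data = [12, 12, 13, 14, 16, 21]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)
print(data_range)  # 9

# The difference between any two values never exceeds the range.
largest_pair_gap = max(abs(a - b) for a, b in combinations(data, 2))
print(largest_pair_gap <= data_range)  # True
```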
The range indicates only the maximum spread of the measured values; it does not use all the information in the measurements and cannot show in detail how consistent the values are with one another. The range is a biased estimate of the population standard deviation, but multiplied by a correction coefficient it can serve as an unbiased estimate of the population standard deviation. Its advantages are that it is simple to calculate, intuitive in meaning and convenient to apply, so it is still widely used in statistical data processing. However, it depends only on the two extreme values and cannot reflect how the variable is distributed between them, so it is easily influenced by extreme values.
Variance and standard deviation. The average of the squared differences between each data value in the sample and the sample mean is called the sample variance. The arithmetic square root of the sample variance is called the sample standard deviation. Both measure how much the sample fluctuates: the greater the sample variance or standard deviation, the greater the fluctuation of the sample data.
Variance and standard deviation are the most important and most commonly used measures of dispersion. Variance is the average of the squared deviations of each variable value from the mean, and it is the principal measure of dispersion for numerical data. The standard deviation is the square root of the variance and is denoted by S.
The difference between the two is that the standard deviation is expressed in the same unit as the variable itself, which makes it easier to interpret than the variance, so the standard deviation is used more often in analysis.
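A minimal Python sketch of these two definitions follows. The data values are hypothetical, chosen so the arithmetic comes out evenly. The sketch divides by n, following the "average of squared differences" wording here; many textbooks instead divide by n - 1 for a sample, which in Python corresponds to `statistics.variance` and `statistics.stdev`.

```python
from math import sqrt
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean_score = sum(scores) / len(scores)

# Variance: average of the squared deviations from the mean (divide by n).
variance = sum((x - mean_score) ** 2 for x in scores) / len(scores)

# Standard deviation: square root of the variance, in the data's own unit.
std_dev = sqrt(variance)
print(mean_score, variance, std_dev)  # 5.0 4.0 2.0

# The standard library's divide-by-n versions give the same results.
print(pvariance(scores) == variance, pstdev(scores) == std_dev)  # True True
```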
Application example: a school held a computer Chinese character input speed competition between Class A and Class B of Grade Three. The number of Chinese characters each student typed per minute was recorded. After tallying and calculation, the results are shown in the following table:
Class   Number of students   Average number of words   Median   Variance
A       55                   135                       149      191
B       55                   135                       151      110
A classmate draws the following conclusions according to the above table:
① The average level of students in Class A and Class B is the same;
② Class B has more excellent students than Class A (150 or more Chinese characters per minute counts as excellent);
③ The results of students in Class A fluctuate more than those in Class B.
The correct conclusions are _ _ _ _ _ (fill in the serial numbers).
Solution: fill in ①, ② and ③. Conclusions ① and ③ are clearly correct: both classes average 135 words, and the variance of Class A (191) is greater than that of Class B (110). For conclusion ②, each class has 55 students, so the median is the 28th score in sorted order. The median of Class A is 149, which is below 150, so at least 28 students in Class A are not excellent and fewer than half are excellent. The median of Class B is 151, so at least 28 students in Class B are excellent, which is more than half. Therefore Class B has more excellent students than Class A.
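The counting argument behind conclusion ② can be sketched in Python. This assumes that the 55 in the table is the number of students in each class; the function name `excellent_bounds` is hypothetical, and the sketch only handles an odd number of students, where the median is an actual score.

```python
def excellent_bounds(n_students, median_score, threshold):
    """Return (min, max) possible counts of scores >= threshold.

    Assumes n_students is odd, so the median is a single actual score.
    """
    if n_students % 2 == 0:
        raise ValueError("this sketch only handles an odd number of students")
    middle = (n_students + 1) // 2  # position of the median in sorted order
    if median_score >= threshold:
        # The median and every score after it in sorted order qualify.
        return n_students - middle + 1, n_students
    # The median and every score before it fall short of the threshold.
    return 0, n_students - middle


class_a = excellent_bounds(55, 149, 150)
class_b = excellent_bounds(55, 151, 150)
print(class_a, class_b)  # (0, 27) (28, 55)
```

Class A can have at most 27 excellent students while Class B has at least 28, so Class B has more whatever the individual scores are.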