What is group spacing grouping?
Group spacing grouping divides the full range of a variable's values into several consecutive intervals and treats the values that fall within each interval as one group. It is the basic form of grouping for numerical data.
In group spacing grouping, the boundary values between groups are called group limits: the minimum value of a group is its lower limit, and the maximum value is its upper limit. The difference between the upper limit and the lower limit is called the group distance. The average of the upper and lower limits is called the group median (the midpoint of the group), which serves as the representative value for the variable values in that group.
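The definitions above can be sketched in a few lines of Python. This is only an illustration: the class name `Group`, its field names, and the limits 105 and 110 are assumptions chosen for the example, not values taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """One group, described by its two group limits (hypothetical helper)."""
    lower: float  # lower limit: minimum value of the group
    upper: float  # upper limit: maximum value of the group

    @property
    def distance(self) -> float:
        # Group distance = upper limit - lower limit
        return self.upper - self.lower

    @property
    def midpoint(self) -> float:
        # Group median = average of the upper and lower limits
        return (self.lower + self.upper) / 2


# Illustrative limits only (assumed, not from the source data)
g = Group(lower=105, upper=110)
print(g.distance, g.midpoint)  # 5 107.5
```

The group median is used as a stand-in for every value in the group, which is why it is computed from the limits alone rather than from the data.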
To obtain the group distance, divide the range of the data (maximum value minus minimum value) by the number of groups to be formed (if the number is not specified, 10 is generally used).
Step 1: Determine the number of groups. How many groups suit a given set of data? This generally depends on the characteristics of the data and on the amount of data. Because one purpose of grouping is to observe how the data are distributed, the number of groups should be moderate. If there are too few groups, the distribution appears too concentrated; if there are too many, it appears too scattered. Either case makes it hard to observe the characteristics and regularities of the distribution. The number of groups should therefore be chosen so that it displays those characteristics clearly. In practice, the number of groups k can be determined with the empirical formula proposed by Sturges: k = 1 + lg(n) ÷ lg(2), where n is the number of data values and the result is rounded to a whole number.
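As a rough sketch of Step 1, Sturges' empirical formula k = 1 + log2(n) can be evaluated as follows. The function name `sturges_group_count` is hypothetical, rounding to the nearest whole number is an assumption (some texts round up instead), and the sample size n = 50 is an assumed value for illustration; the text does not state n.

```python
import math


def sturges_group_count(n: int) -> int:
    """Suggested number of groups k for n data values (Sturges' rule)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # k = 1 + log2(n), rounded to the nearest whole number (assumed convention)
    return round(1 + math.log2(n))


# n = 50 is an assumed sample size, not taken from the source
print(sturges_group_count(50))  # 7
```

The formula is only a starting point; the final number of groups can still be adjusted to suit the data.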
Step 2: Determine the group distance. The group distance is the difference between the upper and lower limits of a group. It can be determined from the maximum and minimum values of the data and the number of groups: group distance = (maximum value - minimum value) ÷ number of groups. For example, in the previous data the maximum value is 139 and the minimum value is 107, so the group distance = (139 - 107) ÷ 7 ≈ 4.6. For convenience of calculation, the group distance should preferably be a multiple of 5 or 10, so 4.6 is rounded up to 5. In addition, the lower limit of the first group should be lower than the minimum variable value, and the upper limit of the last group should be higher than the maximum variable value.
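The arithmetic in Step 2 can be reproduced with a short sketch. The function name `group_distance` and its `multiple` parameter are hypothetical; the values 107, 139 and 7 come from the example above, and rounding up (rather than to the nearest multiple) is an assumption made so that the groups are wide enough to cover the whole range.

```python
import math


def group_distance(minimum: float, maximum: float, k: int, multiple: int = 5):
    """Return (raw distance, distance rounded up to a multiple of `multiple`)."""
    raw = (maximum - minimum) / k                    # (max - min) / number of groups
    rounded = math.ceil(raw / multiple) * multiple   # round up to a convenient width
    return raw, rounded


raw, width = group_distance(107, 139, 7)
print(round(raw, 1), width)  # 4.6 5
```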
Step 3: Compile the frequency distribution table by counting how many variable values fall into each group.
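A minimal sketch of Step 3 follows. Several things here are assumptions rather than part of the text: the function name `frequency_table`, the convention that each group includes its lower limit and excludes its upper limit, the starting lower limit of 105, and the sample values themselves, which are made up purely for illustration (only their minimum 107 and maximum 139 match the example).

```python
def frequency_table(values, start, width, k):
    """Count values per group; group i covers [start + i*width, start + (i+1)*width)."""
    counts = [0] * k
    for v in values:
        index = int((v - start) // width)  # which group the value falls into
        if not 0 <= index < k:
            raise ValueError(f"value {v} lies outside the groups")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i]) for i in range(k)]


# Made-up sample values (assumption); 7 groups of width 5 starting at 105
sample = [107, 112, 117, 118, 121, 123, 124, 126, 128, 133, 139]
table = frequency_table(sample, start=105, width=5, k=7)

for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

Starting the first group at 105 keeps its lower limit below the minimum value, and the last group ends at 140, above the maximum value, as Step 2 requires.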