When the number of observations is large, a frequency distribution table (frequency table for short) can be compiled to show how a group of homogeneous observations is distributed and to simplify the calculation of summary indicators. It is compiled as follows.
(1) Find the range: identify the maximum and minimum of the observed values; their difference is the range, denoted by R.
(2) Determine the class intervals (groups) and the class width (group spacing): choose the number of classes according to the sample size, generally 8-15. Use fewer classes when there are few observation units and more classes when there are many. The class width is usually taken as one-tenth of the range, rounded to a convenient value to simplify tabulation and calculation. The first class must contain the minimum observed value and the last class must contain the maximum observed value. The starting point and the ending point of each class are called its lower limit and upper limit respectively. Each class includes its lower limit but not its upper limit, and its midpoint is (lower limit + upper limit)/2. The difference between the lower limits of two adjacent classes is the class width.
(3) Tabulate: list the class limits in the form of Table 2.1, then tally the original data by computer or by hand to obtain the number of observations in each class, that is, the frequency. Columns (1) and (3) of the table make up the required frequency table.
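The three steps above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function name frequency_table, the default of about 10 classes, the choice to start the first class at a multiple of the class width, and the heights list are all hypothetical and do not come from Table 2.1. It also assumes data measured on an integer scale, so the class width is rounded to a whole number.

```python
import math

def frequency_table(values, n_classes=10):
    """Return rows (lower, upper, midpoint, frequency) for classes [lower, upper)."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest                      # step (1): range R
    # Step (2): class width is about R / n_classes, rounded to a whole number
    # (assumption: integer-scale data; at least 1 to avoid a zero width).
    width = max(1, round(data_range / n_classes))
    # Start the first class at a multiple of the width at or below the minimum.
    start = math.floor(lowest / width) * width
    # Enough classes that the last one contains the maximum.
    n_rows = int((highest - start) // width) + 1
    # Step (3): tally each observation into its class (lower limit included,
    # upper limit excluded).
    counts = [0] * n_rows
    for v in values:
        counts[int((v - start) // width)] += 1
    return [
        (start + i * width, start + (i + 1) * width, start + (i + 0.5) * width, counts[i])
        for i in range(n_rows)
    ]

# Hypothetical heights in cm, for illustration only.
heights = [152, 155, 158, 160, 161, 163, 164, 165, 165,
           166, 167, 168, 170, 171, 173, 176, 181]

table = frequency_table(heights)
for lower, upper, midpoint, freq in table:
    print(f"{lower}-{upper}  midpoint {midpoint}  frequency {freq}")
```

Because the width is rounded and the first lower limit is moved down to a round value, the number of classes actually produced can differ slightly from n_classes; in practice the limits are adjusted by eye until they are convenient to read.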
frequency table
2. Frequency distribution characteristics
From the frequency table, two important characteristics of a frequency distribution can be seen: central tendency and dispersion. Heights vary from person to person, but most are concentrated in the middle of the range, that is, most people are of medium height; this is the central tendency. The frequencies decrease gradually from the middle heights toward both the shorter and the taller heights, which reflects the dispersion. For numerical variables, the pattern of the data can therefore be analyzed from these two aspects: central tendency and dispersion.
3. Types of frequency distribution
Frequency distributions can be divided into symmetrical distributions and skewed distributions. In a symmetrical distribution, most frequencies are concentrated in the center and the frequencies on the two sides are roughly symmetrical. In a skewed distribution, the frequencies are asymmetric and the peak lies to one side. If the peak lies toward the smaller values, so that the longer tail extends toward the larger values, the distribution is positively skewed. If the peak lies toward the larger values, the distribution is negatively skewed. For example, the age distribution of patients with chronic diseases such as coronary heart disease and most malignant tumors is negatively skewed. Both normal and skewed distributions are common in clinical data, and different distribution types call for different statistical analysis methods.
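One way to check the direction of skew numerically is the moment coefficient of skewness, which is positive when the long tail points toward large values and negative when it points toward small values. The sketch below is an illustration, not a method prescribed by the text: the function names, the sample lists, and the cut-off of 0.5 for calling a distribution "roughly symmetrical" are all assumptions.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:                 # all values identical: no skew to measure
        return 0.0
    return m3 / m2 ** 1.5

def classify(values, tolerance=0.5):
    """Label the distribution; the tolerance is an arbitrary illustrative cut-off."""
    g = skewness(values)
    if g > tolerance:
        return "positive skew"      # peak at small values, long tail to the right
    if g < -tolerance:
        return "negative skew"      # peak at large values, long tail to the left
    return "roughly symmetrical"

# Hypothetical samples, for illustration only.
right_tailed = [1, 1, 1, 2, 2, 3, 10]
left_tailed = [10, 10, 10, 9, 9, 8, 1]
balanced = [1, 2, 3, 4, 5]

for sample in (right_tailed, left_tailed, balanced):
    print(sample, classify(sample))
```

A coefficient computed from a small sample is unstable, so it is best read together with the frequency table or a histogram rather than on its own.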
4. Uses of the frequency table
A frequency table reveals the type and characteristics of the data distribution, which guides the choice of appropriate statistical methods; it simplifies the further calculation of indicators and statistical processing; and it makes it easy to spot suspicious values that are unusually large or small.
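As an example of calculating indicators from a frequency table, the mean and standard deviation can be approximated by treating every observation in a class as if it sat at the class midpoint. The sketch below assumes that approach; the function name grouped_mean_sd and the midpoints and frequencies are hypothetical and are not taken from Table 2.1.

```python
import math

def grouped_mean_sd(midpoints, freqs):
    """Approximate mean and sample SD from class midpoints and frequencies."""
    n = sum(freqs)                                              # total observations
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))       # sum of f * x
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))  # sum of f * x^2
    mean = sum_fx / n
    # Sample variance with n - 1 in the denominator (requires n > 1).
    variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)

# Hypothetical class midpoints (cm) and frequencies, for illustration only.
midpoints = [151.5, 154.5, 157.5, 160.5, 163.5, 166.5, 169.5, 172.5, 175.5]
freqs = [1, 1, 1, 2, 2, 4, 2, 2, 1]

mean, sd = grouped_mean_sd(midpoints, freqs)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```

The result is an approximation: the narrower the classes, the closer it comes to the values computed from the original observations.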