Mathematics section-statistics
percentile

Calculating a percentile: see the example on p. 438 of the PDF.

When y = (n + 1) * p / 100 is not an integer, for example 12.5, take the 12th and 13th values in sorted order and interpolate: v12 + 0.5 * (v13 - v12). That is, take a value between the two numbers in proportion to the fractional part of y.
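The interpolation rule can be sketched in a few lines of Python. This is only a minimal illustration: the function name `percentile` and the clamping at the two ends of the data are assumptions of this sketch, not something taken from the book.

```python
def percentile(values, p):
    """p-th percentile using the (n + 1) * p / 100 location rule."""
    data = sorted(values)
    n = len(data)
    pos = (n + 1) * p / 100      # 1-based position, may be fractional
    if pos <= 1:                 # assumption: clamp to the smallest value
        return data[0]
    if pos >= n:                 # assumption: clamp to the largest value
        return data[-1]
    lower = int(pos)             # whole part of the position, e.g. 12
    frac = pos - lower           # fractional part, e.g. 0.5
    # Move from the lower neighbour toward the upper one in proportion.
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


# 24 observations 1..24: the 50th percentile sits at position 25 * 0.5 = 12.5,
# halfway between the 12th value (12) and the 13th value (13).
sample = list(range(1, 25))
print(percentile(sample, 50))
```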

Please note that quintiles and quartiles are not necessarily exactly equal to the values obtained by converting them to the 20th and 25th percentiles. The reason is that the percentile position may not divide evenly, whereas the quartile or quintile position may.

Coefficient of variation: Note that this is the coefficient of variation, not covariance. It is defined as standard deviation / mean. For example, take two sets of numbers, one with small values and one with large values, that have the same standard deviation. The standard deviation alone cannot tell which set fluctuates more relative to its level, but dividing by the mean can.
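A small sketch of the comparison, using only the Python standard library. The two data sets are made up for illustration, and the sample standard deviation (`statistics.stdev`) is an assumption of this sketch.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


small = [1, 2, 3, 4, 5]            # small values
large = [101, 102, 103, 104, 105]  # same spread, much larger level

# Both sets have the same standard deviation...
print(statistics.stdev(small), statistics.stdev(large))
# ...but relative to its mean, the small set fluctuates far more.
print(coefficient_of_variation(small), coefficient_of_variation(large))
```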

Chebyshev inequality: for any k > 1, the proportion of data points that lie within plus or minus k standard deviations of the mean is at least 1 - 1/k^2.
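As a quick check of the bound, the sketch below compares 1 - 1/k^2 with the actual share of points inside the band for a made-up data set. The helper names and the use of the population standard deviation (`statistics.pstdev`) are assumptions of this sketch.

```python
import statistics


def chebyshev_bound(k):
    """Minimum share of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def share_within(values, k):
    """Actual share of values within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [v for v in values if abs(v - mean) <= k * sd]
    return len(inside) / len(values)


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]   # one large outlier
for k in (2, 3):
    print(k, chebyshev_bound(k), share_within(data, k))
```

The bound holds for any distribution, which is why it is usually quite loose for well-behaved data.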

Sharpe ratio: (portfolio return - risk-free return) / standard deviation of portfolio return. It measures the excess return (relative to the risk-free return) earned per unit of risk.

Without the risk-free return as a reference, the ratios are not comparable. For example, I could make the standard deviation very small while the return is also low, and the ratio could still come out relatively large. Subtracting the risk-free return puts everyone on the same starting line.

Note that if the Sharpe ratio is negative, a larger standard deviation can give a larger Sharpe ratio (one closer to 0). In that case it cannot be said that the larger the Sharpe ratio, the better.
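The negative case is easy to see with numbers. In this minimal sketch the function name and all return figures are hypothetical, chosen only to illustrate the effect.

```python
def sharpe_ratio(portfolio_return, risk_free_return, std_dev):
    """Excess return over the risk-free return per unit of standard deviation."""
    return (portfolio_return - risk_free_return) / std_dev


# Normal case: a 10% return against a 3% risk-free return with 14% volatility.
print(sharpe_ratio(0.10, 0.03, 0.14))

# Negative case: both portfolios return 1%, below the 3% risk-free return.
low_risk = sharpe_ratio(0.01, 0.03, 0.05)
high_risk = sharpe_ratio(0.01, 0.03, 0.20)

# The riskier portfolio has the "better" (less negative) ratio,
# although it is clearly not the better portfolio.
print(low_risk, high_risk)
```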

Another caveat about the Sharpe ratio is that it uses standard deviation to measure risk. Some strategies, such as certain high-frequency trading models, earn a little at a time but occasionally lose a lot, and for these the measure may not be suitable.

Skewness:

Sk = [sum of (each value - mean)^3 / (standard deviation)^3] / n. This form applies when n is large, for example n > 100. When n is small, the factor 1/n is replaced by n/((n-1)(n-2)).

Skewness measures whether the distribution as a whole is skewed to the left or to the right.

Kurtosis:

Kt = [sum of (each value - mean)^4 / (standard deviation)^4] / n. Note that the power is the fourth, not the cube. This form applies when n is large, for example n > 100. When n is small, the factor 1/n is replaced by n(n+1)/((n-1)(n-2)(n-3)).

Excess kurtosis:

Kt' = Kt - 3, where 3 is the kurtosis of the normal distribution. This value measures kurtosis relative to the normal distribution.

The greater the kurtosis, the fatter the tails and the larger the extreme deviations. Generally speaking, an excess kurtosis greater than 1 is considered relatively large.

Together, kurtosis and skewness give a basic picture of how a set of historical data deviates and where most of the values fall.

Skewness describes whether the data set as a whole leans to the left or to the right.
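The large-sample forms of skewness and kurtosis can be sketched as follows. The function names and data are made up, the sketch uses the plain 1/n factor (the small-sample corrections are left out), and the sample standard deviation is an assumption of this sketch.

```python
import statistics


def skewness(values):
    """Large-sample skewness: mean cubed deviation over s cubed."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return sum((v - mean) ** 3 for v in values) / (n * s ** 3)


def kurtosis(values):
    """Large-sample kurtosis: mean fourth-power deviation over s to the fourth."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return sum((v - mean) ** 4 for v in values) / (n * s ** 4)


def excess_kurtosis(values):
    """Kurtosis relative to the normal distribution, whose kurtosis is 3."""
    return kurtosis(values) - 3


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 1, 2, 10]     # one large value pulls the tail to the right

print(skewness(symmetric), skewness(right_tail))
print(kurtosis(symmetric), kurtosis(right_tail))
```

With only five points these numbers are purely illustrative; the formulas are meant for large samples.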

Lessons from mistakes:

1. Do not read the question casually. Read it carefully, and check the start and end dates.

2. Use the harmonic mean to calculate the average price. Harmonic mean = n / (1/x1 + 1/x2 + ... + 1/xn).
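For the second point, a short sketch shows why the harmonic mean is the right average when the same amount of money is invested at each price. The prices and the amount are hypothetical; `statistics.harmonic_mean` is part of the Python standard library.

```python
import statistics

prices = [10, 20]          # price per share in two periods
amount = 1000              # same amount invested in each period

# Direct calculation: total money spent divided by total shares bought.
shares = [amount / p for p in prices]          # 100 shares, then 50 shares
average_cost = amount * len(prices) / sum(shares)

# The harmonic mean of the prices gives the same answer.
print(average_cost, statistics.harmonic_mean(prices))

# The arithmetic mean (15) overstates the average cost per share.
print(statistics.mean(prices))
```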

Understanding conditional probability:

Conditional probability is easiest to understand as follows: picture the probabilities of A and B as two circles. The probability that A and B happen at the same time is the intersection of the two circles. The conditional probability is the share of circle B's area that the intersection occupies, that is, P(A|B) = P(AB)/P(B). But this picture does not explain the independence of A and B.

Note that if A and B are independent, so that P(AB) = P(A)P(B), then P(A|B) = P(A) follows from the definition above. Therefore, in most scenarios where conditional probability is worth considering, A and B are not independent of each other.

Another formula is:

P(AB) = P(A|B)P(B), which is a simple rearrangement of the first formula.

The example on p. 492 of the book (A is the event that the return is greater than the risk-free return, with probability 0.7; B is the event that the return is greater than 0, with probability 0.8) is a special case. If the return is greater than the risk-free return, then it must be greater than 0, so 0.7 is also P(AB), in line with the formula above.
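Working the numbers from this example through the definition gives the conditional probability directly. The function name is an assumption of this sketch; the probabilities 0.7 and 0.8 are the ones quoted from the book.

```python
def conditional_probability(p_joint, p_condition):
    """P(A|B) = P(AB) / P(B)."""
    return p_joint / p_condition


p_a = 0.7      # return is greater than the risk-free return
p_b = 0.8      # return is greater than 0
p_ab = p_a     # A implies B here, so P(AB) equals P(A)

p_a_given_b = conditional_probability(p_ab, p_b)
print(p_a_given_b)          # about 0.875

# A and B are not independent: P(A|B) differs from P(A),
# and P(AB) differs from P(A) * P(B).
print(p_a_given_b != p_a, abs(p_ab - p_a * p_b) > 1e-9)
```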

What we discussed earlier is sample-based statistics, which describe the central tendency and the dispersion of a set of data.

Mathematical expectation is an expectation, a prediction, not a description of existing data. But the basis for the prediction is still some existing information. Another interpretation of the expected value is the sample mean as the sample size tends to infinity.

Suppose some information about the distribution of a random variable is known: it takes the value v1 with probability p1, the value v2 with probability p2, and so on. The predicted value of the random variable, E(X) = p1*v1 + p2*v2 + ..., is its mathematical expectation.

In the previous sample statistics, variance is used to describe the degree to which samples deviate from the mean. The variance of a random variable is defined by the following formula:

Var(X) = E[(X - E(X))^2]

That is, the variance of a random variable is the mathematical expectation of the squared difference between the random variable and its mathematical expectation.

(X - E(X))^2 itself cannot be calculated, because X is a random variable. But its mathematical expectation can be calculated from the existing information. In the formula below, X1, X2 to Xn each represent a sample point (an outcome that we estimate will occur with a certain probability), and P(Xi) represents the probability of that sample point.

Therefore, the above variance formula becomes:

Var(X) = P(X1)*(X1 - E(X))^2 + P(X2)*(X2 - E(X))^2 + ... + P(Xn)*(Xn - E(X))^2

This formula is important.
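The probability-weighted expectation and variance can be sketched directly. The scenario values and probabilities are invented for illustration, and the function names are assumptions of this sketch.

```python
def expectation(outcomes):
    """E(X): sum of value * probability over all (value, probability) pairs."""
    return sum(p * v for v, p in outcomes)


def variance(outcomes):
    """Var(X): probability-weighted squared deviation from E(X)."""
    mu = expectation(outcomes)
    return sum(p * (v - mu) ** 2 for v, p in outcomes)


# Hypothetical return scenarios: (return, probability)
scenarios = [(-0.10, 0.25), (0.05, 0.50), (0.20, 0.25)]

print(expectation(scenarios))   # about 0.05
print(variance(scenarios))      # about 0.01125
```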

This tree diagram is also important. Drawing the tree is very helpful for clarifying the reasoning.

If a random variable is made up of several random variables, each with a certain weight, we have the mathematical model of a portfolio.

To find the mathematical expectation of the portfolio, find the mathematical expectation of each component, multiply it by its weight, and add the results: E(Rp) = w1*E(R1) + w2*E(R2) + ... + wn*E(Rn).

The variance of a portfolio is given by the following formula, where Rp represents the return of the portfolio:

Var(Rp) = E[(Rp - E(Rp))^2], with Rp = w1*R1 + w2*R2 + ... + wn*Rn

Here w1 represents the weight of the first random variable in the portfolio, and R1 represents the first random variable.

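Expanding the square, it is finally concluded that:

Var(Rp) = sum over i and j of wi * wj * Cov(Ri, Rj), for i, j = 1, ..., n

The terms form a neat n*n matrix whose diagonal holds the variance of each random variable. The covariance Cov is defined as:

Cov(Ri, Rj) = E[(Ri - E(Ri)) * (Rj - E(Rj))]

When i == j, the covariance is the variance.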
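The double sum over the N*N covariance matrix can be sketched as below for a two-asset portfolio. The weights, expected returns and covariance matrix are hypothetical, and the function names are assumptions of this sketch.

```python
def portfolio_expectation(weights, expected_returns):
    """E(Rp): weighted sum of the component expectations."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Var(Rp): sum of w_i * w_j * Cov(R_i, R_j) over every cell of the matrix."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


weights = [0.6, 0.4]
expected_returns = [0.08, 0.12]
# Diagonal entries are variances; off-diagonal entries are covariances.
cov = [[0.04, 0.006],
       [0.006, 0.09]]

print(portfolio_expectation(weights, expected_returns))   # about 0.096
print(portfolio_variance(weights, cov))                   # about 0.03168
```

Because the matrix is symmetric, each off-diagonal covariance is counted twice, which is where the familiar 2 * w1 * w2 * Cov(R1, R2) term in the two-asset formula comes from.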

Generally speaking, the expectation of a product is not equal to the product of the expectations unless the variables are independent of each other. So covariance cannot be understood as E(Ri - E(Ri)) * E(Rj - E(Rj)).

(R - E(R)), which may be negative or positive, represents how far and in which direction the random variable deviates from its expectation.

Covariance is built from the product of the two variables' deviations from their expectations. The expectation of this product reflects the average value of the product when the sample is infinitely large. When the sample is large, the product shows the following trends:

1. If the two variables are uncorrelated (here uncorrelated means no linear correlation) or independent, the average product tends to 0, because with more samples the positive and negative products eventually balance out.

2. If the two variables tend to move in the same direction, the average product is positive over many samples; if they tend to move in opposite directions, it is negative.

The sign of the covariance indicates the direction in which the two variables move together. Its magnitude is meaningless in general mathematical problems, because the units of the two variables may be very different. However, when calculating the return on investment, because returns typically fluctuate between -100% and 100%, the magnitude also measures the fluctuation.

For general mathematical problems, it is necessary to eliminate the influence of units. The way to do this is to divide the covariance by the standard deviation of each variable, which gives the correlation coefficient:

Correlation coefficient = Cov(A, B) / (Std(A) * Std(B))

Note that the correlation here refers to linear correlation. A correlation coefficient of 1 indicates perfect positive correlation, 0 indicates no linear correlation, and -1 indicates perfect negative correlation.
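The effect of units on covariance, and how the correlation coefficient removes it, can be seen in a small sketch. The data are made up, the function names are assumptions, and the sample (n - 1) versions of covariance and standard deviation are used here.

```python
import statistics


def covariance(xs, ys):
    """Sample covariance: average product of deviations from the means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]             # moves exactly with xs
xs_scaled = [100 * x for x in xs] # same data in a unit 100 times smaller

# Covariance changes with the unit; correlation does not.
print(covariance(xs, ys), covariance(xs_scaled, ys))
print(correlation(xs, ys), correlation(xs_scaled, ys))

# Reversing ys gives perfect negative correlation.
print(correlation(xs, ys[::-1]))
```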

Definition of independence for two random variables: P(AB) = P(A) * P(B), which is consistent with the conditional probability discussed earlier.

Definition of two uncorrelated random variables: E(AB) = E(A) * E(B).

Bayes' formula:

The formula is relatively simple and can be obtained by rearranging the conditional probability:

Because P(A|B)*P(B)=P(B|A)*P(A)=P(AB)

So P(A|B)=P(B|A)*P(A)/P(B)

The key is the practical significance of this formula and how to use it. When P(A) is what is already known and B is new information, the formula tells us how to update the probability of A once B has occurred.
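A minimal sketch of such an update follows. All probabilities are hypothetical. One step here is not in the notes above: P(B) is computed with the law of total probability, P(B) = P(B|A)P(A) + P(B|not A)P(not A), which is an assumption added for this sketch so that the example is self-contained.

```python
def bayes_update(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from the prior P(A) and the two likelihoods of B."""
    # Assumption of this sketch: P(B) via the law of total probability.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    # Bayes' formula: P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * prior_a / p_b


# Hypothetical setting: A = "the market rises this year",
# B = "an analyst issues a positive report".
prior = 0.5                       # known information: P(A)
posterior = bayes_update(prior, p_b_given_a=0.8, p_b_given_not_a=0.4)

# The new information B raises the probability of A from 0.5 to about 0.667.
print(prior, posterior)
```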