Bayes' theorem addresses the problem of inverse probability, so we first need to know what forward probability and inverse probability are. Let's use the example of drawing balls from a bag, a staple of middle school math class.
Suppose a bag holds m white balls and n black balls, and we draw one ball at random. What is the probability of drawing a white ball? This one is easy: the probability is m/(m+n).
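As a small illustration of the forward probability above, here is a minimal Python sketch. The function names and the bag of 3 white and 7 black balls are made up for the example; the simulation draws one ball at a time with replacement to check the exact value m/(m+n).

```python
import random
from fractions import Fraction


def forward_probability(m, n):
    """Exact probability of drawing a white ball from m white and n black balls."""
    return Fraction(m, m + n)


def simulate_draws(m, n, trials, seed=0):
    """Estimate the same probability by drawing one ball many times (with replacement)."""
    rng = random.Random(seed)
    bag = ["white"] * m + ["black"] * n
    hits = sum(rng.choice(bag) == "white" for _ in range(trials))
    return hits / trials


# Hypothetical bag: 3 white balls and 7 black balls.
exact = forward_probability(3, 7)
estimate = simulate_draws(3, 7, 100_000)
print(exact, estimate)
```

With enough trials the simulated frequency should land close to the exact fraction, which is all that "forward probability" means here: the contents of the bag are known, so the answer follows directly.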
This is a forward probability problem. We know the distribution of black and white balls in the bag in advance, so we can easily work out the probability of drawing a white ball or a black ball.
Now suppose we don't know the ratio of black balls to white balls in the bag in advance. Instead, we close our eyes and draw some balls, observe the ratio of white to black among the balls we drew, and use it to infer the ratio in the bag.
This is inverse probability: we don't know the distribution of black and white balls and have to infer it from what we observe. Inverse probability is widely used in the real world because our ability to observe is limited. When studying the diversity of marine life, or estimating the share of defective products in a batch, we cannot count every item, so we have to reason from a sample.
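The inverse direction can be sketched the same way. In the Python snippet below, the bag contents (300 white, 700 black), the sample size of 50, and the function name are all hypothetical; the "hidden" bag is only used to generate a sample, and the estimate is a plain observed frequency rather than a full Bayesian treatment.

```python
import random


def estimate_white_fraction(sample):
    """Estimate the share of white balls in the bag from the balls we drew."""
    return sample.count("white") / len(sample)


# Hidden truth, used only to produce a sample (made-up numbers).
rng = random.Random(42)
hidden_bag = ["white"] * 300 + ["black"] * 700

# Draw 50 balls with our eyes closed (without replacement).
sample = rng.sample(hidden_bag, 50)

estimate = estimate_white_fraction(sample)
print(f"estimated share of white balls: {estimate:.2f}")
```

The estimate will usually differ somewhat from the true share and tends to get closer as the sample grows, which is exactly the situation described above: we reason about the whole bag from the part we were able to observe.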
Let's look at the Bayes formula first. There is no need to memorize it yet; a general impression is enough for now.
P(A | B) = P(A) * P(B | A) / P(B)
Here is the scenario: suppose the ratio of boys to girls in a middle school is 60% to 40%. Boys always wear pants, while girls wear either pants or skirts.
If we see a student wearing pants, what is the probability P(Girl | Pants) that the student is a girl?
Step 1: We need to know how many students are wearing pants. Suppose the total number of students in the school is M. The number of students wearing pants is the number of boys wearing pants plus the number of girls wearing pants, that is, M * P(Boy) * P(Pants | Boy) + M * P(Girl) * P(Pants | Girl), where P(Pants | Boy) and P(Pants | Girl) are conditional probabilities: the probability of wearing pants given that the student is a boy or a girl, respectively.
Step 2: Next, we need the number of girls wearing pants. First take the girls, then the girls among them who wear pants, which gives M * P(Girl) * P(Pants | Girl).
Step 3: Calculate P(Girl | Pants), the probability that a student wearing pants is a girl, by dividing the result of step 2 by the result of step 1: M * P(Girl) * P(Pants | Girl) / (M * P(Boy) * P(Pants | Boy) + M * P(Girl) * P(Pants | Girl)). Cancelling M from the numerator and the denominator gives P(Girl) * P(Pants | Girl) / (P(Boy) * P(Pants | Boy) + P(Girl) * P(Pants | Girl)).
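The three counting steps can be sketched in Python as follows. Note the assumption: the text does not say how often girls wear pants, so P(Pants | Girl) = 0.5 is a made-up value for illustration only, as are the function name and the two school sizes.

```python
import math


def girl_given_pants_by_counting(total, p_boy, p_pants_given_boy,
                                 p_girl, p_pants_given_girl):
    """Follow the three steps with head counts for a school of `total` students."""
    # Step 1: everyone wearing pants = boys in pants + girls in pants.
    boys_in_pants = total * p_boy * p_pants_given_boy
    girls_in_pants = total * p_girl * p_pants_given_girl
    students_in_pants = boys_in_pants + girls_in_pants
    # Step 2 is girls_in_pants; step 3 divides it by the step 1 count.
    return girls_in_pants / students_in_pants


# 60% boys who always wear pants, 40% girls.
# ASSUMPTION: girls wear pants half of the time (not given in the text).
small_school = girl_given_pants_by_counting(50, 0.6, 1.0, 0.4, 0.5)
large_school = girl_given_pants_by_counting(1000, 0.6, 1.0, 0.4, 0.5)

print(small_school, large_school)
print(math.isclose(small_school, large_school))  # school size cancels out
```

Under that assumed 0.5, both calls give about 0.25, and the result does not depend on the school size, which mirrors the cancellation of M in step 3.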
Working through the three steps above, we finally get:
P(Girl | Pants) = P(Girl) * P(Pants | Girl) / (P(Boy) * P(Pants | Boy) + P(Girl) * P(Pants | Girl))
The denominator, P(Boy) * P(Pants | Boy) + P(Girl) * P(Pants | Girl), is the overall probability of wearing pants, which we can write as P(Pants). The formula for the probability that a student wearing pants is a girl then becomes:
P(Girl | Pants) = P(Girl) * P(Pants | Girl) / P(Pants)
In this way the inverse probability, the probability that a person wearing pants is a girl, is expressed in terms of forward probabilities that we can obtain directly. Replacing Girl and Pants with A and B gives the Bayes formula that Xiaoyu showed earlier.
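To make the substitution of A and B concrete, here is a minimal Python sketch of the general formula. The helper names `bayes` and `total_probability` are hypothetical, and P(Pants | Girl) = 0.5 is again an assumed value, since the text does not specify it.

```python
def total_probability(priors, likelihoods):
    """P(B) as the sum of P(A_i) * P(B | A_i) over all groups A_i."""
    return sum(p * l for p, l in zip(priors, likelihoods))


def bayes(p_a, p_b_given_a, p_b):
    """P(A | B) = P(A) * P(B | A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a * p_b_given_a / p_b


# A = Girl, B = Pants.
p_boy, p_girl = 0.6, 0.4
p_pants_given_boy = 1.0
p_pants_given_girl = 0.5  # ASSUMPTION: not stated in the text

# Denominator: P(Pants) = P(Boy) * P(Pants | Boy) + P(Girl) * P(Pants | Girl)
p_pants = total_probability([p_boy, p_girl],
                            [p_pants_given_boy, p_pants_given_girl])

p_girl_given_pants = bayes(p_girl, p_pants_given_girl, p_pants)
print(f"P(Pants) = {p_pants:.2f}, P(Girl | Pants) = {p_girl_given_pants:.2f}")
```

Keeping the denominator in its own function reflects the structure of the derivation: P(Pants) is built from forward probabilities for each group, and only then is the inverse probability obtained by a single division.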