The Bayesian principle, Bayesian classification, and Naive Bayes are three distinct concepts.
The Bayesian principle is the broadest of the three: it solves the "inverse probability" problem in probability theory. On the basis of this principle, people designed Bayesian classifiers. The Naive Bayes classifier is one kind of Bayesian classifier, and also the simplest and most commonly used one. Naive Bayes is "naive" because it assumes the attributes are mutually independent, which is a strong constraint in practice. **If the attributes are correlated, classification accuracy decreases.** Fortunately, in most cases Naive Bayes still classifies well.
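In standard notation, for a sample with attribute values $x_1, \dots, x_n$ and candidate class $C_k$, Bayes' theorem and the independence assumption combine as follows:

```latex
% Bayes' theorem: posterior = likelihood x prior / evidence.
P(C_k \mid x_1, \dots, x_n) = \frac{P(C_k)\, P(x_1, \dots, x_n \mid C_k)}{P(x_1, \dots, x_n)}

% The "naive" assumption: attributes are conditionally independent given
% the class, so the joint likelihood factorizes into per-attribute terms.
P(x_1, \dots, x_n \mid C_k) = \prod_{i=1}^{n} P(x_i \mid C_k)

% Decision rule: the evidence P(x_1, ..., x_n) is the same for every class,
% so we simply pick the class with the largest prior-times-likelihood product.
\hat{y} = \arg\max_{k} \; P(C_k) \prod_{i=1}^{n} P(x_i \mid C_k)
```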
A Naive Bayes classifier relies on an accurate natural probability model and can achieve very good classification results on supervised training sets. In many practical applications, the parameters of a Naive Bayes model are estimated by maximum likelihood; in other words, a Naive Bayes model can work without assuming Bayesian probabilities or using any other Bayesian method.
Naive Bayes classification is often used in text classification, where it works especially well for English and similar languages. Typical applications include spam filtering, sentiment prediction, and recommendation systems.
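As a minimal sketch of the text-classification use case, here is a toy spam filter built with scikit-learn's `CountVectorizer` and `MultinomialNB`; the tiny corpus and its labels are invented for illustration, not taken from any real dataset:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus: 1 = spam, 0 = ham (labels invented for illustration).
texts = [
    "win a free prize now",
    "limited offer click here",
    "meeting rescheduled to monday",
    "please review the attached report",
]
labels = [1, 1, 0, 0]

# Turn each message into bag-of-words count features.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Train the multinomial Naive Bayes classifier.
clf = MultinomialNB()
clf.fit(X, labels)

# Classify a new message; predict_proba exposes the posterior estimates.
new = vectorizer.transform(["free offer, click now"])
print(clf.predict(new))        # predicted class label
print(clf.predict_proba(new))  # posterior probability for each class
```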
1. The prior probability must be known
The prior probability is the basis for computing the posterior probability. In traditional probability theory, the prior probability can be approximated by the frequencies observed over a large number of repeated trials; this rests on the law of large numbers and is the so-called "frequentist" view. The Bayesian school of statistics holds that time is one-directional and many events cannot be repeated, so the prior probability can only be assigned by subjective judgment of confidence, or, one could say, is determined by "degree of belief".
2. The prior probability is corrected using newly obtained information
Without any other information, if we want to make a classification judgment we can only assign a sample to the class with the highest prior probability. After obtaining more feature information about the sample, the prior probability can be updated with Bayes' formula to obtain the posterior probability, which improves the accuracy and confidence of the classification decision.
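As a worked example (all numbers invented for illustration): suppose the prior for spam is $P(S) = 0.5$, the word "free" appears in spam with probability $P(f \mid S) = 0.2$, and in legitimate mail with probability $P(f \mid \bar S) = 0.02$. Observing "free" updates the prior as follows:

```latex
P(S \mid f)
  = \frac{P(f \mid S)\,P(S)}{P(f \mid S)\,P(S) + P(f \mid \bar S)\,P(\bar S)}
  = \frac{0.2 \times 0.5}{0.2 \times 0.5 + 0.02 \times 0.5}
  = \frac{0.1}{0.11} \approx 0.909
```

A single observed feature raised the belief that the message is spam from 0.5 to about 0.91.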
3. Classification decisions carry an error rate
Because Bayesian classification only estimates the probability that a sample belongs to each category given its feature values, it cannot determine the sample's true category with certainty. There is therefore always some error rate in the classification decision: even if the error rate is very low, misclassifications can occur.
A Naive Bayes classifier is built and used in three stages.

The first stage: preparation
In this stage we determine the feature attributes, clarify what the predicted values are, partition each feature attribute appropriately, and then manually classify a portion of the data to form the training samples. A tiny sketch of attribute partitioning follows.
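For example, "dividing a feature attribute" can mean binning a continuous value into discrete ranges; the attribute name and thresholds below are invented for illustration:

```python
# Discretize a continuous attribute into categorical bins (thresholds invented).
def bin_age(age):
    if age < 18:
        return "minor"
    elif age < 60:
        return "adult"
    return "senior"

print([bin_age(a) for a in (12, 35, 70)])  # ['minor', 'adult', 'senior']
```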
The second stage: training
This stage generates the classifier. The main task is to compute the frequency of each category in the training samples and the conditional probability of each feature-attribute partition given each category; the sketch after the third stage implements this together with the application stage.
The third stage: application
In this stage, the classifier is used to classify new data.
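To make the training and application stages concrete, here is a minimal from-scratch sketch for categorical features. The tiny weather-style dataset and the Laplace smoothing constant are illustrative assumptions, not from the article:

```python
from collections import Counter, defaultdict

# Training stage: toy categorical dataset (feature tuple, label),
# invented purely for illustration.
data = [
    (("sunny", "hot"),  "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("sunny", "cool"), "yes"),
]

# Frequency of each category in the training samples.
class_counts = Counter(label for _, label in data)

# Conditional counts: (class, feature position) -> value frequencies.
cond_counts = defaultdict(Counter)
for features, label in data:
    for i, value in enumerate(features):
        cond_counts[(label, i)][value] += 1

# Distinct values per feature position, used for Laplace smoothing.
n_features = len(data[0][0])
vocab_sizes = [len({f[i] for f, _ in data}) for i in range(n_features)]

def posterior_scores(features, alpha=1.0):
    """Score each class by prior times the product of smoothed conditionals."""
    scores = {}
    for label, n in class_counts.items():
        score = n / len(data)  # prior P(C)
        for i, value in enumerate(features):
            # Laplace smoothing keeps unseen values from zeroing the product.
            score *= (cond_counts[(label, i)][value] + alpha) / (n + alpha * vocab_sizes[i])
        scores[label] = score
    return scores

# Application stage: classify a new, unseen sample.
sample = ("rainy", "hot")
scores = posterior_scores(sample)
print(max(scores, key=scores.get), scores)  # predicted class and raw scores
```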
Advantages:
(1) The Naive Bayes model is derived from classical mathematical theory and has stable classification performance.
(2) It performs well on small-scale data, can handle multi-class tasks, and is suitable for incremental training: when the data does not fit in memory, we can train in batches (see the sketch after this list).
(3) It is insensitive to missing data, and the algorithm is relatively simple, so it is often used for text classification.
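To illustrate the incremental-training point, here is a minimal sketch using scikit-learn's `partial_fit`; the batches are random data, standing in for chunks of a dataset too large for memory:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
clf = MultinomialNB()
classes = np.array([0, 1])

# Pretend the full dataset does not fit in memory and arrives in batches.
for i in range(5):
    X_batch = rng.integers(0, 10, size=(100, 20))  # count-style features
    y_batch = rng.integers(0, 2, size=100)
    # partial_fit updates the stored counts without revisiting old batches;
    # the full class list must be declared on the first call.
    clf.partial_fit(X_batch, y_batch, classes=classes if i == 0 else None)

print(clf.predict(rng.integers(0, 10, size=(3, 20))))
```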
Disadvantages:
(1) In theory, the Naive Bayes model has the smallest error rate compared with other classification methods. In practice this is not always the case, because Naive Bayes assumes the attributes are mutually independent given the output class, which often does not hold. When there are many attributes or the attributes are strongly correlated, classification quality suffers; when the correlations are small, Naive Bayes performs at its best. Algorithms such as semi-naive Bayes improve on this by taking partial correlations into account.
(2) The prior probability must be known, and it often depends on assumptions. There are many possible prior models, so in some cases the assumed prior leads to poor predictions.
(3) Because the posterior probability, and hence the classification, is determined by the prior together with the data, the classification decision carries a certain error rate.
(4) It is sensitive to how the input data is represented.