How can product managers improve the accuracy and recall of search?
In the search business, accuracy and recall are the two indicators no one can avoid.

From a product manager's point of view, do these two indicators need to be improved, and how should the product manager work with R&D engineers to improve them?

As shown in the figure:

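We want to select the correct part (T) from the complete set, but the selection will inevitably include some misjudgments (wrong items treated as right) and miss some correct items. This gives us two rates: accuracy (usually called precision in information retrieval) is the share of selected items that are actually correct, and recall is the share of correct items that were actually selected.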
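As a minimal sketch of these two definitions, the snippet below computes both rates from a selected set and the correct set (T). The function name and the document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of `retrieved` against the correct set `relevant`."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # items that were selected and are correct
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 correct documents, 3 selected, 2 of them correct.
relevant_docs = {"d1", "d2", "d3", "d4"}
retrieved_docs = {"d1", "d2", "d5"}
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here "d5" is a misjudgment that lowers precision, while the missed "d3" and "d4" lower recall.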

The basic model of search is to index document content word by word and then, in response to the user's search terms, retrieve the matching documents through the index structure. Retrieval is itself a selection from a complete set, which is why we can't talk about search without mentioning accuracy and recall.
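The index-then-retrieve model can be sketched as a toy inverted index. This is only an illustration under simple assumptions (whitespace tokenization, all query words must match); real search engines use far more elaborate analysis and ranking, and the sample documents are invented.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = None
    for word in words:
        postings = index.get(word, set())
        result = set(postings) if result is None else result & postings
    return result


# Hypothetical documents for illustration.
docs = {
    1: "weather forecast for Beijing",
    2: "iPhone review and price",
    3: "Beijing weather today",
}
index = build_index(docs)
print(search(index, "beijing weather"))
```

Requiring every query word to match favors accuracy; loosening it to any word would favor recall.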

To optimize the raw numbers, an algorithm depends on three things: more accurate and larger sample data, a better model, and more powerful computing power.

A better model. This is mostly a matter of modeling and mathematics. When we apply a newer model, we usually get better classification and prediction results. From early decision trees and other classic machine learning methods to today's deep learning, model complexity has gradually increased, and the quality of model output has improved greatly along with it.

More powerful computing power. To move from the ideal to the real, an optimized model has to be backed by computing power. As the saying goes, "brute strength beats ten clever moves": only with enough computing power behind it can a model go online and be applied at industrial scale. Take the familiar search engine Google as an example: the daily power consumption of Google's cloud computing centers is said to be equivalent to that of the whole city of Geneva, Switzerland.

More accurate and larger sample data. A model backed by computing power is essentially fitting, that is, trying again and again to score as high as possible on the "test" posed by the sample data. Because the sample data is what the model fits, only data that is rich and accurate enough can "train" a good algorithm. Otherwise, as the old saying goes, "garbage in, garbage out": noisy data fed to the model will only make its output a mess.

Of these three items, it is not difficult to see that more and better sample data is the one most closely tied to the product manager.

So we keep labeling data, just like teaching a child, tirelessly accumulating enough positive and negative samples for the algorithm to compute on, and thereby improving its effectiveness. In specific industrial applications, we can also build targeted dictionaries, such as a city dictionary, a company dictionary, or a celebrity dictionary, and improve the quality of the data the algorithm depends on by feeding it something close to standard answers.
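A targeted dictionary can be used, for example, to tag the entities in a query before it reaches the algorithm. The sketch below is a simplified illustration: the dictionary contents, labels, and the `tag_query` helper are all hypothetical, and it only handles single-word entries.

```python
# Hypothetical hand-built dictionaries; real ones would be far larger.
DICTIONARIES = {
    "city": {"beijing", "shanghai", "geneva"},
    "company": {"apple", "google", "baidu"},
}


def tag_query(query, dictionaries=DICTIONARIES):
    """Return (token, label) pairs for query tokens found in a dictionary."""
    tags = []
    for token in query.lower().split():
        for label, entries in dictionaries.items():
            if token in entries:
                tags.append((token, label))
    return tags


print(tag_query("Apple store Shanghai"))
```

Each dictionary hit acts like a near-standard answer: the algorithm no longer has to guess that "shanghai" is a city.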

What is the product manager thinking? User satisfaction in specific scenarios.

When users use search, they need to get satisfactory results in the shortest time.

Accuracy and recall are basic descriptions of the search results that users are satisfied with, but they are not complete descriptions.

That is, accuracy and recall must reach a certain threshold, but once they exceed it, each further gain in these two indicators contributes less and less to user satisfaction.

Product managers need to think more about what actually makes users satisfied in a given situation.

A typical example: the user searches for "weather". All they need is one accurate weather forecast.

Here the user doesn't care about recall, that is, whether we recall 90,000 or 99,000 of the 100,000 weather pages on the whole web. As the saying goes, of all the water in the river, a single ladleful is enough.

Furthermore, when the user's search intent is particularly clear, we can put the answer directly on the search results page so that the user does not even have to click a link. That is the point of Baidu's Aladdin cards.

From the perspective of improving user satisfaction, there will be more problems to be considered.

For example, when users search for "iPhone", their intentions are relatively unclear.

Do they want to buy an iPhone, read reviews about it, or visit Apple's official website?

In this case, we can only sort the results according to the statistics and the preferences of most people.
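One simple way to sort by the preferences of most people is to count what past users clicked for the same query. The sketch below assumes a hypothetical click log of (query, clicked result type) pairs; the log format and result labels are invented for illustration.

```python
from collections import Counter


def rank_by_clicks(click_log, query):
    """Order result types for `query` by how often past users clicked them."""
    counts = Counter(result for q, result in click_log if q == query)
    return [result for result, _ in counts.most_common()]


# Hypothetical click log: (query, type of result the user clicked).
click_log = [
    ("iphone", "official_site"),
    ("iphone", "buy"),
    ("iphone", "buy"),
    ("iphone", "reviews"),
    ("iphone", "buy"),
    ("iphone", "official_site"),
    ("weather", "forecast"),
]
print(rank_by_clicks(click_log, "iphone"))
```

This serves the majority intent first, but it cannot tell what this particular user wants.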

Can we go further?

This is where the product manager should put in the effort: working out how to refine the user's intent through interaction.

This is what we see in search today: related searches are inserted into the search results, so that users can refine their intent more quickly by picking a suggestion, which in turn gives the machine more effective input.
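One possible way to produce related searches is to mine which queries users typed next within the same session. The sketch below assumes hypothetical session logs (ordered lists of queries); it is an illustration of the idea, not how any particular search engine does it.

```python
from collections import Counter, defaultdict


def build_followups(sessions):
    """Count, for each query, the queries that directly followed it in a session."""
    followups = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            if current != nxt:  # ignore simple repeats of the same query
                followups[current][nxt] += 1
    return followups


def related_searches(followups, query, k=3):
    """Return up to k of the most frequent follow-up queries for `query`."""
    return [q for q, _ in followups.get(query, Counter()).most_common(k)]


# Hypothetical session logs for illustration.
sessions = [
    ["iphone", "iphone price"],
    ["iphone", "iphone review"],
    ["iphone", "iphone price", "apple store"],
    ["weather", "weather tomorrow"],
]
followups = build_followups(sessions)
print(related_searches(followups, "iphone", k=2))
```

A user who picks "iphone price" has told the machine their intent in one click.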

Indicators are very important: they let different teams agree on a shared evaluation standard for the business.

But indicators are not everything. The physical world is three-dimensional and rich. An indicator is our dimensionality reduction and data fitting of that world, and some information is bound to be lost in the process.

For example, accuracy and recall rest on implicit preconditions, such as treating every result as distinct and every result as equally important.

But in fact, many search results are highly homogeneous and can replace each other, and among different results, the more authoritative the website, the higher the weight of the result it provides.

Therefore, as strategy product managers, we can of course pursue indicators, but beyond that we also need to take the long view and periodically reflect on and reconstruct the scenario:

What is special about the user's needs in the current scenario? Are new strategies needed to satisfy them? Are new indicators needed to measure them? Asking these questions can open up more possibilities.