As for how many keystrokes it takes to type a Chinese character, it can actually be proved that only about two keystrokes are needed on average. (This follows from Shannon's first theorem; see Wu Jun's "The Beauty of Mathematics", or the online version, Series 23: "How many keys are needed to input a Chinese character".) To reach this goal, people long ago turned to Wubi (the "five-stroke" input method), whose idea is "transforming the original source symbols into new code symbols, making the code symbols follow an equal-probability distribution as far as possible so that each code symbol carries the maximum information, and then transmitting the source information with as few code symbols as possible."
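The two-keystroke figure can be reproduced with a short calculation. What follows is only a minimal sketch: the 10-bit value is the rough upper estimate of the entropy of a Chinese character usually quoted from Wu Jun's book and is taken here as an assumption rather than measured from a corpus, the keyboard is assumed to have 26 usable letter keys, and the function names are made up for illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_avg_keystrokes(char_entropy_bits, num_keys=26):
    """Lower bound on the average keystrokes per character.

    By Shannon's first (source coding) theorem, the average code length
    cannot be smaller than the source entropy divided by the information
    one key press can carry, which is at most log2(num_keys) bits.
    """
    return char_entropy_bits / math.log2(num_keys)

# A key carries the most information when all keys are equally likely.
uniform_keys = [1 / 26] * 26
skewed_keys = [0.5] + [0.5 / 25] * 25
print(entropy_bits(uniform_keys))  # equals log2(26), about 4.7 bits
print(entropy_bits(skewed_keys))   # strictly smaller than log2(26)

# Assumed figure: about 10 bits of entropy per Chinese character.
print(round(min_avg_keystrokes(10.0), 2))
```

Under these assumptions the bound comes out at roughly 2.1 keystrokes per character, and it is only approached when the code symbols are used with nearly equal probability, which is the design goal described in the quotation above.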
But this kind of method (Wubi, the two-stroke method, and the like) is not really needed today. The main reason is the one Wu Jun gives in "The Beauty of Mathematics": "This kind of coding (mainly Wubi) is effective in theory but impractical. There are two reasons. First, it is hard to learn. Second, from the standpoint of cognitive science, people cannot do two things at once: when composing without a manuscript to copy from, it is impossible to recall the complex code of each character without interrupting one's train of thought. In the past, when we studied language recognition, we ran many user tests and found that people who use complex coding input methods type at only a half to a quarter of their copy-typing speed when composing without a manuscript. So although the average number of keystrokes per character is small, each keystroke takes much longer, and the overall speed is not high. This is why simple pinyin-based input methods dominate. In fact, the average pinyin spelling of a Chinese character is 2.98 letters long. As long as a pinyin-based input method can use context to fully resolve homophones (one sound, many characters), the average number of keystrokes per character should be about three, and typing 100 characters per minute is entirely feasible."
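The trade-off in this argument is simple arithmetic, sketched below. The model is deliberately crude and rests on stated assumptions: a hypothetical typist who can sustain 300 keystrokes per minute when nothing interrupts them, a complex code that needs 2.0 keystrokes per character (the theoretical average mentioned earlier), and a slowdown factor of one half to one quarter applied to that code when composing. Only the 2.98 figure and the half-to-a-quarter range come from the quotation; the function name and the 300 figure are illustrative.

```python
def chars_per_minute(keys_per_minute, keys_per_char, speed_factor=1.0):
    """Characters typed per minute under a simple throughput model.

    speed_factor scales the raw keystroke rate to account for pauses
    spent recalling codes (1.0 means no slowdown).
    """
    return keys_per_minute * speed_factor / keys_per_char

RAW_RATE = 300  # assumed keystrokes per minute, hypothetical typist

# Pinyin: 2.98 keystrokes per character, no recall penalty assumed.
print(chars_per_minute(RAW_RATE, 2.98))

# Complex code: 2.0 keystrokes per character, but slowed to 1/2 or 1/4.
print(chars_per_minute(RAW_RATE, 2.0, speed_factor=0.5))
print(chars_per_minute(RAW_RATE, 2.0, speed_factor=0.25))
```

With these numbers pinyin yields about 101 characters per minute, while the shorter code yields 75 or 37.5. The shorter code only wins if the slowdown factor stays above roughly 2.0 / 2.98, or about 0.67.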
Shuangpin (double pinyin), however, does seem to solve the problem of typing "without interrupting one's thinking." But Wu Jun also points out another problem: "This theoretical value is computed with a large language model. In a product, we cannot take up too much of the user's memory, so input methods ship heavily compressed language models, and some have no language model at all in order to save memory. The key to the quality of a pinyin input method is an accurate and effective language model." In the past, in an effort to improve pinyin input, many input methods mistakenly stuffed their vocabularies with impractical entries (such as whole lines of Tang poetry), which made them bloated, slow, and annoying. That is no longer the case. Today's mainstream Chinese pinyin input methods have learned from large Internet corpora and can connect to the Internet to update what they have learned in real time; even candidate words and sentences can be recommended from the cloud. As a result, when you type pinyin directly, the input method can infer the result you want with high probability even before you have finished typing.
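To make the role of the language model concrete, here is a toy sketch of context-based homophone resolution using a bigram model and a Viterbi search. It is not how any particular input method is implemented. The candidate table, the probabilities, and the names `CANDIDATES`, `BIGRAM`, `FLOOR`, and `decode` are all invented for illustration; a real system would estimate its model from a large corpus.

```python
import math

# Hypothetical toy data: candidate characters for each pinyin syllable.
CANDIDATES = {
    "shu": ["书", "数", "树"],
    "xue": ["学", "雪", "血"],
}

# Made-up bigram probabilities P(current | previous); "<s>" marks the start.
BIGRAM = {
    ("<s>", "书"): 0.5, ("<s>", "数"): 0.3, ("<s>", "树"): 0.2,
    ("数", "学"): 0.6, ("书", "学"): 0.05, ("树", "雪"): 0.1,
}
FLOOR = 1e-4  # probability assigned to any bigram missing from the table

def decode(syllables):
    """Return the most probable character string for a pinyin sequence."""
    # best maps a character to (log-probability, path) of the best
    # sequence so far that ends in that character.
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        step = {}
        for cand in CANDIDATES[syllable]:
            score, path = max(
                (prev_score + math.log(BIGRAM.get((prev, cand), FLOOR)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[cand] = (score, path + [cand])
        best = step
    return "".join(max(best.values())[1])

print(decode(["shu"]))         # the single most likely character on its own
print(decode(["shu", "xue"]))  # context can change which "shu" is chosen
```

In this toy table "shu" alone decodes to 书, while "shu xue" decodes to 数学, because the following syllable makes 数 the better choice. A heavily compressed model, or no model at all, loses exactly this kind of contextual evidence.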