After Baidu released ERNIE Bot on March 16, Alibaba Cloud officially unveiled its large model, Tongyi Qianwen, and opened it for invitation-only testing on April 7.
On April 8, Tian Qi, chief scientist for artificial intelligence at Huawei Cloud, shared the progress and applications of the Pangu large model at an artificial intelligence large-model technology summit forum. He said the Pangu model is moving artificial intelligence development from a "workshop-style" approach to an "industrialized" one.
A string of large-model launch events then followed in quick succession.
On April 10, SenseTime launched its "SenseNova" large-model system. On April 11, Haomo.AI released DriveGPT Xuehu Hairuo, a large model for autonomous driving. Tiangong 3.5, a large model jointly developed by Kunlun Wanwei and Singularity AI, is due for release soon, with invitation-only testing starting on April 17. On May 6, iFlytek is due to release its "1+N cognitive intelligence model". ...
Internet giants, artificial intelligence companies, smart hardware makers, autonomous driving companies and others have all joined this large-model feast.
Regulators have also moved quickly. On April 11, the Cyberspace Administration of China issued a notice soliciting public comments on the Administrative Measures for Generative Artificial Intelligence Services (Draft for Comment).
According to Wu Jun, a computer scientist and natural language processing expert, what sits behind ChatGPT is a mathematical model called a language model. Language model technology existed as early as 1972 and was developed by his mentor, Frederick Jelinek, while working at IBM.
Only now, thanks to steadily growing computing power, has the language model evolved from the original probability-based prediction model into the pre-trained language model built on the Transformer architecture, gradually entering the era of large models.
Qiu Xipeng, a professor at the School of Computer Science of Fudan University and head of the MOSS system, once described the capability leap of large models as follows: "When the model is small, its performance and parameter count roughly follow a proportional law; that is, performance improves basically linearly as parameters grow. However, when models at the 100-billion-parameter scale such as GPT-3/ChatGPT were introduced, it was found that they could break this proportional law and achieve a qualitative leap in capability. These abilities are also called the 'emergent abilities' of large models (such as understanding human instructions)."
Whenever a revolutionary technology is born, its commercial application in a specific industry substantially advances that industry. As a bridge between the technology ecosystem and the business ecosystem, large models will likewise be applied in many industries.
But will this fierce wave of large-model fever nourish everything and transform thousands of business models, or is it another beautiful bubble after blockchain and the metaverse?
Infinite imagination?

Tesla brought the Transformer-based large model into autonomous driving, marking the start of large AI models in the field. Applying large models in the autonomous driving industry improves the system's perception and decision-making, and is now regarded as the core driver of better autonomous driving capability.
On April 2nd, Baidu officially released Apollo Cloud 2.0, its cloud platform for autonomous driving. Gao Guorong, vice president of Baidu's Intelligent Driving Business Group and general manager of its intelligent connected-vehicle business, said that Apollo Cloud 2.0 uses large-model capabilities to build an intelligent search engine for autonomous driving data, which can accurately retrieve data for specific scenarios from massive datasets.
"In the field of autonomous driving, BEV (bird's-eye view) is currently the mainstream technical route, and in the future it can develop toward multimodal and general intelligence," said Wang Xiaogang, co-founder and chief scientist of SenseTime and president of its Jueying intelligent vehicle business group.
He believes that in the era of general artificial intelligence, the input consists of prompts and multimodal content, from which multimodal data can be generated. More importantly, tasks can be described in natural language, which covers a large number of long-tail problems and open-ended tasks very flexibly, including some subjective descriptions.
Wang Xiaogang gave an example to illustrate how conventional AI and AGI differ in handling a task: given a picture, how does each judge whether the car needs to slow down?
An existing AI system first runs object detection, then text recognition inside each detected box, and finally makes a decision. Every module in the pipeline is a predefined task.
Under general artificial intelligence, given an image, people only need to ask in natural language, for example, "What does this sign mean? What should we do?" The model itself does not change; it works through a chain of logical reasoning in natural language and then reaches a conclusion. For example, it might say "the speed limit ahead is 30 km/h", "a school zone is 100 m ahead", "there are children", "drive carefully" and "reduce speed to below 30 km/h".
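The contrast Wang Xiaogang describes can be sketched in a few lines of Python. This is only an illustrative sketch, not SenseTime's system: every name and value below (`Detection`, `detect_objects`, `read_text`, `decide_modular`, `decide_with_prompt`, `fake_model`, the canned sign text) is hypothetical, and the detector, text recognizer and model are stand-ins that return fixed values.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    label: str                       # e.g. "traffic_sign"
    box: Tuple[int, int, int, int]   # x, y, width, height in pixels


# --- Modular pipeline: each stage is a predefined task ---

def detect_objects(image: bytes) -> List[Detection]:
    """Stub detector (hypothetical): pretends to find one traffic sign."""
    return [Detection(label="traffic_sign", box=(40, 20, 64, 64))]


def read_text(image: bytes, box: Tuple[int, int, int, int]) -> str:
    """Stub text recognizer (hypothetical): pretends the sign reads '30'."""
    return "30"


def decide_modular(image: bytes, current_speed_kmh: float) -> str:
    """Detect objects, read the text in each sign box, then apply a fixed rule."""
    for det in detect_objects(image):
        if det.label != "traffic_sign":
            continue
        text = read_text(image, det.box)
        if text.isdigit() and current_speed_kmh > int(text):
            return f"slow down to {text} km/h"
    return "keep speed"


# --- Single general model: the task is described in natural language ---

def decide_with_prompt(model: Callable[[bytes, str], str], image: bytes) -> str:
    """Ask one multimodal model an open question instead of chaining modules."""
    prompt = "What does this sign mean? What should we do?"
    return model(image, prompt)


def fake_model(image: bytes, prompt: str) -> str:
    """Stand-in for a multimodal model (hypothetical canned answer)."""
    return ("The speed limit ahead is 30 km/h; a school zone is 100 m ahead; "
            "children may be present; drive carefully and slow to below 30 km/h.")


if __name__ == "__main__":
    image = b""  # placeholder; the stubs ignore the pixels
    print(decide_modular(image, current_speed_kmh=50))
    print(decide_with_prompt(fake_model, image))
```

In the first path, handling a new situation means adding or retraining a module. In the second, only the natural-language question changes while the model stays the same, which is the flexibility the example is meant to convey.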
Wang Xiaogang also pointed out that intelligent driving already has a "data flywheel", and that the era of general artificial intelligence will produce an "intelligence flywheel". People and models can communicate with each other; through human feedback, the model better understands which abilities people need from it and unlocks more skills. Upgrading from the data flywheel to the intelligence flywheel makes human-machine co-intelligence possible.
Building on its multimodal large model, SenseTime can close the loop between data, perception and decision-making. From automatic front-end collection of high-quality data to using large models for automatic data labeling and product inspection, "the efficiency of model iteration can be improved by hundreds of times and the cost can be reduced".
You Peng, president of the Huawei Cloud EI Service Product Department, also said that "data annotation is the part of the whole autonomous driving field with the highest demands on precision and efficiency, and the highest cost", and that its efficiency directly affects how fast autonomous driving algorithms and driving capability improve. He revealed that Huawei Cloud is building a pre-trained large model for annotation to support the subsequent training of autonomous driving algorithms, and that it may be released in a few months.
In addition to autonomous driving, many people in the industry believe that the intelligent cockpit will be qualitatively improved under the empowerment of large models, especially in human-computer interaction.
Li Zhenyu, senior vice president of Baidu Group and president of its Intelligent Driving Business Group, believes that artificial intelligence will reshape the in-car space and that the relationship between people and cars will be completely different. "In the future, we believe every car will carry a digital virtual human. These digital humans will not only simulate human appearance but will also be given a soul and truly understand human intentions ... At the same time, they will no longer be just an in-car assistant for a single scenario, but will become an all-round assistant."
He believes that with the development of general artificial intelligence, the intelligent cockpit will become the new focus of automobile innovation and will reshape its space. By then, the distance between users and car companies will be shortened, and the relationship between users and brands will be closer. "Smart cars with natural language communication skills can allow car companies to have one-on-one conversations with users directly. When a car becomes an all-around assistant, car companies will face explosive growth in user demand. "
Wang Xiaogang said that in the intelligent cockpit, general artificial intelligence gives the foundation model a range of abilities, such as understanding the spatial environment, perceiving the user's state, parsing multimodal instructions, and generating multi-turn logical dialogue and content. These in turn enable functions such as emotion perception, intelligent assistance, emotion-aware dialogue, creative content generation and personalized interaction, continuously improving the personalized experience and expanding the application scenarios.
"Smart cars are a very good scenario for general artificial intelligence to close the loop. We already have human-machine co-driving," Wang Xiaogang said. "In the future, we hope for more effective interaction between cars and models, completing an interactive closed loop from people to cars to models, so that general artificial intelligence can give us a better driving experience and unlock unlimited imagination."
However, no one can tell how far consumers are from this "unlimited imagination" car life.
The imagined future is beautiful, but challenges follow close behind.
"In the past, we had to manually label about 10 million autonomous driving images a year, so we outsourced the labeling. It cost roughly 6 to 8 yuan per image, and the annual cost was close to 100 million yuan. But once we used a Software 2.0 large model, trained to label automatically, the results were staggering: what used to take a year can be finished in three hours, 1,000 times the efficiency of people," said Li Xiang, founder, chairman and CEO of Li Auto. "For employees, it feels like suddenly holding a gun in what used to be a fistfight."
He believes this raises challenges for the whole industry: how to integrate Software 2.0 with existing talent, how to give people new workflows and incentive mechanisms, and how to select and appoint talent.
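The labeling figures quoted above can be sanity-checked with simple arithmetic. The sketch below takes the annual volume as roughly 10 million images (treated here as an assumption) at 6 to 8 yuan each, and compares one year of manual work with three hours of automated work. The function names and the choice of calendar hours as the basis for "a year" are illustrative assumptions, since the quote does not say how the 1,000-fold figure was measured.

```python
HOURS_PER_CALENDAR_YEAR = 365 * 24  # assumption: "a year" counted in calendar hours


def annual_labeling_cost(images_per_year: int, yuan_per_image: float) -> float:
    """Outsourced labeling cost in yuan for one year."""
    return images_per_year * yuan_per_image


def speedup(manual_hours: float, automated_hours: float) -> float:
    """How many times faster the automated process is."""
    return manual_hours / automated_hours


images = 10_000_000  # assumed annual image volume
low_cost = annual_labeling_cost(images, 6)
high_cost = annual_labeling_cost(images, 8)
calendar_speedup = speedup(HOURS_PER_CALENDAR_YEAR, 3)

if __name__ == "__main__":
    print(f"Cost range: {low_cost:,.0f} to {high_cost:,.0f} yuan per year")
    print(f"Speed-up on a calendar-hour basis: {calendar_speedup:,.0f}x")
```

Under these assumptions the per-image cost alone comes to 60 to 80 million yuan a year, the same order of magnitude as the "close to 100 million" in the quote, and the speed-up comes out in the low thousands, broadly in line with the quoted 1,000 times.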
The bigger challenge may lie in the gap between Chinese and foreign large model technologies.
On March 25th, at the 2023 annual meeting of the China Development Forum, Zhou Hongyi, founder, chairman and CEO of 360, said that China's large language models are two to three years behind GPT-4, that the technical direction is already clear, and that there are no insurmountable technical obstacles. China has great advantages in application scenarios, engineering, productization and commercialization, so it should stick to long-termism and catch up.
On April 9th, at the artificial intelligence large-model technology summit forum hosted by the Chinese Association for Artificial Intelligence, Li Changliang, CTO of Ronghui Jinxin, said that the future split between general-purpose large models and scenario-specific applications will be very clear, with no intermediate state. Building a general-purpose large model requires a great deal of computing power, data, personnel and resources. Only large companies with strong technical reserves and the ability to marshal resources can do it, and small and medium-sized startups will struggle on this track. In vertical applications, by contrast, countless companies will emerge that build on large models and combine them with scenario know-how to create innovative applications.
He also thinks China has a good chance in the large-model industry because, in Chinese scenarios, "we know our own language better and have accumulated a great deal of Chinese-language knowledge", so China will soon catch up and overtake.
We also noticed that Wu Jun, a computer scientist and natural language processing expert, poured cold water on the current ChatGPT craze during a livestream on the evening of April 3. He said bluntly that ChatGPT is over-hyped in China and that most domestic research institutions cannot build it.
In his view, the principle behind ChatGPT is very simple, but it is quite difficult to implement as an engineering project because it consumes so many resources: the hardware alone costs almost $1 billion, not counting electricity. How much electricity does one ChatGPT training run need? According to Wu Jun, picture about 3,000 Tesla electric cars, each driven 200,000 miles until it is worn out. That much electricity is what a single training run consumes. This is a very expensive undertaking.
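Wu Jun's comparison can be turned into a rough energy figure. The sketch below multiplies the number of cars by the mileage and by a per-mile consumption. The article gives no consumption figure, so the 0.25 kWh per mile used here is purely an illustrative assumption, and the function name is hypothetical; the result scales linearly with whatever value is chosen.

```python
def fleet_energy_gwh(cars: int, miles_per_car: float, kwh_per_mile: float) -> float:
    """Total energy in gigawatt-hours used by a fleet driving the given mileage."""
    total_kwh = cars * miles_per_car * kwh_per_mile
    return total_kwh / 1_000_000  # 1 GWh = 1,000,000 kWh


CARS = 3_000              # from the quoted comparison
MILES_PER_CAR = 200_000   # from the quoted comparison
KWH_PER_MILE = 0.25       # assumption, not from the article

energy_gwh = fleet_energy_gwh(CARS, MILES_PER_CAR, KWH_PER_MILE)

if __name__ == "__main__":
    print(f"Implied energy for one training run: {energy_gwh:.0f} GWh")
```

This only restates the comparison in energy units under the stated assumption; it is not a measurement of any actual training run.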
He concluded that ChatGPT is not a new technological revolution and will not bring any new opportunities. The most likely outcome is that everyone ends up paying a few large cloud computing companies.
Will the large-model craze set off by ChatGPT eventually blossom and bear fruit across all walks of life, or will it struggle to amount to much? That question is best left to time.
This article is original content from Auto Business Review and was written by Che Yi. The copyright belongs to the author; please contact the author before reprinting in any form. Unauthorized reprinting will be prosecuted. The content represents only the author's point of view.