An AI researcher's take on Sora as it sparks debate in China


A still from a video generated by Sora: a close-up of a short fluffy monster kneeling beside a melting red candle. /OpenAI

Many lost sleep after Sora amazed the world with its unprecedented ability to create videos directly from text instructions. Discussion continues about what the artificial intelligence model can do and what difference it will make.

Some said it could deal a heavy blow to traditional industries such as film and television production, and they look forward to the day when a movie can be created as soon as a novel is fed into the model. But others remain skeptical about how much the model can change the landscape of AI applications.

Developed by a group of young talents at the Microsoft-backed company OpenAI, the text-to-video model can generate videos up to a minute long while maintaining visual quality and adherence to the user's prompt.

Industry leaders in China said the new model shows an impressive ability to understand and simulate the physical world in motion, outperforming previous models that work from AI's understanding of a 2D world.

"Sora represents a revolutionary leap in the field of AI-generated content (AIGC)," said Professor Shen Yang at the School of Journalism and Communication, Tsinghua University.

A milestone

As one of the leading scholars in AI research in China, Shen leads a team that studies the philosophy of AI, such as how humans and machines cooperate and interact with each other, and the application of AI in various fields.

Until he learned about Sora on February 16, Shen was quite satisfied with his team's AI-generated videos. A two-minute video on the Spring Festival produced by Shen's team has recently won many likes on social media platforms. It was made by two team members in five days using a series of AIGC tools, including models that generate text, images and videos.

"Compared with the new model Sora, what we used are tools of the previous generation. There's a huge gap in between," said Shen.

Shen believes Sora is another milestone in the AIGC era, after the AI chatbot GPT-3.5 stunned the world in 2022. "I didn't expect the second milestone to come so soon. Sora beats all its peers in terms of spatial sense and accuracy," said Shen.

The professor added that the improvement in Sora's accuracy builds on OpenAI's long-term accumulation of technological advances.

"With its huge computing power and capability of optimizing algorithms, this company will achieve the next milestone in the automated generation of 3D spatial videos," said Shen.

A still from the AI-generated video made by Shen Yang's team. /Courtesy of Shen Yang

Changes ahead

As a frequent user of AI, Shen said the technology not only helps improve his productivity, but also benefits his daily life. His wife was suffering from cancer and many complications, and he used AI to assist in finding treatment, which has remarkably prolonged her life. He even wrote an award-winning science fiction novel using AI.

However, new technologies are not good news for everyone. Many are also concerned about the safety of AI models, since related regulations are lagging behind.

"Text-generation models help users increase their productivity," said Shen. "But the demand for art designers has decreased significantly as image generation models become available in the market. As AI models evolve, similar changes will happen for professions such as language translators and low-level programmers."

Sora is going to bring changes in many fields, including short video, film and television, news, games, advertising, education, and even industrial manufacturing, according to Shen. Others also believe Sora has great potential in pushing the development of autonomous driving.

A scene from a video produced by the Sora model is displayed on a smartphone with the OpenAI logo visible in the background, Brussels, Belgium, February 16, 2024. /CFP

There is still much room to improve AI models. For instance, current AI models cannot render written characters accurately and quickly: the shop signs are meaningless in Sora's demo video of a woman walking down a street in Tokyo. But these problems are expected to be solved as the models iterate.

The United States is leading the world in AI development. With the country's latest breakthroughs, other countries appear to be finding it difficult to keep up, as AI requires huge amounts of investment and a solid foundation of technological innovation in hardware and software. OpenAI CEO Sam Altman is seeking to reshape the global semiconductor industry and AI with trillions of dollars in investment, The Wall Street Journal reported.

By the end of 2023, China had launched more than 200 AI models, of which more than 20 have been approved to provide services to the public, according to China Media Group reports. It is hoped that these tools will enable intelligent industrial production. Meanwhile, more than 30 cities are building or proposing to build intelligent computing centers to prepare for the coming AI era.

Shen pointed out that AI is supposed to empower ordinary people through various levels of content generation capability. He predicted that, as overall computing power grows, global GDP may increase tenfold once humanity has gone through the scientific and technological revolutions represented by AI.

"We are on the eve of a massive transformation," said Shen.


(Cover designed by Yu Peng, and Cao Qingqing also contributed to the reporting.)