Vidu, a large video generation model developed by Chinese AI company ShengShu Technology and Tsinghua University, recently became available for global use. The model supports both text-to-video and image-to-video generation.
Vidu can create a 4-second clip in about 30 seconds and generate videos up to 32 seconds long in a single pass.
"Vidu can simulate the real physical world, creating detailed scenes that adhere to physical laws, such as natural lighting and shadow effects, as well as intricate facial expressions. Additionally, it can generate surrealistic content with depth and complexity," said Zhu Jun, deputy director of the Tsinghua Institute for Artificial Intelligence.
Zhu added that for different genres like sci-fi, romance and animation, Vidu can produce scenes that capture the essence of each style, and it can also create high-quality cinematic effects, such as smoke and lens flares.
The AI model can manage various shot types, including long shots, close-ups and medium shots, and can effortlessly produce effects like long takes, focus pulls and smooth scene transitions.
Users can upload portraits or customized character images and use text descriptions to direct the characters to perform any action in any scene. This feature streamlines the video production process and enhances creative freedom.
The company said Vidu's core architecture was proposed as early as 2022. The AI model was unveiled at the 2024 Zhongguancun Forum in Beijing in April, two months after OpenAI announced its Sora video model. But Vidu has kept a low profile since the forum.
During the months in between, similar tools, such as Kuaishou's generative video model Kling and the ChatGLM family of large language models, have become available to users.
(With input from Xinhua)