OpenAI Sora: How is it and where will it go? - CGTN

OpenAI has stunned the world with Sora, its newest AI innovation that generates realistic videos from simple text descriptions.

For those unfamiliar with the wonders of AI-generated content, it's important to clarify that Sora doesn't simply stitch together multiple images. It creates dynamic video sequences with several key advantages over existing models.

Unlike other models, which are limited to clips of a few seconds, Sora generates videos up to a minute long. It goes beyond static shots, producing pan shots, close-ups and wide shots. What's more, objects and backgrounds stay consistent throughout the video, avoiding jarring errors like hands with a fluctuating number of fingers. This surpasses the capabilities of many community-driven projects.

Despite these impressive feats, Sora isn't flawless. While the generated environments look real, text elements like shop signs often display nonsensical characters instead of readable language. The first demo video on Sora's website, which shows a woman walking down a street, is a clear example of that.

Though adept at detail, Sora can still make mistakes. In the same street video, the feet of people in the crowd appear deformed.

However, these hiccups shouldn't overshadow Sora's potential. Models like this lay the groundwork for real-time video generation. Imagine computers creating video based on live input, revolutionizing fields like video games and entertainment.

To achieve this dream, significant computing power is required. Generating a second of video requires at least a dozen frames, but current text-to-image models take seconds to process just one frame, even on the best consumer PC hardware. Producing one second of video at that pace therefore takes far longer than one second, and closing the gap could translate to a tenfold increase in computing needs, creating a vast new market for hardware providers.
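The arithmetic behind that estimate can be sketched in a few lines of Python. The figures used here (12 frames per second and one second of processing per frame) are illustrative assumptions chosen to match the rough numbers above, not measurements of Sora or of any particular hardware, and the function name is hypothetical.

```python
def realtime_speedup(frames_per_second: float, seconds_per_frame: float) -> float:
    """Return how many times faster generation must be to keep pace with playback.

    One second of video needs `frames_per_second` frames, each costing
    `seconds_per_frame` of compute, so the product is the compute time
    spent per second of video. Real time means that product must drop to 1.
    """
    if frames_per_second <= 0 or seconds_per_frame <= 0:
        raise ValueError("both inputs must be positive")
    return frames_per_second * seconds_per_frame


if __name__ == "__main__":
    # Assumed values for illustration only.
    fps = 12                  # "at least a dozen frames" per second of video
    seconds_per_frame = 1.0   # lower end of "seconds" per frame

    factor = realtime_speedup(fps, seconds_per_frame)
    print(f"Speedup needed for real-time generation: about {factor:.0f}x")
```

Under these assumptions the gap is about twelvefold, in line with the tenfold figure. If each frame took two seconds instead, the gap would double, so the estimate is sensitive to the per-frame cost.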

In conclusion, text-to-video models like Sora have crossed a critical threshold, becoming truly usable with exciting potential. Despite facing technical and moral hurdles, they're poised to propel the already booming AI market to new heights.

(Cover via CFP.)

Copyright © 2024 CGTN. 京ICP备20000184号

京公网安备 11010502050052号

Disinformation report hotline: 010-85061466
