Alphabet on Wednesday introduced its most advanced artificial intelligence (AI) model, a technology capable of crunching different forms of information such as video, audio and text.

Called Gemini, the Google owner's highly anticipated AI model is capable of more sophisticated reasoning and of understanding information with greater nuance than Google's prior technology, the company said.

"This new era of models represents one of the biggest science and engineering efforts we've undertaken as a company," Alphabet CEO Sundar Pichai wrote in a blog post.

Since the launch of OpenAI's ChatGPT roughly a year ago, Google has been racing to produce AI software that rivals what the Microsoft-backed company has introduced.

The AI model will come in three sizes: Gemini Ultra, its largest and most capable model for highly complex tasks; Gemini Pro, its best model for scaling across a wide range of tasks; and Gemini Nano, its most efficient model for on-device tasks, according to Google.

Gemini Ultra is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), a benchmark that combines 57 subjects such as mathematics, physics, history, law, medicine and ethics to test world knowledge and problem-solving skills, the company said.

The version also outperformed GPT-4 on multiple-choice exams, grade-school math and other benchmarks, according to a white paper released by Google on Wednesday.

Screenshot via Google DeepMind shows Gemini Ultra outperforms GPT-4 in certain benchmarks.

"Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research," Demis Hassabis, CEO and Co-Founder of Google DeepMind, the AI division behind Gemini, wrote in a blog post.

"It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video," said Hassabis.

With Gemini providing a helping hand, Google promises Bard will become more intuitive and better at tasks that involve planning. On the Pixel 8 Pro, Gemini will be able to quickly summarize recordings made on the device and provide automatic replies on messaging services, starting with WhatsApp, the company said.

A demonstration of Gemini showed that Google's "Bard Advanced" might be capable of unprecedented AI multitasking by simultaneously recognizing and understanding presentations involving text, photos and video.

Gemini will also eventually be infused into Google's dominant search engine, although the timing of that transition hasn't been spelled out yet.

"This is a significant milestone in the development of AI, and the start of a new era for us at Google," declared Hassabis.

The AI model will initially work only in English worldwide, but Google executives said the technology will have no problem eventually expanding into other languages.

(With input from Reuters, AP)