2024.10.02 11:58 GMT+8

Microsoft enhances AI with Copilot updates, boosting voice, visual capabilities

CGTN

Microsoft headquarters in Redmond, Washington, U.S., July 3, 2024. /CFP

Microsoft announced an updated version of its Copilot chatbot on Tuesday, introducing new voice and visual features designed to enhance the user experience.

The updated Copilot can now hold voice conversations and interpret images. The voice feature offers four voice options and is intended for brainstorming, quick questions and emotional companionship.

Mustafa Suleyman, Microsoft's executive vice president and CEO of AI, described Copilot as "in your corner, by your side," aiming to provide a seamless and intuitive AI experience.

To avoid the kind of criticism OpenAI faced over a chatbot voice resembling that of actress Scarlett Johansson, Microsoft used voice actors to provide training data for the four voice options, ensuring they don't mimic well-known figures.

The company is also testing visual capabilities that let the AI "see" the content of a webpage alongside the user and offer relevant suggestions without disrupting workflow. Microsoft said that data from the visual feature will be discarded after use and that the feature will be restricted to select websites for safety.

Additionally, Microsoft introduced the "Think Deeper" feature, which enables Copilot to handle more complex queries and reasoning, following in the footsteps of OpenAI's recently upgraded model for tackling scientific, coding and mathematical problems. The "Discover" feature, meanwhile, will make Copilot more personalized based on user interactions, although it will not yet be available in the EU or Britain due to stricter data protection regulations.

A smartphone displays the OpenAI logo with the Microsoft logo visible in the background. /CFP

Microsoft has leveraged its $13 billion partnership with OpenAI to introduce generative AI technology to everyday users. The tech giant now faces increased competition from major players like Google, Apple and Meta, which are integrating AI across popular platforms to reach a broader consumer base.

OpenAI, meanwhile, introduced new tools for developers on Tuesday to simplify the creation of AI applications. One of the highlights is a real-time tool that allows developers to build AI voice applications using a single set of instructions. This streamlines a process that previously required at least three steps: transcribing audio, generating a response and converting text back to speech.

As part of the rollout, OpenAI also unveiled a fine-tuning tool to improve AI models after training. This tool allows developers to refine model responses using images and text, incorporating feedback from humans to better identify strong and weak answers. By fine-tuning with images, models gain enhanced image recognition capabilities, which can be applied to visual search improvements and object detection for autonomous vehicles.

(With input from agencies)
