Bot mania: How tech giants' chatbots fare in early 2024



Forget cat videos – large language model-powered chatbots have been the online playground for a year now. We dabbled with the bots, integrated them into workflows and even witnessed their misuse in misinformation campaigns. But how have tech giants' bots evolved? We put some top names in tech to the test to find out.

Math meltdown

These language models learn by devouring mountains of text, mimicking human speech and writing. But this feast can lead to indigestion: gibberish, fabrications and even plausible lies. A simple middle school math problem is the perfect litmus test.

We challenged Bing Chat, Google Bard and local Chinese heroes like the just-released iFlytek SparkDesk, Baidu's Ernie Bot, Alibaba's Tongyi Qianwen and Tencent's Hunyuan. Some, like Bing Chat and SparkDesk, displayed formulas like seasoned mathematicians. However, others clung to an outdated "^" notation for exponentiation. And the results? Bing Chat found only one of the two correct answers. Ernie snagged the other. Then there was Bard, trailing behind with two wrong responses and a process more cryptic than a tax code.

Date dilemma

These chatbots were trained on data from a specific period. SparkDesk V3.5 launched only days ago, while Bard hasn't had a makeover since December 2023. They're understandably clueless about events after their training period. But some seemed confused even about the present.

ByteDance's Doubao and Bing Chat stood firm when repeatedly questioned about the current date. Others, like SparkDesk, pleaded ignorance. Hunyuan time-warped back to August 2023, close to its own upgrade date. And Bard insisted we'd already mentioned the date, which was news to us. Ernie and Tongyi Qianwen? They just swallowed our wrong suggestions and called it a day.

The verdict

Our tests paint a clear picture: freely available AI models still have a long way to go. They might generate stunning artwork or edit your prose to perfection, but basic facts remain a stumbling block. So, readers, remember: AI chatbots are language magicians, not fact-checkers. Approach their pronouncements with a healthy dose of skepticism.