Chinese AI developer SenseTime unveiled its upgraded multimodal SenseNova 5.5 model and claims it represents the state of the art.
The upgraded model comes just a few months after the release of SenseNova 5, which SenseTime says was on par with GPT-4 Turbo.
The upgraded 600B parameter SenseNova 5.5 reportedly represents a 30% improvement in overall performance.
The benchmark scores the company released show its model beating GPT-4o and Anthropic’s Claude 3.5 Sonnet.
The benchmarks SenseNova 5.5 excels at are the ones typically used for Chinese models. If SenseTime had reported GPQA, HumanEval, or MATH scores, we could make a fairer comparison, but even so, these figures look impressive.
SenseTime also revealed SenseNova 5o, China’s first real-time multimodal model capable of processing text, images, audio, and video.
The onstage demo of SenseNova 5o showed it performing much like GPT-4o did in OpenAI’s demo, a capability we’re still waiting to get our hands on.
SenseTime says SenseNova 5o’s interactions are “on par with GPT-4o’s streaming interaction capabilities.”
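SenseNova 5.5, a generative AI that surpasses Claude 3.5/GPT-4o, has been announced.
A multimodal model, SenseNova 5o, also appears to have been announced at the same time. Performance is reportedly up 30% compared with SenseNova 5.0, and many core metrics, mainly in math and English, are said to exceed the GPT-4o standard. pic.twitter.com/H1u98SFVwX
— 江藤圭一|Radineer (@RadineerE10) July 8, 2024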
The company also unveiled a “Lite” version of SenseNova 5.5, a low-cost cloud-to-edge model intended to run on-device.
SenseTime says its edge-side model will cost as little as RMB 9.90 per year per device, but it didn’t offer any performance figures.
As part of SenseNova 5.5, SenseTime also released Vimi, a controllable AI avatar video generator.
Vimi can generate videos up to one minute long using a single photo as a prompt. It also allows for precise control over an avatar’s facial expressions and upper body movements.
In addition, SenseNova 5o, a real-time multimodal model that can process audio, text, images, and video, has also been released. pic.twitter.com/CKs0JyaH1m
— あるる ChatGPT × AIツール (@chatgptair) July 9, 2024
OpenAI’s Chinese exit
In line with US sanctions on tech exports to China, OpenAI will block API access to its tools and services for users in China.
The Chinese government already blocks ChatGPT, but users there have been able to get around the government’s firewall using VPNs. OpenAI hasn’t fully explained why, but it will be blocking this workaround as of today.
This has caused a mad scramble as Chinese companies look for alternatives to OpenAI’s models. SenseTime announced the launch of its “Project $0 Go” scheme to woo users to its platform.
The scheme is a free and comprehensive onboarding bundle to help new enterprise users migrate from OpenAI’s platforms to SenseTime. It includes a credit of 50 million tokens and API migration consulting services.
Other Chinese model suppliers have also been trying to cash in on OpenAI’s exit. Baidu, Zhipu, and Tencent have all offered between 50 million and 150 million tokens as incentives to migrate to their platforms.
Ironically, the tightened US sanctions and OpenAI’s exit from China will likely drive homegrown AI advancements as Chinese companies capitalize on income that, until now, went to the US.
SenseTime’s SenseNova and Alibaba’s Tongyi Qianwen models are seeing a surge in downloads and customer engagement.
As Chinese developers make their multimodal features publicly available, you have to wonder how patient American users will continue to be.
Will they wait for OpenAI and Google to progress from demo to product, or will we see American users adopt Chinese models?