Tencent Hunyuan Turbo S Model Officially Launched
On the afternoon of February 27, Tencent officially unveiled its self-developed fast-thinking model, Hunyuan Turbo S. The major highlight of this upgrade is an improved AI interaction experience: word output speed has doubled, and first-word latency has been reduced by 44%. In practice, the model can begin responding almost as soon as the user finishes asking a question.

Turbo S: The Next-Gen Fast-Response AI Model
Unlike slower models such as DeepSeek-R1 and Hunyuan T1, which require some “thinking time” before responding, Turbo S delivers near-instantaneous replies. With its doubled word output speed and 44% reduction in initial response delay, the model also excels in areas such as knowledge, mathematics, and creative tasks. Furthermore, innovations in model architecture have significantly reduced deployment costs, making large-scale AI adoption more accessible.
Tencent emphasized that “Turbo S, the fast-response AI model, is coming to Yuanbao.” Explaining the importance of instant responses, the company cited research indicating that 90%–95% of human daily decisions rely on intuition. Fast-thinking AI mimics human intuition, providing quick and efficient responses in general scenarios, while slower-thinking AI resembles rational thought, offering in-depth logical analysis. The combination of both enhances intelligence and efficiency in problem-solving.
Competitive Benchmark Performance
According to industry-standard benchmark evaluations, Hunyuan Turbo S has demonstrated performance on par with leading AI models such as DeepSeek-V3, GPT-4o, and Claude 3.5 across various fields, including knowledge, mathematics, and logical reasoning.
| Category | Metric | Hunyuan-TurboS | GPT-4o-0806 | Claude-3.5 Sonnet-1022 | Llama3.1-405B | DeepSeek V3 |
|---|---|---|---|---|---|---|
| Knowledge | MMLU | **89.5** | 88.7 | 88.3 | 88.6 | 88.5 |
| Knowledge | MMLU-pro | **79.0** | 74.9 | 78.0 | 73.3 | 75.9 |
| Knowledge | GPQA-diamond | 57.5 | 53.1 | **65.0** | 51.1 | 59.1 |
| Knowledge | SimpleQA | 22.8 | **38.2** | 28.4 | 17.1 | 24.9 |
| Knowledge | Chinese-SimpleQA | **70.8** | 59.3 | 51.3 | 50.4 | 68.0 |
| Reasoning | BBH | 92.2 | 91.7 | **92.6** | 89.2 | 92.3 |
| Reasoning | DROP | 91.5 | 79.8 | 88.3 | 91.2 | **91.6** |
| Reasoning | ZebraLogic | **46.0** | 31.7 | 35.1 | 30.1 | 38.5 |
| Math | MATH | **89.7** | 75.9 | 78.3 | 73.8 | 87.8 |
| Math | AIME2024 | **43.3** | 23.3 | 16.0 | 23.3 | 39.2 |
| Code | HumanEval | 91.0 | 90.0 | **95.0** | 89.0 | 89.0 |
| Code | LiveCodeBench | 32.0 | 35.1 | **38.7** | 30.2 | 37.6 |
| Chinese | C-Eval | **90.9** | 76.0 | 80.0 | 72.7 | 86.5 |
| Chinese | CMMLU | **90.8** | 77.3 | 81.2 | 75.4 | 83.5 |
| Alignment | LiveBench | **61.0** | 56.0 | 60.3 | 53.2 | 60.5 |
| Alignment | ArenaHard | **88.6** | 74.9 | 85.2 | 69.3 | 85.5 |
| Alignment | IF-Eval | 88.6 | 85.7 | **89.3** | 86.0 | 86.1 |
This table presents a comparison of the Hunyuan-TurboS model against GPT-4o-0806, Claude-3.5 Sonnet-1022, Llama3.1-405B, and DeepSeek V3 across multiple performance benchmarks. Bold values indicate the highest score in each row.
Availability & Pricing
Turbo S will be gradually rolled out in Tencent Yuanbao, with full availability expected soon. Currently, developers and enterprise users can access the model via API calls on Tencent Cloud, with a one-week free trial available immediately.
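As a sketch of what such an API call might look like, assuming Tencent Cloud exposes an OpenAI-compatible chat-completions interface with a model id like `hunyuan-turbos-latest` (both the base URL and model id below are assumptions; check the Tencent Cloud console for the actual values tied to your account):

```python
# Sketch of preparing a request for Hunyuan Turbo S via an assumed
# OpenAI-compatible Tencent Cloud endpoint. No network call is made here.
import json

BASE_URL = "https://api.hunyuan.cloud.tencent.com/v1"  # assumed endpoint
MODEL = "hunyuan-turbos-latest"                        # assumed model id

def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion payload for Turbo S."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

payload = build_chat_request("Explain fast vs. slow thinking in one sentence.")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

With the official `openai` Python SDK, you would pass `base_url=BASE_URL` along with your Tencent Cloud API key when constructing the client, then call `client.chat.completions.create(**payload)`.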
Pricing:
- Input cost: ¥0.8 per million tokens
- Output cost: ¥2 per million tokens
This represents a significant price reduction compared to the previous Hunyuan Turbo model, making it more affordable for users.
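At the announced rates, estimating a workload's bill is a one-line calculation; the token counts below are purely illustrative:

```python
# Cost estimate at the announced Turbo S prices:
# CN¥0.8 per million input tokens, CN¥2 per million output tokens.
INPUT_PRICE_PER_M = 0.8   # CNY per million input tokens
OUTPUT_PRICE_PER_M = 2.0  # CNY per million output tokens

def cost_cny(input_tokens: int, output_tokens: int) -> float:
    """Total cost in CNY for one workload."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: 10M input tokens and 2M output tokens -> 8.00 + 4.00 CNY
print(f"{cost_cny(10_000_000, 2_000_000):.2f} CNY")  # prints "12.00 CNY"
```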
Three Key Upgrades: The Core of Future AI Models
The Turbo S model introduces three major advancements:
- Innovative Model Architecture
Turbo S adopts a Hybrid-Mamba-Transformer fusion architecture, which reduces the computational complexity of traditional Transformer structures and lowers KV-Cache memory usage, cutting both training and inference costs. Tencent engineers highlighted that this fusion addresses the high computational cost large Transformer models face when processing long texts: it leverages Mamba's efficiency on long sequences while retaining the Transformer's strength in capturing complex context. According to Tencent, applying the Mamba architecture to a massive Mixture of Experts (MoE) model in this way is an industry first.
- Overall Performance Improvement
By combining fast- and slow-thinking processes, Turbo S responds rapidly to humanities-related queries while significantly improving its mathematical reasoning capabilities.
- Lower Deployment Costs
The new architecture reduces the computational overhead of traditional Transformers and KV-Cache memory consumption, significantly lowering training and inference costs.
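The KV-Cache savings follow from a simple property: only attention layers must store per-token key/value tensors that grow with context length, while state-space (Mamba-style) layers keep a fixed-size state. A back-of-the-envelope illustration, using hypothetical layer counts and dimensions rather than Tencent's actual configuration:

```python
# Illustrative (not Tencent's numbers): KV-cache size scales with the number
# of attention layers, so replacing most of them with Mamba-style layers
# shrinks the cache proportionally.
def kv_cache_bytes(attn_layers: int, seq_len: int, kv_heads: int,
                   head_dim: int, dtype_bytes: int = 2) -> int:
    """KV-cache size for one sequence: 2 tensors (K and V) per attention layer."""
    return 2 * attn_layers * seq_len * kv_heads * head_dim * dtype_bytes

# Hypothetical 64-layer model, 32k-token context, 8 KV heads of dim 128, fp16.
full = kv_cache_bytes(attn_layers=64, seq_len=32_768, kv_heads=8, head_dim=128)
hybrid = kv_cache_bytes(attn_layers=8, seq_len=32_768, kv_heads=8, head_dim=128)
print(f"all-attention: {full / 2**30:.1f} GiB, "
      f"hybrid (8 attn layers): {hybrid / 2**30:.1f} GiB")
# prints "all-attention: 8.0 GiB, hybrid (8 attn layers): 1.0 GiB"
```

Keeping only a fraction of attention layers cuts the per-sequence cache by the same fraction, which is one reason long-context serving gets cheaper in a hybrid design.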
Tencent has positioned Turbo S as the foundational model for future Hunyuan AI derivatives, supporting advanced reasoning, long-form text processing, and code generation.

The Future: AI Reasoning Model T1 & API Expansion
Building on Turbo S, Tencent has developed the T1 reasoning model, incorporating long chains of thought, retrieval-augmented generation (RAG), and reinforcement learning techniques. The T1 model is already available on Tencent Yuanbao, where users can choose between DeepSeek-R1 and Tencent Hunyuan T1 for AI-generated answers.
Tencent has also announced that the official API for Hunyuan T1 will soon be available, providing external users with access to its reasoning capabilities.