Tencent Hunyuan Turbo S Model Officially Launched
On the afternoon of February 27, Tencent officially unveiled its self-developed fast-thinking model, Hunyuan Turbo S. The major highlight of this upgrade is an improved AI interaction experience: word output speed has doubled, and first-word latency has been reduced by 44%. In practice, the model can begin responding almost as soon as the user finishes asking a question.

Turbo S: The Next-Gen Fast-Response AI Model
Unlike slower models such as DeepSeek-R1 and Hunyuan T1, which require some “thinking time” before responding, Turbo S delivers near-instantaneous replies. With its doubled word output speed and 44% reduction in initial response delay, the model also excels in areas such as knowledge, mathematics, and creative tasks. Furthermore, innovations in model architecture have significantly reduced deployment costs, making large-scale AI adoption more accessible.
Tencent emphasized that “Turbo S, the fast-response AI model, is coming to Yuanbao.” Explaining the importance of instant responses, the company cited research indicating that 90%–95% of human daily decisions rely on intuition. Fast-thinking AI mimics human intuition, providing quick and efficient responses in general scenarios, while slower-thinking AI resembles rational thought, offering in-depth logical analysis. The combination of both enhances intelligence and efficiency in problem-solving.
Competitive Benchmark Performance
According to industry-standard benchmark evaluations, Hunyuan Turbo S has demonstrated performance on par with leading AI models such as DeepSeek-V3, GPT-4o, and Claude 3.5 across various fields, including knowledge, mathematics, and logical reasoning.
| Category | Metric | Hunyuan-TurboS | GPT-4o-0806 | Claude-3.5 Sonnet-1022 | Llama3.1-405B | DeepSeek V3 |
|---|---|---|---|---|---|---|
| Knowledge | MMLU | **89.5** | 88.7 | 88.3 | 88.6 | 88.5 |
| Knowledge | MMLU-pro | **79.0** | 74.9 | 78.0 | 73.3 | 75.9 |
| Knowledge | GPQA-diamond | 57.5 | 53.1 | **65.0** | 51.1 | 59.1 |
| Knowledge | SimpleQA | 22.8 | **38.2** | 28.4 | 17.1 | 24.9 |
| Knowledge | Chinese-SimpleQA | **70.8** | 59.3 | 51.3 | 50.4 | 68.0 |
| Reasoning | BBH | 92.2 | 91.7 | **92.6** | 89.2 | 92.3 |
| Reasoning | DROP | 91.5 | 79.8 | 88.3 | 91.2 | **91.6** |
| Reasoning | ZebraLogic | **46.0** | 31.7 | 35.1 | 30.1 | 38.5 |
| Math | MATH | **89.7** | 75.9 | 78.3 | 73.8 | 87.8 |
| Math | AIME2024 | **43.3** | 23.3 | 16.0 | 23.3 | 39.2 |
| Code | HumanEval | 91.0 | 90.0 | **95.0** | 89.0 | 89.0 |
| Code | LiveCodeBench | 32.0 | 35.1 | **38.7** | 30.2 | 37.6 |
| Chinese | C-Eval | **90.9** | 76.0 | 80.0 | 72.7 | 86.5 |
| Chinese | CMMLU | **90.8** | 77.3 | 81.2 | 75.4 | 83.5 |
| Alignment | LiveBench | **61.0** | 56.0 | 60.3 | 53.2 | 60.5 |
| Alignment | ArenaHard | **88.6** | 74.9 | 85.2 | 69.3 | 85.5 |
| Alignment | IF-Eval | 88.6 | 85.7 | **89.3** | 86.0 | 86.1 |
This table presents a comparison of the Hunyuan-TurboS model against GPT-4o-0806, Claude-3.5 Sonnet-1022, Llama3.1-405B, and DeepSeek V3 across multiple performance benchmarks. Bold values indicate the highest score in each row.
Availability & Pricing
Turbo S will be gradually rolled out in Tencent Yuanbao, with full availability expected soon. Currently, developers and enterprise users can access the model via API calls on Tencent Cloud, with a one-week free trial available immediately.
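As a sketch of what such an API call might look like, assuming Tencent Cloud exposes an OpenAI-compatible chat-completions interface with a model id like `hunyuan-turbos-latest` (both the base URL and model id below are assumptions; check the Tencent Cloud console for the actual values tied to your account):

```python
# Sketch of preparing a request for Hunyuan Turbo S via an assumed
# OpenAI-compatible Tencent Cloud endpoint. No network call is made here.
import json

BASE_URL = "https://api.hunyuan.cloud.tencent.com/v1"  # assumed endpoint
MODEL = "hunyuan-turbos-latest"                        # assumed model id

def build_chat_request(prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion payload for Turbo S."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

payload = build_chat_request("Explain fast vs. slow thinking in one sentence.")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

With the official `openai` Python SDK, you would pass `base_url=BASE_URL` along with your Tencent Cloud API key when constructing the client, then call `client.chat.completions.create(**payload)`.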
Pricing:
- Input cost: ¥0.8 per million tokens
- Output cost: ¥2 per million tokens
This represents a significant price reduction compared to the previous Hunyuan Turbo model, making it more affordable for users.
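At the announced rates, estimating a workload's bill is a one-line calculation; the token counts below are purely illustrative:

```python
# Cost estimate at the announced Turbo S prices:
# CN¥0.8 per million input tokens, CN¥2 per million output tokens.
INPUT_PRICE_PER_M = 0.8   # CNY per million input tokens
OUTPUT_PRICE_PER_M = 2.0  # CNY per million output tokens

def cost_cny(input_tokens: int, output_tokens: int) -> float:
    """Total cost in CNY for one workload."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: 10M input tokens and 2M output tokens -> 8.00 + 4.00 CNY
print(f"{cost_cny(10_000_000, 2_000_000):.2f} CNY")  # prints "12.00 CNY"
```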
Three Key Upgrades: The Core of Future AI Models
The Turbo S model introduces three major advancements:
- Innovative Model Architecture
Turbo S adopts a Hybrid-Mamba-Transformer fusion architecture, which reduces the computational complexity of traditional Transformer structures and lowers KV-Cache memory usage, cutting both training and inference costs. Tencent engineers highlighted that this fusion addresses the high computational cost large Transformer models face when processing long texts: it leverages Mamba's efficiency on long sequences while retaining the Transformer's strength in capturing complex context. According to Tencent, applying the Mamba architecture to a massive Mixture of Experts (MoE) model in this way is an industry first.
- Overall Performance Improvement
By combining fast- and slow-thinking processes, Turbo S responds rapidly to humanities-related queries while significantly improving its mathematical reasoning capabilities.
- Lower Deployment Costs
The new architecture reduces the computational overhead of traditional Transformers and KV-Cache memory consumption, significantly lowering training and inference costs.
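The KV-Cache savings follow from a simple property: only attention layers must store per-token key/value tensors that grow with context length, while state-space (Mamba-style) layers keep a fixed-size state. A back-of-the-envelope illustration, using hypothetical layer counts and dimensions rather than Tencent's actual configuration:

```python
# Illustrative (not Tencent's numbers): KV-cache size scales with the number
# of attention layers, so replacing most of them with Mamba-style layers
# shrinks the cache proportionally.
def kv_cache_bytes(attn_layers: int, seq_len: int, kv_heads: int,
                   head_dim: int, dtype_bytes: int = 2) -> int:
    """KV-cache size for one sequence: 2 tensors (K and V) per attention layer."""
    return 2 * attn_layers * seq_len * kv_heads * head_dim * dtype_bytes

# Hypothetical 64-layer model, 32k-token context, 8 KV heads of dim 128, fp16.
full = kv_cache_bytes(attn_layers=64, seq_len=32_768, kv_heads=8, head_dim=128)
hybrid = kv_cache_bytes(attn_layers=8, seq_len=32_768, kv_heads=8, head_dim=128)
print(f"all-attention: {full / 2**30:.1f} GiB, "
      f"hybrid (8 attn layers): {hybrid / 2**30:.1f} GiB")
# prints "all-attention: 8.0 GiB, hybrid (8 attn layers): 1.0 GiB"
```

Keeping only a fraction of attention layers cuts the per-sequence cache by the same fraction, which is one reason long-context serving gets cheaper in a hybrid design.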
Tencent has positioned Turbo S as the foundational model for future Hunyuan AI derivatives, supporting advanced reasoning, long-form text processing, and code generation.

The Future: AI Reasoning Model T1 & API Expansion
Building on Turbo S, Tencent has developed the T1 reasoning model, incorporating long chains of thought, retrieval-augmented generation (RAG), and reinforcement learning techniques. The T1 model is already available on Tencent Yuanbao, where users can choose between DeepSeek-R1 and Tencent Hunyuan T1 for AI-generated answers.
Tencent has also announced that the official API for Hunyuan T1 will soon be available, providing external users with access to its reasoning capabilities.