Gemma 4 26B A4B
26 billion parameters, 4 billion active - frontier intelligence at inference speed
Gemma 4 26B A4B is a Mixture-of-Experts model that activates only 4B parameters per token while delivering near-31B quality. With 256K context, 140+ languages, and 88.3% on AIME 2026, it's the most efficient path to frontier-class reasoning.
Model variants
Instruction-tuned and base models
Choose between the instruction-tuned variant optimized for chat and task completion, or the base model for fine-tuning and specialized applications.
Mixture-of-Experts Architecture
25.2B total parameters, 3.8B active per token
Gemma 4 26B A4B uses a sparse MoE design with 8 active experts out of 128 total, plus 1 shared expert. All 25.2B parameters must be loaded into memory because any expert can be selected for a given token, but per-token compute stays close to that of a 4B dense model.
Ideal for high-throughput production deployments where you need near-31B quality at a fraction of the compute cost.
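The trade-off between total and active parameters can be made concrete with some quick arithmetic. The sketch below uses only the 25.2B and 3.8B figures quoted above; the precision formats are illustrative assumptions rather than a list of officially supported quantizations, and the estimate covers weights only (no KV cache, activations, or runtime overhead).

```python
# Rough weight-memory estimate for a sparse MoE model.
# Parameter counts come from the spec above; the precision table is an
# illustrative assumption, not a list of supported formats.
TOTAL_PARAMS = 25.2e9   # all experts must be resident in memory
ACTIVE_PARAMS = 3.8e9   # parameters actually used per token

BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return the weight footprint in decimal gigabytes."""
    return params * bytes_per_param / 1e9


# Share of the parameter pool that does work for any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, size):.1f} GB of weights")
print(f"active per token: {active_fraction:.1%} of total parameters")
```

In other words, memory has to be sized for the full 25.2B pool, while per-token compute scales with the much smaller active share.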
Instruction-tuned
26B Instruct
Optimized for conversational AI and complex task completion
Fine-tuned with RLHF for following instructions and multi-turn dialogue
Pre-trained
26B Base
Foundation MoE model for fine-tuning and specialized applications
Pre-trained on diverse multimodal data with sparse expert routing
Capabilities
Frontier-level performance at 4B inference cost
Gemma 4 26B A4B combines MoE efficiency with advanced reasoning, exceptional coding, and multimodal understanding - delivering near-31B quality at a fraction of the compute.
MoE efficiency
Activates only 3.8B parameters per token from a 25.2B pool. Near-31B quality at ~4B inference cost - the best efficiency ratio in the Gemma 4 family.
Advanced reasoning
Configurable thinking mode enables step-by-step reasoning. Achieves 88.3% on AIME 2026 mathematics - just 0.9 points behind the 31B dense model.
Exceptional coding
77.1% on LiveCodeBench v6 and 1718 Codeforces ELO. Native function calling for agentic workflows and autonomous code execution.
256K context window
Extended context for entire codebases, long documents, and multi-turn conversations. Hybrid local/global attention for memory efficiency.
Multimodal understanding
Processes text and images with variable aspect ratios. 73.8% on MMMU Pro and 82.4% on MATH-Vision for visual reasoning.
140+ languages
Multilingual support with cultural context understanding. 82.6% on MMLU Pro across diverse knowledge domains.
Key highlights
Exceptional performance metrics
Gemma 4 26B A4B achieves near-31B results across diverse benchmarks while activating only 3.8B parameters per token.
Top achievements
- Arena AI ELO 1441 - competitive with the 31B dense model
- 88.3% on AIME 2026 mathematics (no tools)
- 77.1% on LiveCodeBench v6 coding
- 82.3% on GPQA Diamond scientific knowledge
- 85.5% on t2-bench agentic tool use
Technical specs
- 25.2B total parameters, 3.8B active per token
- 8 active + 1 shared expert out of 128 total
- 256K token context window
- Support for 140+ languages
- Hybrid local/global attention mechanism
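To see why mixing local and global attention helps with memory at long context, consider how many key/value positions a single layer has to keep cached. This is a minimal sketch: the window size of 1,024 tokens is a hypothetical value chosen for illustration (the page does not state the real window size or the ratio of local to global layers), and 256K is assumed to mean 262,144 tokens.

```python
# Compare cache size and attention work for a global (full causal) layer
# versus a local (sliding-window) layer.
# WINDOW is a hypothetical value; the actual window size is not given here.
from typing import Optional

SEQ_LEN = 256 * 1024   # assumption: "256K" means 262,144 tokens
WINDOW = 1024          # hypothetical sliding-window size


def kv_cache_entries(seq_len: int, window: Optional[int] = None) -> int:
    """Key/value positions a layer must keep once seq_len tokens are processed."""
    return seq_len if window is None else min(seq_len, window)


def attended_pairs(seq_len: int, window: Optional[int] = None) -> int:
    """Total (query, key) pairs under a causal mask, optionally windowed."""
    if window is None or window >= seq_len:
        return seq_len * (seq_len + 1) // 2
    # The first `window` queries see 1..window keys; the rest see exactly `window`.
    return window * (window + 1) // 2 + (seq_len - window) * window


global_cache = kv_cache_entries(SEQ_LEN)
local_cache = kv_cache_entries(SEQ_LEN, WINDOW)
print(f"global layer caches {global_cache} positions")
print(f"local layer caches {local_cache} positions "
      f"({global_cache // local_cache}x fewer)")
```

Under these assumptions, every layer that can use a local window caches a fixed number of positions regardless of sequence length, which is where the memory saving comes from.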
Performance
Near-31B quality at 4B inference cost
Gemma 4 26B A4B achieves 88.3% on AIME 2026 and 82.6% on MMLU Pro - 0.9 and 2.6 points behind the 31B dense model, respectively - while activating only 3.8B parameters per token.
Gemma 4 26B A4B performs consistently across reasoning, coding, multimodal, and agentic benchmarks - within roughly 1 to 3 percentage points of the 31B dense model on each benchmark reported here.
- Arena AI ELO 1441 - competitive with the 31B dense model
- 88.3% on AIME 2026 mathematics (no tools)
- 77.1% on LiveCodeBench v6 competitive coding
- 82.3% on GPQA Diamond scientific knowledge
- 85.5% on t2-bench agentic tool use
Benchmark comparison
26B MoE vs 31B Dense and the Gemma 4 family
Gemma 4 26B A4B delivers near-31B performance across reasoning, coding, multimodal, and agentic tasks at a fraction of the inference cost.
| Benchmark | Gemma 4 26B A4B IT Thinking | Gemma 4 31B IT Thinking | Gemma 4 E4B IT Thinking | Gemma 3 27B IT |
|---|---|---|---|---|
| Arena AI (text), as of April 2, 2026 | 1441 | 1452 | - | 1365 |
| MMLU Pro (knowledge & reasoning, no tools) | 82.6% | 85.2% | 69.4% | 67.6% |
| MMMU Pro (multimodal reasoning) | 73.8% | 76.9% | 52.6% | 49.7% |
| AIME 2026 (mathematics, no tools) | 88.3% | 89.2% | 42.5% | 20.8% |
| LiveCodeBench v6 (competitive coding) | 77.1% | 80.0% | 52.0% | 29.1% |
| GPQA Diamond (scientific knowledge, no tools) | 82.3% | 84.3% | 58.6% | 42.4% |
| t2-bench (agentic tool use, Retail) | 85.5% | 86.4% | 57.5% | 6.6% |
Benchmark results from the official Gemma 4 model card. Arena AI scores as of April 2, 2026.
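For readers who want the gap to the 31B dense model in numbers, the short sketch below computes it per benchmark. The scores are copied from the comparison table; Arena AI is left out because it is an Elo-style rating rather than a percentage.

```python
# Percentage-point gap between the 26B A4B MoE and the 31B dense model.
# Each entry is (26B A4B score, 31B score), copied from the table above.
SCORES = {
    "MMLU Pro": (82.6, 85.2),
    "MMMU Pro": (73.8, 76.9),
    "AIME 2026": (88.3, 89.2),
    "LiveCodeBench v6": (77.1, 80.0),
    "GPQA Diamond": (82.3, 84.3),
    "t2-bench (Retail)": (85.5, 86.4),
}

# Round to one decimal to avoid floating-point noise in the differences.
gaps = {name: round(dense - moe, 1) for name, (moe, dense) in SCORES.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<20} {gap:.1f} points behind 31B")

largest = max(gaps, key=gaps.get)
print(f"largest gap: {largest} ({gaps[largest]:.1f} points)")
```

The smallest gaps are on AIME 2026 and t2-bench (0.9 points each), and the largest is on MMMU Pro (3.1 points).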
MoE Architecture
26B capacity, 4B inference cost
The Mixture-of-Experts design routes each token through 8 of 128 experts plus 1 shared expert. All 25.2B parameters stay in memory so that any expert can be selected without loading weights on demand, but only 3.8B activate per forward pass - delivering near-31B quality at a fraction of the compute.
- 3.8B active parameters per token from 25.2B total capacity
- 8 active + 1 shared expert out of 128 total experts
- Proportional RoPE (p-RoPE) for efficient 256K context handling
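The routing pattern described above (8 of 128 experts plus 1 shared expert) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Gemma 4's actual router: the experts here are scalar functions, the gate scores are random, and normalizing with a softmax over the selected experts is one common choice that the page does not confirm.

```python
# Toy top-k Mixture-of-Experts routing with one always-on shared expert.
# Expert counts come from the spec above; everything else is illustrative.
import math
import random

NUM_EXPERTS = 128   # routed experts in the pool
TOP_K = 8           # routed experts active per token


def route(logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def make_expert(scale):
    """Hypothetical expert: scales its input by a fixed factor."""
    return lambda x: scale * x


def shared_expert(x):
    """Hypothetical shared expert that runs for every token."""
    return 0.5 * x


def moe_layer(x, router_logits, experts):
    """Add the shared expert's output to the weighted outputs of the top-k experts."""
    out = shared_expert(x)
    for idx, weight in route(router_logits):
        out += weight * experts[idx](x)
    return out


rng = random.Random(0)
experts = [make_expert(rng.uniform(-1, 1)) for _ in range(NUM_EXPERTS)]
router_logits = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

selected = route(router_logits)
y = moe_layer(1.0, router_logits, experts)
print(f"experts evaluated for this token: {len(selected)} routed + 1 shared")
```

Only the selected experts are ever evaluated for a token, which is why compute tracks the active parameter count rather than the total.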
Advanced Reasoning
88.3% on AIME 2026 - within 1 point of the 31B model
Configurable thinking mode enables transparent step-by-step reasoning for mathematics, logic, and multi-step problem solving. The 26B MoE closes the gap with the 31B dense model to under 1 percentage point on the hardest math benchmarks.
- 88.3% on AIME 2026 mathematics (no tools)
- 82.3% on GPQA Diamond graduate-level science
- Built-in reasoning mode with step-by-step explanations
Coding Excellence
77.1% LiveCodeBench v6 with native function calling
With 77.1% on LiveCodeBench v6 and 1718 Codeforces ELO, Gemma 4 26B A4B excels at code generation, debugging, and agentic workflows. Native function calling enables autonomous agents without fine-tuning.
- 77.1% on LiveCodeBench v6 competitive coding problems
- 1718 Codeforces ELO rating
- Native function calling for autonomous agents
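On the application side, function calling usually comes down to parsing a structured tool request from the model and dispatching it to real code. The sketch below shows that loop in its simplest form. The JSON shape, the `get_weather` tool, and the `dispatch` helper are all hypothetical; the wire format Gemma 4 actually emits is not specified on this page and should be taken from the official documentation.

```python
# Minimal tool-dispatch loop for a function-calling model.
# The JSON format and the tool below are hypothetical stand-ins.
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query an external service."""
    return f"Sunny in {city}"


# Registry mapping tool names the model may request to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Parse a tool call and run the matching function with its arguments."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Stand-in for text the model would produce when it decides to call a tool.
model_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
result = dispatch(model_output)
print(result)
```

In an agent loop, `result` would be appended to the conversation so the model can use it in its next turn.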
Multimodal Understanding
Text and image processing with variable resolution
Process text and images together with support for variable aspect ratios and resolutions. 73.8% on MMMU Pro and 82.4% on MATH-Vision demonstrate strong visual reasoning and document understanding.
- 73.8% on MMMU Pro multimodal reasoning
- 82.4% on MATH-Vision visual math problems
- Variable image resolution support (70-1120 tokens)
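The 70-1120 token range has a direct effect on how many images fit in a single prompt. The sketch below works through the budget, assuming that "256K" means 262,144 tokens, that the token range applies per image, and that image tokens count against the same context window as text; `max_images` is a hypothetical helper, not part of any Gemma API.

```python
# Context-budget arithmetic for image inputs.
# Assumptions: 256K = 262,144 tokens, the 70-1120 range is per image, and
# image tokens share the context window with text.
CONTEXT_TOKENS = 256 * 1024
MIN_IMAGE_TOKENS = 70
MAX_IMAGE_TOKENS = 1120


def max_images(context_tokens: int, tokens_per_image: int,
               reserved_text_tokens: int = 0) -> int:
    """Number of whole images that fit after reserving room for text."""
    available = max(context_tokens - reserved_text_tokens, 0)
    return available // tokens_per_image


low_res = max_images(CONTEXT_TOKENS, MIN_IMAGE_TOKENS)
high_res = max_images(CONTEXT_TOKENS, MAX_IMAGE_TOKENS)
high_res_with_text = max_images(CONTEXT_TOKENS, MAX_IMAGE_TOKENS,
                                reserved_text_tokens=8192)

print(f"lowest resolution:  up to {low_res} images")
print(f"highest resolution: up to {high_res} images")
print(f"highest resolution, 8K tokens of text reserved: {high_res_with_text} images")
```

Choosing a lower per-image token budget trades visual detail for the ability to include many more images, or more text, in one request.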
Get started
Try Gemma 4 26B A4B now
Start chatting instantly, or download weights for self-hosted deployment.
Download weights
Self-hosted deployment
Download official model weights for deployment on your infrastructure.
Deploy and scale
Production deployment options
Enterprise-ready deployment on Google Cloud, Kubernetes, or your own infrastructure.
Join the Gemmaverse
Part of the broader Gemma ecosystem
Gemma 4 26B A4B is part of Google's open model family, with extensive community support, integrations, and resources.