Falcon-H1R-7B: A Compact Powerhouse in Reasoning
TII's Falcon-H1R-7B leads math and coding benchmarks at just 7B parameters.
Overview of Falcon-H1R-7B
The Technology Innovation Institute (TII) in Abu Dhabi has released Falcon-H1R-7B, a 7B-parameter reasoning-specialized model that matches or exceeds many 14B to 47B reasoning models on math, code, and general benchmarks while remaining compact and efficient. The model builds on Falcon-H1-7B Base and is available on Hugging Face under the Falcon-H1R collection.
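For readers who want to try the model, below is a minimal loading sketch using the Hugging Face transformers library. The repository id tiiuae/Falcon-H1R-7B is an assumption inferred from the collection name; check the Falcon-H1R collection on Hugging Face for the exact identifier.

```python
# Minimal sketch: loading the model with Hugging Face transformers.
# The repo id "tiiuae/Falcon-H1R-7B" is an assumption; check the
# Falcon-H1R collection on Hugging Face for the published name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1R-7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # place layers on available GPUs/CPU
    trust_remote_code=True,  # hybrid Mamba2 blocks may need custom code
)

prompt = "Prove that the sum of two even integers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```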
Architectural Innovations
Falcon-H1R-7B integrates three design choices: a hybrid architecture that pairs Transformer attention with a Mamba2 backbone, support for a 256k-token context window, and a training methodology that blends supervised long-form reasoning with reinforcement learning via GRPO.
Hybrid Transformer with Mamba2 Architecture
Falcon-H1R-7B is a causal decoder that combines Transformer attention layers with Mamba2 state-space components. The attention blocks handle standard attention-based reasoning, while the Mamba2 layers provide linear-time sequence modeling and more efficient memory use over long contexts.
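The sketch below is a conceptual illustration of how an attention path and a linear-time sequence mixer can coexist in one decoder block. It is not TII's implementation; the SSMStandIn class is only a placeholder for a real Mamba2 selective state-space layer.

```python
# Conceptual sketch of a hybrid decoder block (not TII's implementation).
# A real Mamba2 layer performs a selective state-space scan; SSMStandIn
# below is only a placeholder showing where that mixer would sit.
import torch
import torch.nn as nn

class SSMStandIn(nn.Module):
    """Placeholder for a Mamba2-style linear-time sequence mixer."""
    def __init__(self, dim: int):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=4, padding=3, groups=dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (batch, seq, dim)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        u = self.conv(u.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.out_proj(u * torch.sigmoid(gate))

class HybridBlock(nn.Module):
    """One decoder block mixing attention with an SSM-style path."""
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ssm = SSMStandIn(dim)

    def forward(self, x):
        h = self.norm1(x)
        # Causal self-attention path (quadratic in sequence length).
        mask = torch.triu(torch.ones(x.size(1), x.size(1), dtype=torch.bool,
                                     device=x.device), diagonal=1)
        a, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + a
        # Linear-time sequence-mixing path (Mamba2 in the real model).
        return x + self.ssm(self.norm2(x))

x = torch.randn(1, 16, 256)
print(HybridBlock(256, 8)(x).shape)   # torch.Size([1, 16, 256])
```

The point of the hybrid is that the attention path pays a quadratic cost in sequence length while the state-space path scales linearly, which is what makes very long contexts practical.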
Training Protocol for Reasoning Tasks
Two-Stage Training Pipeline
- First Stage: Cold-start supervised fine-tuning on Falcon-H1-7B Base, using long-form reasoning traces across three domains: mathematics, coding, and science.
- Second Stage: Refinement with GRPO, which rewards correct reasoning chains using symbolic checks for math answers and execution tests for code (a sketch of such verifiable rewards follows this list).
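Below is a minimal sketch of the kind of verifiable rewards described above: a symbolic equivalence check for math answers and an execution test for generated code. This is not TII's reward code; the function names and the use of sympy and subprocess are illustrative choices.

```python
# Illustrative sketch of verifiable rewards: a symbolic check for math
# answers and an execution test for code. Not TII's implementation;
# function names here are our own.
import subprocess
import sys
import sympy

def math_reward(model_answer: str, reference: str) -> float:
    """Return 1.0 if the two expressions are symbolically equivalent."""
    try:
        diff = sympy.simplify(sympy.sympify(model_answer) - sympy.sympify(reference))
        return 1.0 if diff == 0 else 0.0
    except (sympy.SympifyError, TypeError):
        return 0.0

def code_reward(generated_code: str, test_snippet: str, timeout: float = 5.0) -> float:
    """Return 1.0 if the generated code passes the unit-test snippet."""
    program = generated_code + "\n" + test_snippet
    try:
        result = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True, timeout=timeout,
        )
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0

print(math_reward("2*(x + 1)", "2*x + 2"))                  # 1.0
print(code_reward("def add(a, b):\n    return a + b",
                  "assert add(2, 3) == 5"))                 # 1.0
```

In a GRPO-style setup, rewards like these are computed for a group of sampled completions per prompt and converted into group-relative advantages, which avoids training a separate value model.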
Performance Benchmarks
Falcon-H1R-7B posts competitive results across math and coding benchmarks.
- On math evaluations it scores 73.96%, outperforming larger models such as Qwen3-32B.
- Individual results include 88.1% on AIME 24 and 68.6% on LiveCodeBench v6.
General Reasoning Assessment
The model achieves 49.48% overall on general reasoning tasks, demonstrating that a well-optimized 7B model can rival much larger counterparts.
Inference Throughput and Testing Efficiency
Falcon-H1R-7B sustains an inference throughput of roughly 1,000 to 1,800 tokens per second, and its Deep Think methodology uses test-time scaling to push accuracy higher across benchmarks while keeping inference efficient.
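The article does not detail how Deep Think works internally, so the sketch below only illustrates the general idea of test-time scaling via a self-consistency vote: sample several reasoning chains and return the most common final answer. The generate_answer callable is a hypothetical stand-in for a sampled model call.

```python
# Generic test-time scaling sketch (self-consistency voting). This does
# not describe Deep Think's internals; it only shows the idea of spending
# more inference tokens to improve accuracy.
from collections import Counter
from typing import Callable
import random

def self_consistency(prompt: str,
                     generate_answer: Callable[[str], str],
                     n_samples: int = 16) -> str:
    """Sample several reasoning chains and return the majority answer."""
    answers = [generate_answer(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

# Toy usage with a fake sampler that is right most of the time.
def fake_sampler(prompt: str) -> str:
    return random.choices(["42", "41"], weights=[0.7, 0.3])[0]

print(self_consistency("What is 6 * 7?", fake_sampler))   # usually "42"
```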
Key Takeaways
- Falcon-H1R-7B operates at 7B parameters while supporting a 256k-token context window.
- The two-stage training pipeline enhances capabilities in reasoning tasks.
- It demonstrates strong performance in math and coding benchmarks, rivaling models with a much larger parameter count.
- The hybrid architecture yields high inference throughput, roughly 1,000 to 1,800 tokens per second.