
Mistral AI Unveils Magistral Series: Next-Gen Chain-of-Thought LLMs for Enterprises and Open Source

Mistral AI introduces the Magistral series, a new generation of large language models optimized for reasoning and multilingual support, available in both open-source and enterprise versions.

Introducing the Magistral Series by Mistral AI

Mistral AI has launched the Magistral series, a new lineup of large language models (LLMs) optimized for reasoning tasks. Rather than pursuing scale alone, the series focuses on inference-time reasoning: spending additional compute on explicit intermediate steps at generation time, an increasingly important direction in LLM development.

Key Models: Magistral Small and Magistral Medium

The Magistral series includes two main models:

  • Magistral Small: An open-source model with 24 billion parameters, released under the Apache 2.0 license. It supports multilingual reasoning and is available for research and commercial use through Hugging Face (a minimal loading sketch follows this list).
  • Magistral Medium: A proprietary model designed for enterprise use, optimized for real-time deployment via Mistral's cloud and API services, providing enhanced throughput and scalability.
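
Since Magistral Small is distributed through Hugging Face, it can presumably be loaded with the standard transformers workflow. The sketch below is a minimal, hedged example: the repository id "mistralai/Magistral-Small-2506" is an assumption, so check Hugging Face for the exact name before running it.

```python
# Minimal sketch of loading Magistral Small via the `transformers` library.
# The repo id below is an assumption; verify it on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Magistral-Small-2506"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 24B params in bf16 needs roughly 48 GB of GPU memory
    device_map="auto",           # shard across available GPUs
)

messages = [{"role": "user", "content": "What is the derivative of x**3?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```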

Chain-of-Thought Supervision Enhances Reasoning

Both models utilize chain-of-thought (CoT) supervision, which allows them to generate intermediate reasoning steps. This technique improves accuracy, interpretability, and robustness, especially in complex reasoning tasks such as mathematics, legal analysis, and scientific problem solving.
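
In practice, a CoT-trained model emits its intermediate steps before the final answer, often delimited by special markers, so applications can show or hide the trace. A minimal sketch of separating the two, assuming reasoning is wrapped in <think>...</think> tags (the actual delimiters Magistral uses may differ):

```python
import re

def split_reasoning(generation: str) -> tuple[str, str]:
    """Split a model generation into (reasoning_trace, final_answer).

    Assumes the reasoning is wrapped in <think>...</think> tags; the
    delimiters Magistral actually emits may differ.
    """
    match = re.search(r"<think>(.*?)</think>", generation, flags=re.DOTALL)
    if match is None:
        return "", generation.strip()          # no trace found
    trace = match.group(1).strip()
    answer = generation[match.end():].strip()  # everything after the trace
    return trace, answer

text = "<think>3 * 14 = 42, then 42 + 8 = 50.</think>The result is 50."
trace, answer = split_reasoning(text)
print(trace)   # 3 * 14 = 42, then 42 + 8 = 50.
print(answer)  # The result is 50.
```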

Multilingual Capabilities Expand Global Reach

Magistral Small natively supports several languages beyond English, including French, Spanish, Arabic, and Simplified Chinese. This broad language support enables its application across diverse global markets.

Impressive Benchmark Performance

Internal tests show Magistral Medium achieving 73.6% accuracy on the AIME2024 benchmark, increasing to 90% with majority voting. Magistral Small scores 70.7%, rising to 83.3% in ensemble settings, positioning these models competitively among leading LLMs.
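
Majority voting (also called self-consistency) explains the jump in these numbers: the model samples several independent reasoning traces and the most frequent final answer wins. A minimal sketch, where generate_answer is a hypothetical stand-in for a sampled model call:

```python
import random
from collections import Counter

def generate_answer(question: str) -> str:
    """Placeholder for one sampled model call (temperature > 0).
    Simulates a solver that is right most of the time, for illustration."""
    return random.choice(["50", "50", "50", "48"])

def majority_vote(question: str, n_samples: int = 16) -> str:
    """Sample n answers independently and return the most frequent one."""
    answers = [generate_answer(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(majority_vote("What is 3 * 14 + 8?"))  # "50" with high probability
```

Because errors in individual traces tend to be uncorrelated while correct traces converge on the same answer, voting over samples lifts accuracy well above any single generation.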

High Throughput and Low Latency

Magistral Medium delivers inference speeds of up to 1,000 tokens per second, making it suitable for latency-sensitive production environments. These performance improvements stem from custom reinforcement learning pipelines and efficient decoding strategies.
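
To put 1,000 tokens per second in perspective, a back-of-the-envelope latency budget (the response lengths below are illustrative assumptions, not Mistral figures):

```python
# Rough time-to-complete at a sustained decode rate of 1,000 tokens/s.
decode_rate = 1_000  # tokens per second

for response_tokens in (128, 512, 2_048):
    seconds = response_tokens / decode_rate
    print(f"{response_tokens:>5} tokens -> {seconds:.2f} s to stream")
# 128 tokens -> 0.13 s; 512 -> 0.51 s; 2048 -> 2.05 s
```

Even a long, fully reasoned response streams in about two seconds at that rate, which is what makes the model viable for interactive, latency-sensitive workloads.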

Custom Training Pipeline and Alignment

Rather than reusing existing RLHF templates, Mistral developed a bespoke reinforcement learning fine-tuning pipeline designed to produce coherent, high-quality reasoning traces. The models also implement "reasoning language alignment" to maintain consistency across complex outputs, while supporting instruction tuning, code understanding, and function calling.
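
To illustrate the function-calling feature, here is a sketch of a tool definition in the JSON-schema style most chat-completion APIs accept. The weather tool is hypothetical and the exact request shape Mistral's API expects may differ:

```python
import json

# Hypothetical tool definition; Mistral's exact request format may differ.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# When the model decides to use the tool, it returns a structured call
# like this instead of free text; the application executes it and feeds
# the result back into the conversation.
example_call = {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})}
print(example_call)
```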

Industry Impact and Future Outlook

The Magistral series targets regulated industries such as healthcare, finance, and legal tech, where accuracy and explainability are critical. By focusing on inference-time reasoning instead of merely increasing model size, Mistral addresses the need for efficient and effective AI solutions.

Mistral’s dual approach, offering both open-source and proprietary models, caters to a wide range of users, from researchers to enterprises. Public results on benchmarks such as MMLU, GSM8K, and Big-Bench-Hard are anticipated to further validate the series’ capabilities.

Availability

Magistral Small is available on Hugging Face, while a preview of Magistral Medium can be accessed via Le Chat or Mistral’s API platform.
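
For API access, a minimal sketch of querying Magistral Medium, assuming the current mistralai Python SDK and the model identifier "magistral-medium-latest" (both the SDK shape and the preview's model name should be checked against Mistral's documentation):

```python
import os
from mistralai import Mistral  # assumes the v1 `mistralai` Python SDK

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Model id is an assumption; check Mistral's docs for the preview name.
response = client.chat.complete(
    model="magistral-medium-latest",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```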
