Apriel-1.5-15B-Thinker: Frontier Multimodal Reasoning on a Single GPU

What Apriel-1.5-15B-Thinker is

The ServiceNow AI Research Lab has published Apriel-1.5-15B-Thinker, a 15-billion-parameter multimodal reasoning model released with open weights on Hugging Face under an MIT license. The model emphasizes reproducibility and on-premise deployability: the checkpoint fits on a single GPU while delivering frontier-level composite performance across diverse third-party evaluations.
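
As a rough sanity check on the single-GPU claim, the weight memory alone works out to about 30 GB: 15 billion parameters at two bytes each in bfloat16, which fits on a single 40 GB or 80 GB accelerator. The sketch below is a back-of-the-envelope illustration, not an official figure, and it ignores the KV cache, activations, and framework overhead.

```python
# Back-of-the-envelope weight-memory estimate for a 15B-parameter checkpoint.
# Illustration only: excludes KV cache, activations, and framework overhead.
params = 15e9           # parameter count
bytes_per_param = 2     # bfloat16 / float16
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of weights")   # ~30 GB -> fits on one 40-80 GB GPU
```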

Architecture and scaling strategy

Apriel builds on Mistral’s Pixtral-12B-Base-2409 multimodal decoder-vision stack and applies depth upscaling to grow the decoder from 40 to 48 layers. The team then realigns the projection network that connects the vision encoder to the enlarged decoder, avoiding from-scratch pretraining and preserving the single-GPU deployment target.
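
The report does not reproduce the duplication code itself, but depth upscaling is typically implemented by copying existing decoder blocks until the target depth is reached and then continuing training. The sketch below is a hypothetical PyTorch-style illustration; which blocks get duplicated, and the module names, are assumptions rather than Apriel's actual implementation.

```python
import copy
import torch.nn as nn

def depth_upscale(layers: nn.ModuleList, target_depth: int) -> nn.ModuleList:
    """Grow a decoder from len(layers) blocks to target_depth blocks by
    duplicating existing blocks (e.g. 40 -> 48). Repeating the final blocks
    is one common choice; the exact placement is a design decision."""
    extra = target_depth - len(layers)
    assert extra >= 0, "target depth must not shrink the decoder"
    if extra == 0:
        return layers
    # Keep the original blocks and append deep copies of the last `extra` ones.
    new_layers = list(layers) + [copy.deepcopy(block) for block in layers[-extra:]]
    return nn.ModuleList(new_layers)
```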

Mid-training recipe: continual pretraining then SFT

The training pipeline is data-centric and proceeds in two mid-training phases, with no reinforcement learning or preference optimization:

1. Continual pretraining (CPT) on a broad mix of text and multimodal data to build foundational reasoning across language, documents, and images.
2. Supervised fine-tuning (SFT) on high-quality, curated instruction data with explicit reasoning traces.

A data note: roughly 25% of the text used in the depth-upscaling mix comes from NVIDIA’s Nemotron collection.
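
For readers who want to map the recipe onto their own training harness, the two phases can be thought of as a simple stage list, both trained with an ordinary next-token objective. Everything below (field names, mixture labels) is a hypothetical sketch of the structure, not configuration taken from the report.

```python
# Hypothetical two-stage mid-training outline; all names and fields are placeholders.
STAGES = [
    {
        "name": "continual_pretraining",   # phase 1: broad text + multimodal data
        "objective": "next_token_prediction",
        "data": ["text_reasoning_mix", "multimodal_mix"],
        "rl_or_preference_optimization": None,   # explicitly not used
    },
    {
        "name": "supervised_fine_tuning",  # phase 2: curated responses with reasoning traces
        "objective": "next_token_prediction",
        "data": ["instruction_data_with_reasoning_traces"],
        "rl_or_preference_optimization": None,
    },
]

for stage in STAGES:
    print(f"{stage['name']}: {', '.join(stage['data'])}")
```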

Evaluation and benchmark performance

Apriel reports an Artificial Analysis Intelligence Index (AAI) of 52, a composite that aggregates ten third-party evaluations, including MMLU-Pro, GPQA Diamond, AIME 2025, LiveCodeBench, SciCode, and IFBench. Despite being dramatically smaller than many state-of-the-art models, Apriel matches the composite scores of models such as DeepSeek-R1-0528 while offering significant cost savings.
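
For intuition about what a composite like AAI measures, here is a toy aggregation over per-benchmark scores. The equal weighting and the numbers are placeholders chosen purely for illustration; Artificial Analysis defines the actual methodology, and these are not Apriel's reported per-task scores.

```python
from statistics import mean

def composite_index(per_benchmark: dict[str, float]) -> float:
    """Equal-weight mean over 0-100 benchmark scores.
    Equal weighting is an illustrative assumption, not the AAI definition."""
    return mean(per_benchmark.values())

# Placeholder scores only -- not Apriel's reported numbers.
example = {
    "MMLU-Pro": 70.0, "GPQA Diamond": 65.0, "AIME 2025": 80.0,
    "LiveCodeBench": 60.0, "SciCode": 40.0, "IFBench": 55.0,
}
print(round(composite_index(example), 1))  # mean of the six placeholder scores
```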

At the task level, and using VLMEvalKit for reproducibility, Apriel scores competitively across multimodal and math-focused suites such as MMMU, LogicVista, MathVision, MathVerse, MMStar, CharXiv, AI2D, and BLINK, with particularly strong results on documents, diagrams, and text-dominant math imagery.

Practical implications

Apriel’s combination of open weights, a reproducible training recipe, and a single-GPU checkpoint makes it a practical baseline for enterprises and researchers who need on-premise or air-gapped deployments with fixed memory and latency budgets. The model is intended as a cost-efficient, transparent option to evaluate before considering larger closed systems.
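
As a starting point for such an evaluation, the snippet below is a minimal single-GPU loading sketch. It assumes the checkpoint works with the standard transformers AutoProcessor / AutoModelForImageTextToText interfaces; consult the model card for the exact chat template and recommended generation settings.

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

MODEL_ID = "ServiceNow-AI/Apriel-1.5-15b-Thinker"

# Minimal single-GPU loading sketch; see the model card for the exact prompt
# template and recommended generation parameters.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # ~30 GB of weights in bf16
    device_map="cuda:0",          # pin everything to one GPU
)

inputs = processor(text="Explain depth upscaling in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```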

Where to find it

The weights, training recipe, and evaluation protocol are publicly available on Hugging Face under an MIT license for independent verification and experimentation.

Hugging Face model page: https://huggingface.co/ServiceNow-AI/Apriel-1.5-15b-Thinker
Research PDF: https://huggingface.co/ServiceNow-AI/Apriel-1.5-15b-Thinker/blob/main/Apriel-1.5-Thinker.pdf