Yandex Unveils ARGUS: A Billion-Parameter Transformer for Next-Level Recommendations

Yandex has introduced ARGUS (AutoRegressive Generative User Sequential modeling), a transformer-based recommender framework that scales up to one billion parameters. The release marks a major step forward in modeling long-term user behavior and places Yandex among a small group of tech leaders demonstrating large-scale recommender transformers in production.

Scaling recommender transformers to long horizons

Recommender systems have traditionally been held back by short memory, limited scalability, and weak adaptability to changing user behavior. Many architectures truncate user histories to a short window of recent interactions, discarding months or years of data. Truncation misses long-term habits, subtle shifts in taste, and seasonal cycles, and the scalability problem only grows as item catalogs reach billions of items. ARGUS addresses these issues by modeling entire behavioral timelines, giving it a long-horizon view that captures evolving intent and recurring patterns rather than relying solely on recent signals.
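To make the contrast concrete, here is a minimal Python sketch of how much context a model sees when predicting a user's next event, with and without truncation. It is an illustration only, not the ARGUS pipeline: the function name next_event_examples, the window parameter, and the toy listening history are all hypothetical.

```python
from typing import List, Optional, Tuple


def next_event_examples(
    history: List[str], window: Optional[int] = None
) -> List[Tuple[List[str], str]]:
    """Return (context, target) pairs for next-event prediction.

    window=None keeps the full history as context; an integer keeps only
    the most recent `window` events, as truncating architectures do.
    """
    examples = []
    for t in range(1, len(history)):
        start = 0 if window is None else max(0, t - window)
        examples.append((history[start:t], history[t]))
    return examples


# Hypothetical listening history, oldest event first.
history = ["jazz", "rock", "jazz", "carols", "rock", "jazz"]

full = next_event_examples(history)
truncated = next_event_examples(history, window=2)

# Context available when predicting the final event:
print(full[-1][0])       # all five earlier events
print(truncated[-1][0])  # only the two most recent events
```

With a window of two, the earlier jazz listens never reach the model when it predicts the final event, which is the kind of long-term habit that truncation discards.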

Key technical innovations

Real-world deployment and measured gains

ARGUS is already deployed at scale on Yandex’s music platform, where it serves millions of users. Production A/B tests showed significant quality improvements: a 2.26% increase in total listening time (TLT) and a 6.37% increase in the likelihood of a like. These gains are the largest recorded for any deep learning–based recommender model on the platform.

Future directions and implications

Yandex plans to extend ARGUS to real-time recommendation scenarios, explore feature engineering for pairwise ranking, and adapt the framework to high-cardinality domains such as large e-commerce and video platforms. The results suggest that recommender systems can follow a scaling trajectory similar to that of natural language models, unlocking deeper personalization by modeling long user sequences.

ARGUS is documented in a research paper published by the Yandex team, and the framework represents a notable contribution to large-scale recommendation research and production practice.