
JetBrains Launches Mellum: Open-Source Language Model Tailored for Developers

JetBrains has open-sourced Mellum, a 4-billion-parameter language model specialized for programming tasks, aiming to improve AI-assisted software development.

Mellum: A Language Model Built for Coding

Mellum is built for software development tasks rather than general-purpose text generation. JetBrains presents it as part of an engineering-first approach, focusing on code-related applications such as autocompletion, infilling, and structural code understanding.
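To make "infilling" concrete, here is a minimal sketch of how an editor might turn the text around the cursor into a fill-in-the-middle prompt. The sentinel strings and the prefix-suffix-middle ordering follow the StarCoder convention and are assumptions here, not something this article confirms for Mellum; the helper names are hypothetical. Check the model's tokenizer and model card before relying on either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FimTokens:
    # StarCoder-style sentinels: an assumption, not confirmed for Mellum.
    prefix: str = "<fim_prefix>"
    suffix: str = "<fim_suffix>"
    middle: str = "<fim_middle>"


def split_at_cursor(source: str, cursor: int) -> tuple[str, str]:
    """Split editor text into the code before and after the cursor."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor is outside the source text")
    return source[:cursor], source[cursor:]


def build_fim_prompt(before: str, after: str, tokens: FimTokens = FimTokens()) -> str:
    """Lay out prefix and suffix so the model is asked to generate the middle."""
    return f"{tokens.prefix}{before}{tokens.suffix}{after}{tokens.middle}"


# A function with an empty body; the cursor sits at the end of the indented line.
source = "def add(a, b):\n    \n\nprint(add(2, 3))\n"
cursor = source.index("    ") + 4
before, after = split_at_cursor(source, cursor)
prompt = build_fim_prompt(before, after)
```

The model's completion for such a prompt would be inserted at the cursor, which is why both the code before and the code after it are passed in.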

Narrow Yet Deep Specialization

JetBrains describes Mellum as a “focal model,” meaning it is specialized narrowly but deeply for programming workloads. Unlike general-purpose large language models (LLMs), it does not try to cover broad natural-language tasks, which JetBrains says makes it more efficient in integrated development environment (IDE)-style contexts.

Wide Language Support

The model supports numerous programming languages including Java, Kotlin, Python, Go, PHP, C, C++, C#, JavaScript, TypeScript, CSS, HTML, Rust, and Ruby, accommodating the diverse needs of modern polyglot development teams.

Architecture and Training Details

Mellum uses a LLaMA-like architecture and was trained from scratch on over 4.2 trillion tokens drawn from code datasets such as The Stack, StarCoder, and CommitPack, along with English Wikipedia. It features an 8,000-token context window and was trained using bf16 mixed precision on a cluster of 256 NVIDIA H200 GPUs connected via InfiniBand. Training took approximately 20 days.
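The training figures above imply a rough throughput, which the back-of-the-envelope sketch below works out. It assumes the 20 days were continuous wall-clock time on all 256 GPUs and that each token was seen once; neither is stated in the article, so read the result as an order-of-magnitude estimate.

```python
# Figures taken from the article; both are approximate.
TOKENS = 4.2e12            # "over 4.2 trillion tokens", so a lower bound
GPUS = 256
DAYS = 20                  # "approximately 20 days"
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_gpu_second(tokens: float, gpus: int, days: float) -> float:
    """Average tokens processed per GPU per second over the whole run."""
    return tokens / (gpus * days * SECONDS_PER_DAY)


gpu_hours = GPUS * DAYS * 24
rate = tokens_per_gpu_second(TOKENS, GPUS, DAYS)

print(f"GPU-hours: {gpu_hours:,}")
print(f"Tokens per GPU per second: {rate:,.0f}")
```

Under these assumptions the run amounts to about 123,000 GPU-hours and roughly 9,500 tokens per GPU per second.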

Benchmark Performance

JetBrains tested Mellum on several benchmarks reflecting key use cases:

  • RepoBench v1.1 (8K context): exact match (EM) of 27.97% for Python and 31.08% for Java
  • SAFIM (Syntax-Aware Fill-in-the-Middle): pass@1 of 38.11%
  • HumanEval Infilling: single-line 66.21%, multi-line 38.52%, random-span 29.70%

JetBrains presents these results as evidence of Mellum’s strength in structured code understanding, especially on the interrupted or partial code segments common in development workflows.
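For readers unfamiliar with the two metrics reported above, the sketch below shows how exact match and pass@1 are typically computed. It is not JetBrains' evaluation code: whitespace-trimmed comparison for exact match is an assumption, the toy data is invented, and pass@k uses the standard unbiased estimator, which for k = 1 reduces to the fraction of samples that pass.

```python
from math import comb


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Percentage of predictions equal to their reference after trimming whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)


# Toy data, invented for illustration only.
preds = ["return a + b", "x = 1 ", "print(y)"]
refs = ["return a + b", "x = 1", "print(x)"]
em = exact_match(preds, refs)  # two of the three lines match

# One sample per problem, given as (n, c) pairs: two problems solved, two not.
per_problem = [(1, 1), (1, 0), (1, 1), (1, 0)]
pass1 = 100.0 * sum(pass_at_k(n, c, 1) for n, c in per_problem) / len(per_problem)
```

Exact match only rewards a character-for-character hit on the reference, while pass@1 rewards any completion that passes the tests, so the two numbers are not directly comparable.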

Open Sourcing Motivations

JetBrains open-sourced Mellum to foster transparency, allow reuse in custom environments, encourage community collaboration, and provide educational value. Both the base model (Mellum-4b-base) and a Python fine-tuned version (Mellum-4b-sft-python) are available under the Apache 2.0 license on Hugging Face.

Impact on Developer Tools

With Mellum, JetBrains aims to enhance AI-driven developer tooling by offering a compact, efficient model optimized for source code. This fits its broader plan to deploy multiple focal models for specialized programming tasks such as diff generation and code review assistance, supporting cost-effective and context-aware AI integration.

Mellum is a step toward specialized, practical language models for software engineering, and it gives JetBrains a foundation for future AI-assisted development tools.
