
#LiveCodeBench

Records found: 6

#LiveCodeBench 04/11/2025

Choosing the Right Coding LLM in 2025: A Head-to-Head of 7 Leading Systems

A concise comparison of seven leading code-focused LLMs and systems of 2025, outlining strengths, limits, and recommended use cases for engineering teams.


#LiveCodeBench 09/10/2025

RA3: Temporal Action Abstractions to Speed Up RL Post-Training in Code LLMs

RA3 formalizes mid-training as pruning plus horizon shortening and uses temporal action abstractions to accelerate RL post-training, improving results on code generation benchmarks.


#LiveCodeBench 28/08/2025

Hermes 4: Hybrid Reasoning Powers Open-Weight Models to Frontier-Level Performance

Nous Research unveils Hermes 4: open-weight models that use hybrid reasoning, graph-based synthetic data, and large-scale verification to reach frontier-level performance among open-weight models.


#LiveCodeBench 31/07/2025

2025 Coding LLMs: Benchmarking, Metrics, and Top Performers Unveiled

An overview of the 2025 benchmarks and metrics used to evaluate coding large language models, highlighting top performers from OpenAI, Google (Gemini), and Anthropic in real-world developer scenarios.


#LiveCodeBench 07/06/2025

Google AI Unveils MASS: A Breakthrough Framework Optimizing Multi-Agent Systems with Smarter Prompts and Topologies

Google AI and the University of Cambridge introduce MASS, a framework that optimizes multi-agent systems by jointly refining prompts and topologies, improving performance across multiple AI benchmarks.


#LiveCodeBench 08/05/2025

NVIDIA Releases Open-Source Open Code Reasoning Models with Unmatched Code Intelligence

NVIDIA has released its Open Code Reasoning models (32B, 14B, 7B) as open source under the Apache 2.0 license, delivering strong performance on code reasoning tasks and broad compatibility with popular AI frameworks.
