Together AI Launches DeepSWE: Open-Source RL-Trained Coding Agent Achieving Top SWEBench Scores
Together AI has launched DeepSWE, an open-source, reinforcement learning-trained coding agent based on Qwen3-32B, achieving top scores on the SWEBench benchmark and setting new standards for autonomous software engineering AI.
DeepSWE: A New Era in AI-Powered Software Engineering
Together AI has introduced DeepSWE, a cutting-edge software engineering agent trained entirely through reinforcement learning (RL). Built on the Qwen3-32B language model, DeepSWE achieves 59% accuracy on the SWEBench-Verified benchmark with test-time scaling, and 42.2% when judged on a single attempt (Pass@1). This places it at the top of the leaderboard among open-weight models and marks a significant advancement in autonomous coding AI.
Reinforcement Learning Revolutionizes Code Generation
Rather than relying on traditional supervised fine-tuning, DeepSWE was trained with rLLM, Agentica's modular RL framework for language agents. This approach lets the agent adapt and improve through feedback from real execution outcomes instead of static datasets alone. The training pipeline uses R2EGym, a dataset of software engineering tasks tailored for RL-style training, focused on actionable objectives such as bug fixing, function completion, and code editing. The method mirrors how human engineers iteratively refine their work based on outcomes.
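The outcome-driven loop described above can be sketched in miniature. The environment, reward, and one-parameter "policy" below are hypothetical stand-ins (the real pipeline uses the rLLM framework, a 32B-parameter policy, and R2EGym task environments); the sketch only illustrates the core idea that the reward signal comes from whether a generated edit actually passes tests.

```python
# Toy sketch of outcome-based RL for a coding agent.
# All names here are illustrative stand-ins, not the rLLM API.
import random

def run_tests(patch: str) -> float:
    """Stand-in reward: did the patched code pass the hidden tests?
    Here we fake it: any patch containing 'fix' passes everything."""
    return 1.0 if "fix" in patch else 0.0

def sample_patch(policy: dict) -> str:
    """Stand-in for sampling an edit from the language-model policy."""
    return random.choices(
        ["fix bug", "noop"],
        weights=[policy["p_fix"], 1 - policy["p_fix"]],
    )[0]

def train(tasks, steps=200, lr=0.05, baseline=0.5, seed=0):
    random.seed(seed)
    policy = {"p_fix": 0.5}  # toy one-parameter "policy"
    for _ in range(steps):
        random.choice(tasks)              # pick a task (unused in this toy reward)
        patch = sample_patch(policy)
        reward = run_tests(patch)         # outcome signal, like passing unit tests
        # REINFORCE-style nudge: reinforce actions that beat the baseline
        advantage = reward - baseline
        if "fix" in patch:
            policy["p_fix"] += lr * advantage
        else:
            policy["p_fix"] -= lr * advantage
        policy["p_fix"] = min(max(policy["p_fix"], 0.01), 0.99)
    return policy

policy = train(["bug-report-1", "bug-report-2"])
print(policy["p_fix"])  # drifts toward preferring the reward-earning edit
```

The key design point mirrored here is that no ground-truth patch is ever shown to the learner; the only supervision is whether the environment's tests pass, which is what distinguishes this setup from supervised fine-tuning on reference solutions.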
Benchmark Performance and Capabilities
DeepSWE’s 59% score on SWEBench-Verified demonstrates a substantial improvement over previous open-weight models, and its 42.2% Pass@1 score highlights its ability to solve problems correctly on the first attempt. These results demonstrate the strength of RL-based training in sharpening an agent’s reasoning and precision in code synthesis, while the Qwen3-32B architecture keeps the model practical to deploy in real-world software development environments.
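For readers unfamiliar with the metric: Pass@1 is the probability that a single sampled solution is correct. When more than one sample per problem is collected, the standard unbiased estimator (introduced for HumanEval-style benchmarks, not specific to this article) computes pass@k from n samples of which c are correct:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: with n samples and c correct,
    returns 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k draw must include a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# 10 samples per problem, 4 of them correct:
print(pass_at_k(10, 4, 1))  # 0.4 — at k = 1 this is just the fraction correct
print(pass_at_k(10, 4, 5))  # higher: five tries give more chances to succeed
```

At k = 1 the estimator reduces to the plain fraction of correct samples, which is why Pass@1 is read as "solved on the first attempt".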
Commitment to Open Source and Community Collaboration
Together AI and Agentica have fully open-sourced DeepSWE along with the entire training setup, including the rLLM framework, R2EGym dataset, and training scripts. This transparency promotes reproducibility and invites developers and researchers to extend or customize the agent for various applications. Resources are available via:
- Model Weights: Hugging Face – DeepSWE
- Training Framework: rLLM GitHub Repository
- Training Documentation: DeepSWE Training Overview
From Language Models to Adaptive Language Agents
DeepSWE represents a shift from static language reasoning models to dynamic agents that learn continuously through interaction. Reinforcement learning enables these agents to improve post-deployment by adapting to new challenges and workflows. This paradigm supports local deployment and customization for specific organizational needs, potentially benefiting domains like web navigation, robotics, and autonomous research assistance.
DeepSWE’s release is a pivotal step towards more intelligent, action-oriented AI agents in software engineering, combining state-of-the-art language modeling with reinforcement learning to produce adaptive, high-performance coding assistants.