Together AI Launches DeepSWE: Open-Source RL-Trained Coding Agent Achieving Top SWEBench Scores
Together AI has launched DeepSWE, an open-source, reinforcement learning-trained coding agent based on Qwen3-32B, achieving top scores on the SWEBench benchmark and setting new standards for autonomous software engineering AI.
DeepSWE: A New Era in AI-Powered Software Engineering
Together AI has introduced DeepSWE, a cutting-edge software engineering agent trained entirely through reinforcement learning (RL). Built on the Qwen3-32B language model, DeepSWE achieves 59% accuracy on the SWEBench-Verified benchmark with test-time scaling, and 42.2% when judged on a single attempt (Pass@1). This places it at the top of the leaderboard among open-weight models and marks a significant advancement in autonomous coding AI.
Reinforcement Learning Revolutionizes Code Generation
Rather than relying on traditional supervised fine-tuning, DeepSWE was trained with rLLM, Agentica's modular RL framework for language agents. This approach lets the agent adapt and improve through feedback from real execution outcomes instead of static datasets alone. The training pipeline uses R2EGym, a dataset of software engineering tasks tailored for RL-style training, focused on actionable objectives such as bug fixing, function completion, and code editing. The method mirrors how human engineers iteratively refine their work based on outcomes.
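The outcome-driven loop described above can be sketched in miniature. The environment, reward, and one-parameter "policy" below are hypothetical stand-ins (the real pipeline uses the rLLM framework, a 32B-parameter policy, and R2EGym task environments); the sketch only illustrates the core idea that the reward signal comes from whether a generated edit actually passes tests.

```python
# Toy sketch of outcome-based RL for a coding agent.
# All names here are illustrative stand-ins, not the rLLM API.
import random

def run_tests(patch: str) -> float:
    """Stand-in reward: did the patched code pass the hidden tests?
    Here we fake it: any patch containing 'fix' passes everything."""
    return 1.0 if "fix" in patch else 0.0

def sample_patch(policy: dict) -> str:
    """Stand-in for sampling an edit from the language-model policy."""
    return random.choices(
        ["fix bug", "noop"],
        weights=[policy["p_fix"], 1 - policy["p_fix"]],
    )[0]

def train(tasks, steps=200, lr=0.05, baseline=0.5, seed=0):
    random.seed(seed)
    policy = {"p_fix": 0.5}  # toy one-parameter "policy"
    for _ in range(steps):
        random.choice(tasks)              # pick a task (unused in this toy reward)
        patch = sample_patch(policy)
        reward = run_tests(patch)         # outcome signal, like passing unit tests
        # REINFORCE-style nudge: reinforce actions that beat the baseline
        advantage = reward - baseline
        if "fix" in patch:
            policy["p_fix"] += lr * advantage
        else:
            policy["p_fix"] -= lr * advantage
        policy["p_fix"] = min(max(policy["p_fix"], 0.01), 0.99)
    return policy

policy = train(["bug-report-1", "bug-report-2"])
print(policy["p_fix"])  # drifts toward preferring the reward-earning edit
```

The key design point mirrored here is that no ground-truth patch is ever shown to the learner; the only supervision is whether the environment's tests pass, which is what distinguishes this setup from supervised fine-tuning on reference solutions.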
Benchmark Performance and Capabilities
DeepSWE’s 59% score on SWEBench-Verified demonstrates a substantial improvement over previous open-weight models, and its 42.2% Pass@1 score highlights its ability to solve problems correctly on the first attempt. These results demonstrate the strength of RL-based training in sharpening an agent’s reasoning and precision in code synthesis, while the Qwen3-32B architecture keeps the model practical to deploy in real-world software development environments.
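For readers unfamiliar with the metric: Pass@1 is the probability that a single sampled solution is correct. When more than one sample per problem is collected, the standard unbiased estimator (introduced for HumanEval-style benchmarks, not specific to this article) computes pass@k from n samples of which c are correct:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: with n samples and c correct,
    returns 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k draw must include a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# 10 samples per problem, 4 of them correct:
print(pass_at_k(10, 4, 1))  # 0.4 — at k = 1 this is just the fraction correct
print(pass_at_k(10, 4, 5))  # higher: five tries give more chances to succeed
```

At k = 1 the estimator reduces to the plain fraction of correct samples, which is why Pass@1 is read as "solved on the first attempt".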
Commitment to Open Source and Community Collaboration
Together AI and Agentica have fully open-sourced DeepSWE along with the entire training setup, including the rLLM framework, R2EGym dataset, and training scripts. This transparency promotes reproducibility and invites developers and researchers to extend or customize the agent for various applications. Resources are available via:
- Model Weights: Hugging Face – DeepSWE
- Training Framework: rLLM GitHub Repository
- Training Documentation: DeepSWE Training Overview
From Language Models to Adaptive Language Agents
DeepSWE represents a shift from static language reasoning models to dynamic agents that learn continuously through interaction. Reinforcement learning enables these agents to improve post-deployment by adapting to new challenges and workflows. This paradigm supports local deployment and customization for specific organizational needs, potentially benefiting domains like web navigation, robotics, and autonomous research assistance.
DeepSWE’s release is a pivotal step towards more intelligent, action-oriented AI agents in software engineering, combining state-of-the-art language modeling with reinforcement learning to produce adaptive, high-performance coding assistants.