SICA: The Self-Improving Coding Agent Revolutionizing Autonomous Software Development
Researchers have introduced SICA, a novel coding agent capable of iteratively improving its own code and performance, demonstrating significant gains on software engineering benchmarks.
Advancing Agentic Systems Beyond Fixed Strategies
Agentic systems, which embed large language models (LLMs) within frameworks that enable tool use and autonomous decision-making, have seen remarkable progress. However, most current implementations depend on fixed, hand-crafted orchestration strategies that limit adaptability to new and complex tasks, especially in software engineering where flexibility is crucial.
Introducing SICA: A Unified Self-Improving Agent
Researchers from the University of Bristol and iGent AI have developed SICA (Self-Improving Coding Agent), an innovative architecture that empowers the agent to iteratively enhance its own code and performance without external intervention. Unlike prior approaches that separate task execution and self-improvement roles into different agents, SICA consolidates these functions into one. This creates a continuous self-directed feedback loop where the agent evaluates its past performance, identifies weaknesses, and updates its own implementation.
Architecture and Mechanisms Behind SICA
SICA operates on a minimal, extensible base agent equipped with tools for manipulating its codebase, navigating directories, executing shell commands, and invoking sub-agents. Its iterative process follows three steps: evaluate, select, and revise.
- The agent benchmarks its performance on predefined tasks using a utility function that factors in accuracy, time, and cost.
- It stores results and selects effective prior versions as a foundation for improvements.
- The architecture includes a sub-agent structure to break down problems and manage context within LLM constraints.
- An asynchronous oversight system monitors the agent’s progress, halting execution if it diverges or stalls.
- Self-editing tools such as SmartEditor, AST-based symbol locators, and diff summarizers enable precise modifications to its own behavior (a minimal locator sketch follows this list).
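To make the last point concrete, here is a minimal Python sketch of what an AST-based symbol locator can look like. The function name, file path, and return format are illustrative assumptions, not SICA's actual tooling.

```python
# Illustrative AST-based symbol locator (an assumption, not SICA's actual tool).
# It parses a Python source file and reports the line span where a named
# function or class is defined, so an editing tool can target exact ranges.
import ast
from pathlib import Path


def locate_symbol(path: str, symbol: str):
    """Return (start_line, end_line) for a def/class named `symbol`, or None."""
    tree = ast.parse(Path(path).read_text())
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)) \
                and node.name == symbol:
            return node.lineno, node.end_lineno  # end_lineno requires Python 3.8+
    return None


# Hypothetical usage: find where `run_benchmark` is defined in an agent module.
# print(locate_symbol("agent/benchmarks.py", "run_benchmark"))  # e.g. (42, 87)
```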
This design allows SICA to perform controlled experiments on its own architecture and deploy updates that lead to measurable performance gains.
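Based on that description, the following is a condensed Python sketch of how an evaluate-select-revise loop could be wired together. The utility weights, cost and time caps, and the `evaluate`/`revise` interfaces are illustrative assumptions, not the authors' implementation or the paper's reported values.

```python
# Condensed sketch of an evaluate-select-revise cycle. All names, weights, and
# the evaluate/revise interfaces are illustrative assumptions, not the paper's.
from dataclasses import dataclass


@dataclass
class RunResult:
    accuracy: float   # fraction of benchmark tasks solved (0..1)
    cost_usd: float   # total LLM spend for the run
    seconds: float    # wall-clock time for the run


def utility(r: RunResult, max_cost: float = 10.0, max_time: float = 3600.0) -> float:
    """Reward accuracy and penalise cost and latency; weights are placeholders."""
    cost_term = 1.0 - min(r.cost_usd, max_cost) / max_cost
    time_term = 1.0 - min(r.seconds, max_time) / max_time
    return 0.6 * r.accuracy + 0.2 * cost_term + 0.2 * time_term


def self_improvement_loop(evaluate, revise, base_agent, iterations: int = 5):
    """evaluate(agent) -> RunResult; revise(agent) -> a new agent version.
    Keeps an archive of (agent, score) pairs so the best prior version is
    always the starting point for the next self-edit."""
    archive = [(base_agent, None)]
    for _ in range(iterations):
        # 1. Evaluate: benchmark any archived version that lacks a score.
        for i, (agent, score) in enumerate(archive):
            if score is None:
                archive[i] = (agent, utility(evaluate(agent)))
        # 2. Select: take the highest-utility version seen so far.
        best_agent, _ = max(archive, key=lambda pair: pair[1])
        # 3. Revise: the selected agent edits its own code; queue the result.
        archive.append((revise(best_agent), None))
    # Return the best version that has actually been benchmarked.
    return max((p for p in archive if p[1] is not None), key=lambda p: p[1])
```

In the real system, an asynchronous overseer would additionally watch each evaluation or revision run and cancel it if the agent stalls or diverges; that machinery is omitted here for brevity.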
Empirical Results Demonstrate Significant Gains
The team tested SICA on multiple benchmarks, including SWE-bench Verified, LiveCodeBench, and synthetic tasks focused on file editing and symbol location. The results were striking:
- Accuracy on SWE-bench Verified improved from 17% to 53%.
- File editing accuracy increased from 82% to 94%.
- Execution latency and resource usage were optimized, lowering average cost and time per task.
These improvements stemmed not from updating the underlying LLM weights but from refining tool orchestration, file management strategies, and problem decomposition heuristics.
Limitations and Future Directions
SICA’s improvements were less pronounced on reasoning-heavy tasks such as AIME and GPQA, where the base LLM was already performing near its ceiling. Introducing certain tool-based reasoning steps sometimes disrupted pretrained reasoning models, highlighting the need for better co-training methods that integrate agent logic with model behavior.
Towards Autonomous and Transparent Agentic Systems
SICA showcases a viable path for autonomous improvement in software agents by unifying execution and self-editing. The framework also addresses safety and transparency through LLM-based oversight and structured execution traces, ensuring control and observability.
This research lays the groundwork for hybrid optimization approaches where both the agent’s architecture and underlying models evolve together, paving the way for more adaptive and efficient software engineering agents.