Understanding the Challenges of Agentic AI Systems
New research sheds light on why agentic AI systems struggle in real-world applications.
Overview of Agentic AI Systems
Agentic AI systems sit on top of large language models and connect them to tools, memory, and external environments. They support scientific discovery, software development, and clinical research, but they struggle with unreliable tool use, weak long-horizon planning, and poor generalization. The research paper "Adaptation of Agentic AI" from Stanford, Harvard, UC Berkeley, and Caltech proposes a unified view of how these systems should adapt, mapping existing methods into a mathematically defined framework.
How This Research Paper Models an Agentic AI System
The research defines an agentic AI system as a foundation model agent with three key components:
- Planning Module: Decomposes goals into sequences of actions using methods such as Chain-of-Thought and Tree-of-Thought.
- Tool Use Module: Connects the agent to various tools, such as web search engines and APIs.
- Memory Module: Stores both short-term context and long-term knowledge.
Adaptation modifies prompts or parameters for these components using techniques like supervised fine-tuning and reinforcement learning.
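To make this structure concrete, here is a minimal Python sketch of how the three modules might fit together. The class names, the `tool:query` step convention, and the toy search tool are illustrative assumptions, not interfaces from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Memory:
    """Memory module: short-term context plus a simple long-term store."""
    short_term: List[str] = field(default_factory=list)
    long_term: Dict[str, str] = field(default_factory=dict)


@dataclass
class Agent:
    """Foundation-model agent with planning, tool-use, and memory modules."""
    plan: Callable[[str], List[str]]        # planning module: goal -> ordered steps
    tools: Dict[str, Callable[[str], str]]  # tool-use module: name -> callable tool
    memory: Memory                          # memory module

    def run(self, goal: str) -> List[str]:
        results = []
        for step in self.plan(goal):
            # Illustrative convention: a step written as "tool:query" triggers a tool call.
            name, _, query = step.partition(":")
            if query and name in self.tools:
                out = self.tools[name](query)
            else:
                out = step  # plain reasoning step, no tool involved
            self.memory.short_term.append(out)
            results.append(out)
        return results


if __name__ == "__main__":
    agent = Agent(
        plan=lambda goal: [f"search:{goal}", "summarize findings"],
        tools={"search": lambda q: f"top result for '{q}'"},
        memory=Memory(),
    )
    print(agent.run("agent adaptation methods"))
```

Adaptation then amounts to updating pieces of this object: the prompts or weights behind `plan`, the tools themselves, or how memory is written and read.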
Four Adaptation Paradigms
The framework defines four adaptation paradigms along two binary dimensions: the adaptation target (agent vs. tool) and the supervision signal (tool execution vs. agent output), yielding the four paradigms below (sketched in code after the list):
- A1: Tool-Execution-Signaled Agent Adaptation
- A2: Agent-Output-Signaled Agent Adaptation
- T1: Agent-Agnostic Tool Adaptation
- T2: Agent-Supervised Tool Adaptation
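The 2x2 grid can be written down directly as a small lookup table. The enum names and the mapping of T1/T2 onto the signal dimension below follow the descriptions in this article; this is an illustrative sketch, not code from the paper.

```python
from enum import Enum


class Target(Enum):
    AGENT = "agent adaptation"
    TOOL = "tool adaptation"


class Signal(Enum):
    TOOL_EXECUTION = "tool execution feedback"
    AGENT_OUTPUT = "final agent output"


# The 2x2 grid, keyed by (adaptation target, supervision signal).
PARADIGMS = {
    (Target.AGENT, Signal.TOOL_EXECUTION): "A1: tool-execution-signaled agent adaptation",
    (Target.AGENT, Signal.AGENT_OUTPUT): "A2: agent-output-signaled agent adaptation",
    (Target.TOOL, Signal.TOOL_EXECUTION): "T1: agent-agnostic tool adaptation",
    (Target.TOOL, Signal.AGENT_OUTPUT): "T2: agent-supervised tool adaptation",
}

if __name__ == "__main__":
    for (target, signal), name in PARADIGMS.items():
        print(f"{name}  (target={target.value}, signal={signal.value})")
```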
A1: Learning from Verifiable Tool Feedback
In A1, the agent receives an input and produces tool calls, and the learning objective scores whether those calls execute successfully. Methods like Toolformer and DeepRetrieval use feedback from tool execution to improve the agent.
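Here is a rough sketch of what such a tool-execution reward might look like, assuming a simple binary success check. The `fake_search` function is a stand-in for a real search API, and this is not the actual training objective of Toolformer or DeepRetrieval.

```python
def tool_execution_reward(tool_call: str, tool, gold_answer: str) -> float:
    """A1-style reward: score the agent's tool call by whether executing it
    succeeds, here approximated as the gold answer appearing in the result."""
    try:
        result = tool(tool_call)
    except Exception:
        return 0.0  # a call that fails to execute earns no reward
    return 1.0 if gold_answer.lower() in result.lower() else 0.0


def fake_search(query: str) -> str:
    """Stand-in for a real search API so the example runs offline."""
    corpus = {"capital of france": "Paris is the capital of France."}
    key = "".join(c for c in query.lower() if c.isalpha() or c == " ").strip()
    return corpus.get(key, "no results")


if __name__ == "__main__":
    # Two candidate tool calls the agent might emit; only the first one succeeds.
    print(tool_execution_reward("capital of France?", fake_search, "Paris"))   # 1.0
    print(tool_execution_reward("france capital city", fake_search, "Paris"))  # 0.0
```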
A2: Learning from Final Agent Outputs
This paradigm optimizes the agent based on its final outputs rather than on individual tool calls. Because the learning signal comes only from the final answer, additional supervision on tool calls is needed to keep the agent from learning to ignore its tools.
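A minimal sketch of an A2-style reward follows, assuming an exact-match check on the final answer plus a small bonus for invoking tools at all; the 0.1 weight and the `min_tool_calls` threshold are invented for illustration, not values from the paper.

```python
from typing import List


def final_output_reward(final_answer: str, gold_answer: str,
                        tool_calls: List[str], min_tool_calls: int = 1) -> float:
    """A2-style reward: score only the agent's final answer, plus a small
    auxiliary bonus for actually invoking tools, so the agent does not learn
    to answer purely from parametric memory and ignore its tools."""
    answer_reward = 1.0 if final_answer.strip().lower() == gold_answer.strip().lower() else 0.0
    tool_bonus = 0.1 if len(tool_calls) >= min_tool_calls else 0.0
    return answer_reward + tool_bonus


if __name__ == "__main__":
    print(final_output_reward("Paris", "paris", ["search: capital of France"]))  # 1.1
    print(final_output_reward("Lyon", "Paris", []))                              # 0.0
```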
T1: Agent-Agnostic Tool Training
This approach freezes the main agent and optimizes the tools against their own, agent-independent objectives, measured by metrics such as retrieval accuracy.
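For example, a retriever trained as a standalone tool can be evaluated with recall@k, with no agent in the loop. The toy relevance labels and ranked list below are invented for illustration.

```python
from typing import List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Agent-agnostic tool metric: fraction of the relevant documents that
    appear in the retriever's top-k results, computed with no agent in the loop."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


if __name__ == "__main__":
    relevant_docs = {"d2", "d7"}                    # gold labels for one query
    ranked = ["d2", "d9", "d7", "d1", "d3"]         # output of the retriever being trained
    print(recall_at_k(ranked, relevant_docs, k=3))  # 1.0: both relevant docs are in the top 3
```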
T2: Tools Optimized Under a Frozen Agent
In T2, a powerful, frozen agent provides the learning signal for the tool: the tool is optimized so that the fixed agent, consuming its outputs, succeeds at the task. This setup has been applied in recent systems such as s3 and AgentFlow.
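A sketch of how that signal might flow, assuming a binary reward on the frozen agent's final answer; the lambda-based retriever and agent are placeholders, and this is not the actual s3 or AgentFlow implementation.

```python
from typing import Callable, List


def agent_supervised_tool_reward(query: str,
                                 retriever: Callable[[str], List[str]],
                                 frozen_agent: Callable[[str, List[str]], str],
                                 gold_answer: str) -> float:
    """T2-style signal: the tool (here a retriever) is scored by whether a
    frozen agent, reading the tool's output, reaches the correct final answer.
    The agent's weights never change; only the tool is being optimized."""
    docs = retriever(query)
    answer = frozen_agent(query, docs)
    return 1.0 if answer.strip().lower() == gold_answer.strip().lower() else 0.0


if __name__ == "__main__":
    # Stand-ins for a real retriever and a frozen foundation-model agent.
    retriever = lambda q: ["Paris is the capital of France."]
    frozen_agent = lambda q, docs: "Paris" if any("Paris" in d for d in docs) else "unknown"
    print(agent_supervised_tool_reward("What is the capital of France?",
                                       retriever, frozen_agent, "Paris"))  # 1.0
```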
Key Takeaways
- The research presents a structured framework for adapting agentic AI systems using various adaptation paradigms.
- A1 methods learn from tool-execution feedback, while A2 methods learn from the agent's final outputs; the two target the same agent but draw on different supervision signals.
- T1 and T2 shift the learning focus from the agent to its tools and memory, which can improve robustness and scalability in practical AI applications.