Understanding the Challenges of Agentic AI Systems
New research sheds light on why agentic AI systems struggle in real-world applications.
Overview of Agentic AI Systems
Agentic AI systems sit on top of large language models and connect them to tools, memory, and external environments. They support scientific discovery, software development, and clinical research, but they struggle with unreliable tool use, weak long-horizon planning, and poor generalization. The research paper "Adaptation of Agentic AI" from Stanford, Harvard, UC Berkeley, and Caltech proposes a unified view of how these systems should adapt, mapping existing methods into a mathematically defined framework.
How This Research Paper Models an Agentic AI System
The research defines an agentic AI system as a foundation model agent with three key components:
- Planning Module: Decomposes goals into sequences of actions using methods such as Chain-of-Thought and Tree-of-Thought.
- Tool Use Module: Connects the agent to various tools, such as web search engines and APIs.
- Memory Module: Stores both short-term context and long-term knowledge.
Adaptation modifies prompts or parameters for these components using techniques like supervised fine-tuning and reinforcement learning.
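To make this structure concrete, here is a minimal Python sketch of how the three modules might fit together. The class names, the `tool:query` step convention, and the toy search tool are illustrative assumptions, not interfaces from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Memory:
    """Memory module: short-term context plus a simple long-term store."""
    short_term: List[str] = field(default_factory=list)
    long_term: Dict[str, str] = field(default_factory=dict)


@dataclass
class Agent:
    """Foundation-model agent with planning, tool-use, and memory modules."""
    plan: Callable[[str], List[str]]        # planning module: goal -> ordered steps
    tools: Dict[str, Callable[[str], str]]  # tool-use module: name -> callable tool
    memory: Memory                          # memory module

    def run(self, goal: str) -> List[str]:
        results = []
        for step in self.plan(goal):
            # Illustrative convention: a step written as "tool:query" triggers a tool call.
            name, _, query = step.partition(":")
            if query and name in self.tools:
                out = self.tools[name](query)
            else:
                out = step  # plain reasoning step, no tool involved
            self.memory.short_term.append(out)
            results.append(out)
        return results


if __name__ == "__main__":
    agent = Agent(
        plan=lambda goal: [f"search:{goal}", "summarize findings"],
        tools={"search": lambda q: f"top result for '{q}'"},
        memory=Memory(),
    )
    print(agent.run("agent adaptation methods"))
```

Adaptation then amounts to updating pieces of this object: the prompts or weights behind `plan`, the tools themselves, or how memory is written and read.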
Four Adaptation Paradigms
The framework defines four adaptation paradigms along two binary dimensions: the adaptation target (agent vs. tool) and the supervision signal (tool execution vs. agent output), yielding the four paradigms below (sketched in code after the list):
- A1: Tool-Execution-Signaled Agent Adaptation
- A2: Agent-Output-Signaled Agent Adaptation
- T1: Agent-Agnostic Tool Adaptation
- T2: Agent-Supervised Tool Adaptation
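The 2x2 grid can be written down directly as a small lookup table. The enum names and the mapping of T1/T2 onto the signal dimension below follow the descriptions in this article; this is an illustrative sketch, not code from the paper.

```python
from enum import Enum


class Target(Enum):
    AGENT = "agent adaptation"
    TOOL = "tool adaptation"


class Signal(Enum):
    TOOL_EXECUTION = "tool execution feedback"
    AGENT_OUTPUT = "final agent output"


# The 2x2 grid, keyed by (adaptation target, supervision signal).
PARADIGMS = {
    (Target.AGENT, Signal.TOOL_EXECUTION): "A1: tool-execution-signaled agent adaptation",
    (Target.AGENT, Signal.AGENT_OUTPUT): "A2: agent-output-signaled agent adaptation",
    (Target.TOOL, Signal.TOOL_EXECUTION): "T1: agent-agnostic tool adaptation",
    (Target.TOOL, Signal.AGENT_OUTPUT): "T2: agent-supervised tool adaptation",
}

if __name__ == "__main__":
    for (target, signal), name in PARADIGMS.items():
        print(f"{name}  (target={target.value}, signal={signal.value})")
```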
A1: Learning from Verifiable Tool Feedback
In A1, the agent receives an input and produces tool calls, and the learning objective scores whether those calls execute successfully. Methods like Toolformer and DeepRetrieval use feedback from tool execution to improve the agent.
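Here is a rough sketch of what such a tool-execution reward might look like, assuming a simple binary success check. The `fake_search` function is a stand-in for a real search API, and this is not the actual training objective of Toolformer or DeepRetrieval.

```python
def tool_execution_reward(tool_call: str, tool, gold_answer: str) -> float:
    """A1-style reward: score the agent's tool call by whether executing it
    succeeds, here approximated as the gold answer appearing in the result."""
    try:
        result = tool(tool_call)
    except Exception:
        return 0.0  # a call that fails to execute earns no reward
    return 1.0 if gold_answer.lower() in result.lower() else 0.0


def fake_search(query: str) -> str:
    """Stand-in for a real search API so the example runs offline."""
    corpus = {"capital of france": "Paris is the capital of France."}
    key = "".join(c for c in query.lower() if c.isalpha() or c == " ").strip()
    return corpus.get(key, "no results")


if __name__ == "__main__":
    # Two candidate tool calls the agent might emit; only the first one succeeds.
    print(tool_execution_reward("capital of France?", fake_search, "Paris"))   # 1.0
    print(tool_execution_reward("france capital city", fake_search, "Paris"))  # 0.0
```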
A2: Learning from Final Agent Outputs
This paradigm optimizes the agent based on its final outputs rather than on individual tool calls. Because the learning signal comes only from the final answer, additional supervision on tool calls is needed to keep the agent from learning to ignore its tools.
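A minimal sketch of an A2-style reward follows, assuming an exact-match check on the final answer plus a small bonus for invoking tools at all; the 0.1 weight and the `min_tool_calls` threshold are invented for illustration, not values from the paper.

```python
from typing import List


def final_output_reward(final_answer: str, gold_answer: str,
                        tool_calls: List[str], min_tool_calls: int = 1) -> float:
    """A2-style reward: score only the agent's final answer, plus a small
    auxiliary bonus for actually invoking tools, so the agent does not learn
    to answer purely from parametric memory and ignore its tools."""
    answer_reward = 1.0 if final_answer.strip().lower() == gold_answer.strip().lower() else 0.0
    tool_bonus = 0.1 if len(tool_calls) >= min_tool_calls else 0.0
    return answer_reward + tool_bonus


if __name__ == "__main__":
    print(final_output_reward("Paris", "paris", ["search: capital of France"]))  # 1.1
    print(final_output_reward("Lyon", "Paris", []))                              # 0.0
```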
T1: Agent-Agnostic Tool Training
This approach freezes the main agent and optimizes the tools against their own, agent-independent objectives, measured by metrics such as retrieval accuracy.
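For example, a retriever trained as a standalone tool can be evaluated with recall@k, with no agent in the loop. The toy relevance labels and ranked list below are invented for illustration.

```python
from typing import List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Agent-agnostic tool metric: fraction of the relevant documents that
    appear in the retriever's top-k results, computed with no agent in the loop."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


if __name__ == "__main__":
    relevant_docs = {"d2", "d7"}                    # gold labels for one query
    ranked = ["d2", "d9", "d7", "d1", "d3"]         # output of the retriever being trained
    print(recall_at_k(ranked, relevant_docs, k=3))  # 1.0: both relevant docs are in the top 3
```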
T2: Tools Optimized Under a Frozen Agent
In T2, a powerful, frozen agent provides the learning signal for the tool: the tool is optimized so that the fixed agent, consuming its outputs, succeeds at the task. This setup has been applied in recent systems such as s3 and AgentFlow.
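A sketch of how that signal might flow, assuming a binary reward on the frozen agent's final answer; the lambda-based retriever and agent are placeholders, and this is not the actual s3 or AgentFlow implementation.

```python
from typing import Callable, List


def agent_supervised_tool_reward(query: str,
                                 retriever: Callable[[str], List[str]],
                                 frozen_agent: Callable[[str, List[str]], str],
                                 gold_answer: str) -> float:
    """T2-style signal: the tool (here a retriever) is scored by whether a
    frozen agent, reading the tool's output, reaches the correct final answer.
    The agent's weights never change; only the tool is being optimized."""
    docs = retriever(query)
    answer = frozen_agent(query, docs)
    return 1.0 if answer.strip().lower() == gold_answer.strip().lower() else 0.0


if __name__ == "__main__":
    # Stand-ins for a real retriever and a frozen foundation-model agent.
    retriever = lambda q: ["Paris is the capital of France."]
    frozen_agent = lambda q, docs: "Paris" if any("Paris" in d for d in docs) else "unknown"
    print(agent_supervised_tool_reward("What is the capital of France?",
                                       retriever, frozen_agent, "Paris"))  # 1.0
```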
Key Takeaways
- The research presents a structured framework for adapting agentic AI systems using various adaptation paradigms.
- A1 methods learn from tool-execution feedback, while A2 methods learn from the agent's final outputs; the two target the same agent but draw on different supervision signals.
- T1 and T2 shift the learning focus from the agent to its tools and memory, which can improve robustness and scalability in practical AI applications.