AI Agents: Navigating Hype and Reality for True Digital Collaboration
Google's new AI agents show promise in digital collaboration but face challenges like unreliable outputs and coordination issues. Clear definitions and protocols are essential for their future success.
Google's Vision for Agentic Experiences
At its I/O 2025 event, Google introduced what it calls a “new class of agentic experiences.” The highlight was a digital assistant capable of not just answering questions but actively assisting with tasks such as bicycle repair by locating manuals, finding tutorials, and even calling stores to inquire about parts with minimal human input. This innovation promises to extend well beyond Google's own ecosystem through the introduction of the Agent-to-Agent (A2A) open standard, enabling different agents from various companies to communicate and collaborate.
The Promise and Risks of AI Agents
The concept of intelligent software agents acting as digital coworkers is compelling: booking flights, scheduling meetings, filing expenses, and collaborating with one another behind the scenes. However, there is a significant risk that the hype around these agents will outpace their actual capabilities, leading to user disappointment and backlash.
Clarifying the Term "Agent"
Currently, the term “agent” is applied broadly and inconsistently, ranging from simple scripts to complex AI workflows. This lack of a shared definition allows companies to market basic automation as advanced AI agents, causing confusion and unrealistic expectations. Clearer standards or guidelines are needed that define what these agents can do, how autonomously they operate, and how reliably they perform.
Challenges with Reliability and Large Language Models
Most AI agents today depend on large language models (LLMs) that generate probabilistic responses. While powerful, these models can be unpredictable, fabricate information, or fail subtly, especially with complex tasks involving multiple steps and external tools. For instance, a popular AI programming assistant’s automated support erroneously told users about a non-existent device restriction policy, resulting in user dissatisfaction.
Building Robust Systems Around LLMs
In enterprise environments, such mistakes could be costly. Treating LLMs as standalone products is insufficient; instead, comprehensive systems are needed that manage uncertainty, monitor outputs, control costs, and enforce safety and accuracy guardrails. AI21 Labs, for example, is developing structured architectures that combine language models with company data and tools to produce dependable outputs; its recently released product, Maestro, exemplifies this approach to enterprise-grade reliability.
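To make the idea concrete, here is a minimal sketch, in Python, of the kind of guardrail layer described above: the model's output is validated before anything acts on it, retried a bounded number of times, and escalated to a human when it cannot be verified. The function names, the allowed-action list, and the stubbed model call are illustrative assumptions, not any particular vendor's architecture.

```python
# Minimal sketch of wrapping a probabilistic LLM call with guardrails.
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = {"refund", "escalate", "answer"}

@dataclass
class AgentDecision:
    action: str
    rationale: str

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; a production system would invoke an
    # LLM API here. Returns a JSON string, which may or may not be valid.
    return json.dumps({"action": "answer", "rationale": "Policy section 3.2 applies."})

def validate(raw: str) -> Optional[AgentDecision]:
    # Reject malformed or out-of-policy outputs instead of trusting them.
    try:
        decision = AgentDecision(**json.loads(raw))
    except (json.JSONDecodeError, TypeError):
        return None
    if decision.action not in ALLOWED_ACTIONS:
        return None
    return decision

def guarded_decision(prompt: str, max_retries: int = 2) -> AgentDecision:
    for _ in range(max_retries + 1):
        decision = validate(call_llm(prompt))
        if decision is not None:
            return decision
    # No verifiable output after retries: hand off to a person instead of guessing.
    return AgentDecision(action="escalate", rationale="Could not verify model output.")

if __name__ == "__main__":
    print(guarded_decision("A customer asks whether multi-device login is allowed."))
```

The point is structural: the surrounding system, not the model itself, decides when an answer is trustworthy enough to act on.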
The Importance of Agent Cooperation
For AI agents to be truly effective, they must cooperate seamlessly without constant human oversight. Google's A2A protocol aims to enable this by providing a universal communication standard among agents. However, the current protocol only defines communication mechanics, not shared meanings or contexts. This absence of common semantics can lead to fragile coordination and misunderstandings.
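The gap between shared syntax and shared semantics can be illustrated with a small, hypothetical exchange. The sketch below is not the A2A specification; the message fields, agent names, and currency assumptions are invented purely to show how two agents can parse the same message correctly and still mean different things by it.

```python
# Illustrative only: shared message structure without shared meaning.
from dataclasses import dataclass

@dataclass
class AgentMessage:
    sender: str
    intent: str
    payload: dict  # well-formed structure, but field meanings are not standardized

def travel_agent_request() -> AgentMessage:
    # The sender means "budget" in US dollars, but nothing in the message says so.
    return AgentMessage(sender="travel-agent", intent="request_quote",
                        payload={"route": "SFO-JFK", "budget": 400})

def airline_agent_handle(msg: AgentMessage) -> AgentMessage:
    # The receiver assumes "budget" is in euros: the message parses cleanly,
    # yet the two sides silently disagree about what was actually requested.
    budget_eur = msg.payload["budget"]
    quote = {"route": msg.payload["route"], "price_eur": min(budget_eur, 380)}
    return AgentMessage(sender="airline-agent", intent="quote", payload=quote)

if __name__ == "__main__":
    reply = airline_agent_handle(travel_agent_request())
    print(reply)  # syntactically valid, semantically misaligned
```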
Real-World Challenges in Multi-Agent Ecosystems
Another challenge is that agents represent diverse entities—vendors, customers, competitors—with potentially conflicting incentives. For example, a travel agent requesting quotes from an airline booking agent might not receive unbiased results if the airline agent favors certain options. Aligning incentives through contracts or game-theoretic mechanisms is essential for genuine collaboration.
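One concrete example of such a game-theoretic mechanism is a second-price (Vickrey) reverse auction: the lowest quote wins but is paid the second-lowest price, which makes truthful quoting the dominant strategy for each bidding agent. The sketch below is a toy illustration of that idea, with invented agent names and prices.

```python
def second_price_auction(quotes: dict[str, float]) -> tuple[str, float]:
    # The lowest quote wins, but the winner is paid the second-lowest quote.
    # Quoting below true cost risks winning at a losing price; quoting above
    # true cost risks losing a profitable sale, so honesty is the best policy.
    ordered = sorted(quotes.items(), key=lambda kv: kv[1])
    winner = ordered[0][0]
    price_paid = ordered[1][1] if len(ordered) > 1 else ordered[0][1]
    return winner, price_paid

if __name__ == "__main__":
    # Hypothetical quotes from three airline agents, each reporting true cost.
    quotes = {"airline-a": 310.0, "airline-b": 295.0, "airline-c": 330.0}
    winner, price = second_price_auction(quotes)
    print(f"{winner} wins the booking and is paid {price:.2f}")
```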
Looking Ahead
These challenges are solvable through the development of shared semantics, evolving protocols, and teaching agents negotiation skills. However, ignoring these issues risks relegating the term “agent” to just another overhyped buzzword. The excitement around AI agents should not overshadow the need for thoughtful design, clear definitions, and realistic expectations to unlock their true potential as the backbone of digital productivity.
About the Author
Yoav Shoham is a professor emeritus at Stanford University and cofounder of AI21 Labs. He is recognized for his foundational work in agent-oriented programming and multi-agent systems.