OpenAI Unveils ChatGPT Agent: Revolutionizing Autonomous AI Task Automation

OpenAI has introduced ChatGPT Agent, transforming ChatGPT into an autonomous AI capable of executing complex multi-step tasks such as browsing, coding, and data analysis within a unified platform.

Transforming ChatGPT into an Autonomous Agent

On July 17, 2025, OpenAI launched ChatGPT Agent, evolving ChatGPT from a conversational AI assistant into a powerful unified agent capable of autonomously completing complex, multi-step tasks. This includes everything from web browsing to code execution, all within a virtual computer environment.

Integrating Past Tools for Enhanced Functionality

ChatGPT Agent builds upon two previous technologies: Operator, a browser-based agent that allowed limited web interactions such as clicking and scrolling, and Deep Research, which provided autonomous browsing and report synthesis over extended periods. Operator could interact with pages but lacked deep analysis, whereas Deep Research could analyze but not interact dynamically. ChatGPT Agent combines these strengths, uniting browsing, tool use, and reasoning in one cohesive architecture.

Internal Architecture and Continuous Adaptation

At its core, ChatGPT Agent operates within a virtual computer environment that includes a visual browser for standard websites, a text browser optimized for structured reasoning, a shell/terminal for code execution, and integrated API connectors for services such as Gmail and GitHub. The agent continuously decides when to click buttons, run scripts, or parse content, maintaining state across all tools to ensure traceability and flexibility.
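The idea of several tools sharing one state can be illustrated with a minimal Python sketch. This is not OpenAI's implementation: the names AgentState, TOOLS, and run_plan are hypothetical, the four tool functions are stubs that only return strings, and a fixed plan stands in for the model's own step-by-step decisions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentState:
    """Hypothetical shared state that persists across tool calls."""
    notes: Dict[str, str] = field(default_factory=dict)          # latest result per tool
    trace: List[Tuple[str, str]] = field(default_factory=list)   # ordered log of steps


# Stub tools standing in for the four tool types named in the article.
def visual_browser(state: AgentState, arg: str) -> str:
    return f"rendered page: {arg}"


def text_browser(state: AgentState, arg: str) -> str:
    return f"extracted text: {arg}"


def terminal(state: AgentState, arg: str) -> str:
    return f"ran command: {arg}"


def api_connector(state: AgentState, arg: str) -> str:
    return f"called connector: {arg}"


TOOLS: Dict[str, Callable[[AgentState, str], str]] = {
    "visual_browser": visual_browser,
    "text_browser": text_browser,
    "terminal": terminal,
    "api_connector": api_connector,
}


def run_plan(plan: List[Tuple[str, str]], state: AgentState) -> AgentState:
    """Run each (tool_name, argument) step, recording results in the shared state."""
    for tool_name, arg in plan:
        result = TOOLS[tool_name](state, arg)
        state.notes[tool_name] = result       # later steps can read earlier results
        state.trace.append((tool_name, arg))  # the trace is what makes the run auditable
    return state


state = run_plan(
    [("text_browser", "example.com"), ("terminal", "python report.py")],
    AgentState(),
)
print(state.trace)
```

Keeping the trace in the same object as the results is one simple way to get the traceability the article mentions: every step is recorded in order, whichever tool ran it.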

Practical Applications and Use Cases

The agent is capable of handling a variety of real-world tasks, such as:

  • Calendar briefings by scanning your schedule, fetching related news, and summarizing meetings.
  • Grocery ordering by sourcing ingredients, comparing prices, and placing orders.
  • Competitive analysis by scraping competitor websites and creating presentations or spreadsheets.
  • Financial modeling by downloading data, updating spreadsheets, and preserving formatting.

These workflows demonstrate multi-modal tool usage, including logging into websites, executing terminal scripts, and packaging results into editable documents, all under user supervision.

Performance Benchmarks and Human Comparisons

OpenAI reports the following results across multiple benchmarks:

  • Humanity’s Last Exam: Pass@1 rate of 41.6%, reaching up to 44.4% with parallel trials.
  • FrontierMath: 27.4% accuracy utilizing terminal and code support.
  • SpreadsheetBench: 45.5% overall score on XLSX editing, outperforming Copilot in Excel (20%) and approaching human performance (~71%).
  • Internal knowledge-work benchmarks: the agent's output matches or exceeds expert performance about half the time.
  • BrowseComp and WebArena: state-of-the-art results, with 68.9% on browsing tasks.

Safety Measures and Risk Management

Acknowledging the risks inherent in autonomous agents, OpenAI has implemented several safeguards:

  • Explicit user confirmation before critical actions such as purchases or posts.
  • Watch Mode for sensitive tasks requiring active supervision.
  • Robust defenses against prompt injection, including anomaly detection and tool output monitoring.
  • Privacy protections with session-specific takeover mode and no storage of sensitive inputs.
  • Enhanced biological threat measures involving threat modeling, refusal training, live monitoring, and bug bounty programs.
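The first safeguard, explicit confirmation before critical actions, follows a familiar gating pattern. The Python sketch below is an illustration under stated assumptions, not OpenAI's mechanism: CRITICAL_ACTIONS and execute_action are hypothetical names, and the confirm callback stands in for whatever prompt the user actually sees.

```python
from typing import Callable

# Action kinds treated as critical; "purchase" and "post" come from the article's
# examples, and the set itself is a hypothetical illustration.
CRITICAL_ACTIONS = {"purchase", "post"}


def execute_action(kind: str, description: str, confirm: Callable[[str], bool]) -> str:
    """Run an action, asking the user first when the action kind is critical."""
    if kind in CRITICAL_ACTIONS and not confirm(description):
        return "cancelled"
    # A real agent would perform the action here; this sketch only reports it.
    return "executed"


# Example: the user declines a purchase but is never asked about a scroll.
decline = lambda description: False
print(execute_action("purchase", "order groceries", decline))
print(execute_action("scroll", "scroll results page", decline))
```

Passing the confirmation step in as a callback keeps the gate independent of the interface, so the same check could sit behind a chat prompt, a button, or a test stub.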

Access and Getting Started

ChatGPT Agent is currently available to ChatGPT Pro, Plus, and Team users. Pro users have immediate access with 400 agent-mode messages per month, while Plus and Team users will receive gradual access with 40 messages per month. Enterprise and Education tiers will follow soon. The rollout is also expanding outside the U.S. to regions including the EEA and Switzerland. Users can activate Agent Mode via the tools menu in any conversation, specify their desired workflow, and monitor progress in real time, with the ability to pause or stop at any point.

Implications for AI-Augmented Workflows

This release marks a significant step from passive conversational AI towards proactive digital workers. By combining advanced language reasoning, tool orchestration, and context-preserving execution, OpenAI is enabling more autonomous, reliable, and action-oriented AI applications. For developers and data scientists, ChatGPT Agent offers a programmable, observable platform for scraping, parsing, synthesizing, and exporting data on demand, paving the way for next-generation workflows in research, business automation, and personal productivity.
