
OpenAI's New ChatGPT Agent

Jul 18th 2025 by John Ashley

AI-created, human-reviewed.

OpenAI has just unveiled its most ambitious AI project yet: ChatGPT Agent, a revolutionary tool that promises to transform how we interact with technology by giving artificial intelligence the ability to control our computers autonomously. As Tech News Weekly host Mikah Sargent noted during breaking coverage of the announcement, this represents OpenAI's "most ambitious step yet into the age of agentic AI."

The new ChatGPT Agent goes far beyond traditional text generation, offering users the ability to delegate complex, multi-step tasks to an AI system that can navigate websites, access files, run terminal commands, and even automate routine workflows. According to OpenAI's demonstration, users can now ask their AI assistant to review calendars, create presentations from scratch, summarize documents from Gmail or Google Drive, and even plan and book dinner reservations.

How ChatGPT Agent Works

The system operates through ChatGPT's familiar natural language interface, where users can type a command such as "/agent" or switch on agent mode in the chat. However, as Sargent pointed out, "it is a little bit slower than your other ChatGPT prompts," with tasks potentially taking 15 to 30 minutes to complete. Despite the longer processing time, OpenAI product lead Isa Fulford suggests that "even if it takes 15 minutes, half an hour, it's quite a big speed up" compared with completing these complex tasks manually.

The agent is built on reinforcement learning and has been specifically trained on workflows that require chaining multiple tools together, including browsers, terminals, and data importers. This specialized training has resulted in impressive benchmark improvements, with the agent scoring 41.6% on Humanity's Last Exam, double the performance of previous models.

Safety Measures and Risk Mitigation

Given the significant capabilities of an AI system that can control computers, OpenAI has implemented several safety measures that Sargent described as "measures in place to mitigate risk because of the fact that it has high capability." The company has built in multiple safeguards to prevent misuse and protect user privacy.

The agent pauses before executing any irreversible actions, such as making purchases or sending emails, requiring explicit user confirmation. If a user leaves the tab during sensitive tasks involving finances, the agent automatically stops its operations. Additionally, the system is limited to specific requests and includes no open terminal access, while maintaining a policy of no long-term data retention.
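The confirmation and stop rules described above can be illustrated with a small sketch. This is not OpenAI's implementation; the `Action` type, the `IRREVERSIBLE` set, and the `run_action` function are hypothetical names invented here, and the only behavior modeled is what the article states: pause for confirmation before irreversible actions, and stop finance-related tasks when the user leaves the tab.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical categories: the article names purchases and emails as examples
# of irreversible actions. Real systems would classify actions differently.
IRREVERSIBLE = {"purchase", "send_email"}


@dataclass
class Action:
    kind: str                      # e.g. "browse", "purchase", "send_email"
    description: str               # human-readable summary shown to the user
    involves_finances: bool = False


def run_action(
    action: Action,
    confirm: Callable[[Action], bool],
    tab_is_active: bool = True,
) -> str:
    """Decide what happens to one action: 'stopped', 'declined', or 'executed'."""
    # Rule 1 (assumed ordering): finance-related work stops if the user
    # is no longer watching the tab.
    if action.involves_finances and not tab_is_active:
        return "stopped"
    # Rule 2: irreversible actions need explicit user confirmation.
    if action.kind in IRREVERSIBLE and not confirm(action):
        return "declined"
    # Everything else proceeds without interruption.
    return "executed"


if __name__ == "__main__":
    always_no = lambda a: False
    print(run_action(Action("browse", "Open restaurant site"), always_no))
    print(run_action(Action("send_email", "Email the summary"), always_no))
```

In this sketch the confirmation step is passed in as a callback, so the same gate could be driven by a chat prompt, a dialog box, or an automated test.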

These safety features reflect what OpenAI calls "a precautionary approach" consistent with "high capability models," a new phrase the company is using to describe these advanced AI systems. The training process specifically focused on prompt injection defense, sensitive task confirmation, and privacy-aware behaviors to ensure the agent doesn't exfiltrate or infer private information.

The Broader Context of Agentic AI

OpenAI's ChatGPT Agent enters a competitive landscape where major tech companies are racing to develop agentic AI systems. As Sargent noted, "this isn't the only group that's working on this. We've seen Google, we've seen Perplexity, we've seen Anthropic. Even Klarna have been building agentic systems." The mention of Klarna is particularly noteworthy, as the company's AI agent reportedly handled work equivalent to that of 700 human agents in a single month.

What sets OpenAI's approach apart is its general-purpose nature. While many existing agentic models have been task-focused, designed for specific functions, ChatGPT Agent offers "combined access to apps, to terminals, and this novel model that's been trained specifically for this, so that it's all working together to provide that experience that maybe people have been thinking about when they think about AI working for this purpose."

Availability and Pricing

ChatGPT Agent is currently available to ChatGPT Pro, Plus, and Team users, positioning it as a premium feature for OpenAI's subscription tiers. This release strategy suggests the company is targeting both individual power users and business customers who can benefit from automated workflow management.

Questions and Concerns

Despite the impressive capabilities, significant questions remain about the long-term viability and safety of giving AI systems partial autonomy over computer operations. As Sargent observed, "the first time one goes and buys something that I did not want it to buy, or books, I don't know, a plane ticket instead of a ride share and costs me a bunch of money, it's going to be a long time before I ever want to use that thing again and ever trust that again."

The security implications are also substantial. Even in a sandboxed environment, giving AI tools this level of access to a computer raises concerns that users and security experts will need to weigh carefully. The balance between capability and safety remains a critical challenge as these systems become more powerful and autonomous.

Looking Forward

OpenAI's ChatGPT Agent represents a significant milestone in the evolution of AI assistants, moving from simple text generation to complex task automation. While the technology shows remarkable promise for improving productivity and streamlining workflows, its success will ultimately depend on user adoption, safety validation, and the company's ability to maintain robust security measures as the system evolves.

The launch of ChatGPT Agent signals that the era of truly autonomous AI assistants has begun, but as with any transformative technology, careful consideration of its implications will be essential as it becomes more widely adopted.


Tech News Weekly #396
Jul 17 2025 - OpenAI’s New ChatGPT Agent Is Here
OpenAI Unveils Its ChatGPT Agent
