OpenAI just launched its new ChatGPT Agent that can make as many as 1 complicated cupcake order per hour, but even Sam Altman says you probably shouldn't trust it for 'high-stakes uses'

OpenAI launched ChatGPT Agent on Thursday, its latest effort in the industry-wide push to turn AI into a profitable enterprise, not just one that eats investors' billions. In its announcement blog post, OpenAI says its Agent "can now do work for you using its own computer," but CEO Sam Altman warns that the rollout presents unpredictable risks.

AI agents are machine learning tools intended to perform complex, multi-step tasks, and they've been the latest landmark in the AI arms race for competitors like Google and Microsoft. In prerelease demos for Wired and The Verge, OpenAI presenters used ChatGPT Agent to automate calendar planning and create financial presentations.

By blending its earlier Operator and deep research agentic models, OpenAI says Agent can perform "complex tasks from start to finish." According to OpenAI spokespeople, those tasks typically take Agent 10 or 15 minutes, while more complicated assignments take the tool longer to complete.

OpenAI research lead Isa Fulford told Wired that she used Agent to order "a lot of cupcakes," which took the tool about an hour because she was very specific about the cupcakes.

"It was easier than me doing it myself," Fulford said, "because I didn't want to do it."

While the potential cupcake timesavings alone are functionally infinite, Altman took to X today to warn that using Agent could present some considerable dangers—the extent of which OpenAI is apparently content to let its users figure out.

"I would explain this to my own family as cutting edge and experimental; a chance to try the future," Altman said, "but not something I’d yet use for high-stakes uses or with a lot of personal information until we have a chance to study and improve it in the wild."

Inspiring the opposite of confidence, Altman said that "bad actors may try to 'trick' users' AI agents into giving private information they shouldn't and take actions they shouldn't, in ways we can't predict." I'm not sure what utility putting those quote marks around "trick" in his X post provides, but I'm admittedly not a tech visionary.

Altman said giving Agent more than "the minimum access required," or giving it carte blanche to answer all your emails with no questions asked, could expose vulnerabilities for malicious actors to exploit. To mitigate those hazards, Altman said OpenAI has "built a lot of safeguards and warnings," but noted that the company "can't anticipate everything."

"In the spirit of iterative deployment, we are going to warn users heavily and give users freedom to take actions carefully if they want to," Altman said.

Personally, I would encourage any interested users to want to. Earlier this year, the president of encrypted messaging app Signal warned about the security risks of 'agentic' AI and how much personal data it will require access to. "There's no model to do that encrypted," Meredith Whittaker said in an interview at SXSW.

Worth a watch: Head of Signal, Meredith Whittaker, on so-called "agentic AI" and the difference between how it's described in the marketing and what access and control it would actually require to work as advertised.

Posted by @keithfitzgerald.bsky.social, 2025-07-17

"There's a profound issue with security and privacy that is haunting this sort of hype around agents, and that is ultimately threatening to break the blood-brain barrier between the application layer and the OS layer by conjoining all these separate services, muddying their data," Whittaker continued. "Because hey, the agent's got to get in, text your friends, pull the data out of your texts and summarize that so that your brain can sit in a jar and you're not doing any of that yourself."

OpenAI says Agent is trained to require permission before "taking actions with real-world consequences, like making a purchase"—which is good to know, but I can't help but wonder how narrow the definition of "real-world consequences" is there. Are there real-world consequences if Agent plans a shitty date itinerary?

Likewise, certain "critical tasks" like sending emails will require the user to actively supervise Agent's work. It's also trained to refuse potentially catastrophic tasks like bank transfers or other financial activities.

OpenAI also makes sure to note that it doesn't "have definitive evidence that the model could meaningfully help a novice create severe biological harm." So, you know. That's good.

ChatGPT Agent is available now for Pro users, while Plus and Team users will receive access in the next few days. I'm sure it'll be fine.

News Writer

Lincoln has been writing about games for 11 years, unless you include the essays about procedural storytelling in Dwarf Fortress he convinced his college professors to accept. Leveraging the brainworms from a youth spent in World of Warcraft to write for sites like Waypoint, Polygon, and Fanbyte, Lincoln spent three years freelancing for PC Gamer before joining as a full-time News Writer in 2024, bringing an expertise in Caves of Qud bird diplomacy, getting sons killed in Crusader Kings, and hitting dinosaurs with hammers in Monster Hunter.
