ChatGPT agent: bridging research and action

https://openai.com/index/introducing-chatgpt-agent/

"ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish.

You can now ask ChatGPT to handle requests like “look at my calendar and brief me on upcoming client meetings based on recent news,” “plan and buy ingredients to make Japanese breakfast for four,” and “analyze three competitors and create a slide deck.”

ChatGPT will intelligently navigate websites, filter results, prompt you to log in securely when needed, run code, conduct analysis, and even deliver editable slideshows and spreadsheets that summarize its findings."

Yeah - but can it do it on a cold, rainy night in Stoke?

OpenAI’s ChatGPT Agent Goes Full ‘Jarvis’
(Source: Daily Tech Insider newsletter - TechRepublic)

OpenAI has launched ChatGPT Agent, a unified system that fuses the Operator browser, deep-research mode, and classic ChatGPT into one task-running powerhouse. Pro subscribers get it first, with a 400-message monthly allowance; Plus and Team tiers follow with 40 each month. [Note: NO additional fee!]

The agent can click around the web (in its own visual or text-only browsers), run code in a terminal, call APIs, and churn out editable slides and spreadsheets.

Safety measures include pausing for clarification, asking permission before “consequential” actions, and switching to Watch Mode when sending emails or making purchases. A one-click privacy control wipes browsing data and logs the agent out of every active site session, and new guardrails aim to spot prompt-injection attacks.
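
To make the "ask permission before consequential actions" idea concrete, here is a minimal Python sketch of a confirmation gate. It is not OpenAI's implementation: the action names, the `CONSEQUENTIAL` set, and the `run_action` helper are all hypothetical, chosen only to illustrate the pattern described above.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds treated as "consequential".
# The real agent's categories are not documented in this post.
CONSEQUENTIAL = {"send_email", "make_purchase"}


@dataclass
class Action:
    kind: str          # e.g. "browse", "send_email" (illustrative names)
    description: str   # human-readable summary shown to the user


def run_action(action: Action, confirm: Callable[[Action], bool]) -> str:
    """Run an action, asking the user first if it is consequential.

    `confirm` is a callback that returns True if the user approves.
    Returns "executed" or "skipped"; a real agent would do the work here.
    """
    if action.kind in CONSEQUENTIAL and not confirm(action):
        return "skipped"
    return "executed"
```

In this sketch, browsing goes ahead without a prompt, while a purchase runs only if the `confirm` callback returns True. A real system would presumably also need the clarification pauses and supervised Watch Mode the newsletter mentions, which are not modeled here.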

OpenAI claims the new model scored record highs on several tough benchmarks—real-world job tasks, advanced math problems, and spreadsheet editing—and matched or beat human test-takers about 50% of the time.

Early users still report latency (it took nearly an hour to order a large cupcake shipment) and note that slideshow formatting feels “beta.” Access for Enterprise/Education users is coming later this summer, while the European Economic Area has no set deadline yet.

Why it matters: If it works as billed, watching a bot book flights, update sheets, and draft decks could turn AI from chat sidekick into full-fledged digital colleague, saving clicks while adding fresh security questions (and a handy scapegoat when the cupcakes show up late).