ChatGPT's Agent Just Got Scary Good
Hey, Josh here. ChatGPT just levelled up big time, and here's what we think about it.
ChatGPT Just Got Scary Good (And I'm Not Sure How to Feel About It)
Remember when ChatGPT could only chat? Those days are over.
OpenAI just dropped something that makes their old chatbot look like a calculator. They call it "Agent Mode." But here's what it really is: ChatGPT now has its own virtual computer. And it knows how to use it.
What Does That Actually Mean?
Picture this. You tell ChatGPT: "Book me a dinner reservation for Friday night somewhere nice."
Old ChatGPT would give you a list of restaurants and tell you to call them yourself.
New ChatGPT opens a browser. Searches restaurants. Checks your calendar. Finds available tables. Makes the reservation. Done.
You didn't lift a finger.
That's not the scary part yet.
It's Watching You (In a Good Way?)
The agent doesn't just browse random websites. It can log into your accounts. Read your emails. Check your calendar. Access your files.
OpenAI says they built safeguards. The system asks before doing anything "irreversible." You can stop it anytime. But still - an AI that can click around your computer like it owns the place? That's new territory.
Here's what else it can do:
Fill out web forms
Run code in a terminal
Create entire presentations
Analyze spreadsheets
Send emails
Book flights
The list goes on.
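To make the "asks before doing anything irreversible" idea concrete, here's a toy sketch of what a confirmation gate could look like. To be clear, this is not OpenAI's code: the `Action` class, the `irreversible` flag, and the `confirm` callback are all made up for illustration, and how the real agent decides what counts as irreversible isn't public.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    name: str
    irreversible: bool  # hypothetical flag, invented for this sketch


def run_action(action: Action, confirm: Callable[[str], bool]) -> str:
    """Run one action, pausing for approval first if it can't be undone."""
    if action.irreversible and not confirm(action.name):
        return f"skipped: {action.name}"
    return f"done: {action.name}"


def run_plan(plan: List[Action], confirm: Callable[[str], bool]) -> List[str]:
    """Run every action in order and collect what happened."""
    return [run_action(action, confirm) for action in plan]


plan = [
    Action("analyze spreadsheet", irreversible=False),
    Action("send email", irreversible=True),
    Action("book flight", irreversible=True),
]

# Stand-in for the user: approve the email, decline the flight.
results = run_plan(plan, confirm=lambda name: name == "send email")
print(results)
```

The spreadsheet analysis runs without asking, the email goes out only because it was approved, and the flight never gets booked. The point of the pattern is that the human stays in the loop for anything that can't be taken back.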
The Numbers Don't Lie
OpenAI tested this thing against other AI systems. The results are wild.
On something called "Humanity's Last Exam" (sounds ominous, right?), ChatGPT Agent scored 41.6%. That might not sound impressive until you learn that previous models barely cracked 20%.
On Excel tasks, it more than doubled the score of Microsoft's own Copilot: 45.5% vs 20%.
But here's the kicker - in head-to-head tests, people rated ChatGPT's agent better than Google's and Anthropic's versions. It's not just good. It's winning.
The Catch (There's Always a Catch)
You can't just spam this thing all day. Usage is limited.
Pro subscribers get about 400 agent tasks per month. Plus and Team users get 40. That's it.
Why the limit? Because running an AI agent is expensive. Way more expensive than regular ChatGPT. Each task burns through serious computing power.
Also, it's slow. Tasks take 10-15 minutes on average. Sometimes longer. You're not getting instant results here.
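If you want to see what those limits mean in practice, here's a quick back-of-the-envelope calculation. It uses only the numbers above (400 and 40 tasks a month, 10 to 15 minutes per task). The 30-day month is my own assumption, and the function names are just for illustration.

```python
# Figures quoted in this post; the 30-day month is an assumption.
PRO_TASKS_PER_MONTH = 400
PLUS_TASKS_PER_MONTH = 40
MINUTES_PER_TASK = (10, 15)  # low and high end of the quoted average
DAYS_PER_MONTH = 30


def tasks_per_day(monthly_quota: int, days: int = DAYS_PER_MONTH) -> float:
    """Average number of tasks you can run per day without running out."""
    return monthly_quota / days


def monthly_agent_hours(monthly_quota: int, minutes_per_task: int) -> float:
    """Total hours the agent would spend working if you used the whole quota."""
    return monthly_quota * minutes_per_task / 60


pro_daily = round(tasks_per_day(PRO_TASKS_PER_MONTH), 1)
plus_daily = round(tasks_per_day(PLUS_TASKS_PER_MONTH), 1)
pro_hours = tuple(monthly_agent_hours(PRO_TASKS_PER_MONTH, m) for m in MINUTES_PER_TASK)
plus_hours = tuple(monthly_agent_hours(PLUS_TASKS_PER_MONTH, m) for m in MINUTES_PER_TASK)

print(f"Pro:  about {pro_daily} tasks/day, {pro_hours[0]:.0f} to {pro_hours[1]:.0f} agent-hours/month")
print(f"Plus: about {plus_daily} tasks/day, {plus_hours[0]:.0f} to {plus_hours[1]:.0f} agent-hours/month")
```

So Pro works out to roughly 13 tasks a day, while Plus and Team get a bit more than one. Pick your tasks accordingly.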
The Real Question: Should You Care?
Here's where it gets interesting.
This isn't just a chatbot upgrade. It's OpenAI betting their future on AI agents. They're saying the next phase isn't better conversations - it's AI that actually gets stuff done.
Think about your typical workday. How much time do you spend on repetitive tasks? Switching between apps? Copy-pasting data? Formatting reports?
What if an AI could handle all that boring stuff while you focus on the work that matters?
That's the promise. But it's still early days.
The Dark Side
Not everyone's excited about this.
Some experts warn about "agentwashing" - overhyping what these tools can actually do. Others worry about giving AI too much access to our digital lives.
And honestly? The idea of an AI browsing my computer while I'm not watching makes me a bit uncomfortable. Even with safeguards.
Plus there's the job question. If AI can handle junior analyst work (and the benchmarks suggest it can), what happens to junior analysts?
Here's What I Think
ChatGPT Agent Mode is impressive. Maybe too impressive.
The technology works. The results speak for themselves. But we're moving fast into territory nobody fully understands yet.
Will it make us more productive? Probably.
Will it replace some jobs? Definitely.
Will it create new problems we haven't thought of? Almost certainly.
The thing is, this train isn't stopping. OpenAI isn't the only company building AI agents. Google has them. Anthropic has them. Every tech company is racing to build better ones.
So maybe the question isn't whether AI agents are good or bad. Maybe it's: how do we learn to work with them before they learn to work without us?
Agent Mode is rolling out now to paid users. Free users are out of luck (for now).
If you get access, try it. But maybe start small. Let it book a restaurant reservation before you let it manage your entire calendar.
And keep an eye on what it's doing. Because even though it's your assistant, you're still the one responsible for what it does in your name.
The future just got a lot more interesting. Whether that's good or bad depends on what we do with it.