GPT-5.4 Is Now Better at Your Computer Than You Are
OpenAI shipped GPT-5.4 this week with a 1-million-token context window and the ability to autonomously run multi-step workflows across real software environments. That alone is a big deal.
But here's the part that got my attention: they tested it on OSWorld-V, a benchmark that simulates actual desktop tasks -- the kind of stuff you do in a workday. GPT-5.4 scored 75%. The human baseline is 72.4%.
Think about what that means. Not "AI wrote some code" or "AI summarized a document." AI sat down at a virtual computer and outperformed the average person at getting real work done. That's a different category of capability.
I'm not saying panic. I'm saying pay attention. The gap between "AI can help with tasks" and "AI can do tasks" just got a lot smaller.
Google Went Open Source With Gemma 4
Google dropped Gemma 4 this week under Apache 2.0. These are open models built specifically for reasoning and agentic workflows, and they're free for anyone to use, modify, or build on.
Why does this matter? Because open models change who gets to play. When Google puts serious reasoning capability into the open, small teams and solo developers get access to the same kind of infrastructure that used to be locked behind expensive APIs.
Google's pitch is "best intelligence per parameter," and honestly, if that holds up in the wild, Gemma 4 could become a go-to for anybody building agents who doesn't want to hand every inference dollar to OpenAI or Anthropic.
This is a smart move by Google. They're not winning the closed-model race right now, so they're shifting the battlefield to open. That strategy worked for Linux. It worked for Kubernetes. It might just work here too.
AI Is Hunting Zero-Days and Banks Are Getting Nervous
Anthropic previewed Claude Mythos this week, a model built specifically for cybersecurity. It has already found thousands of previously unknown vulnerabilities across major systems.
Let that sit for a second. A model trained to find security holes found thousands of them that humans hadn't caught. That's both impressive and genuinely unsettling, depending on who gets access to it.
And speaking of unsettling, the Bank of England issued a warning this week to financial executives about AI risk to the banking system. Their concern: a model sophisticated enough to probe financial infrastructure for weaknesses. That's not a theoretical concern anymore.
The cybersecurity angle is where AI gets complicated fast. The same capability that finds vulnerabilities to patch them can find vulnerabilities to exploit them. The difference is who's holding the keys. Right now, that's a question nobody has a clean answer to.
Keep your eyes on how Anthropic handles access controls for Mythos. That's going to tell us a lot about where this is all headed.