Lawyers Are Paying Real Money for Fake Citations
The Nebraska Supreme Court just suspended an attorney after his brief contained 57 defective citations out of 63. Twenty of those were pure hallucinations: citations to cases that flat-out don't exist.
This is not an isolated incident. U.S. courts handed out at least $145,000 in sanctions against attorneys for AI citation errors in just the first quarter of 2026. That's a bad quarter to be lazy with your research tool.
Look, I get it. These tools feel authoritative. They write in full sentences, sound confident, and give you exactly what you asked for. That's the problem. A tool that always sounds right, even when it's wrong, is dangerous in a courtroom. Verify your citations. Every single one. This is not optional.
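The "verify every citation" step can be made mechanical. Here's a minimal sketch in Python: check each citation against a trusted source before anything gets filed. The `KNOWN_CASES` set and `flag_unverified` function are hypothetical stand-ins; in practice the lookup would hit a real legal research database, not a hard-coded set.

```python
# Hypothetical stand-in for a trusted lookup source (in reality: a
# legal research database query, not a hard-coded set).
KNOWN_CASES = {
    "Smith v. Jones, 123 F.3d 456 (8th Cir. 1997)",
    "Doe v. Roe, 410 U.S. 113 (1973)",
}

def flag_unverified(citations):
    """Return every citation that did NOT match the trusted source."""
    return [c for c in citations if c not in KNOWN_CASES]

brief_citations = [
    "Smith v. Jones, 123 F.3d 456 (8th Cir. 1997)",
    "Made-Up v. Case, 999 F.4th 1 (Neb. 2025)",  # a hallucination
]
print(flag_unverified(brief_citations))
```

The point isn't the code, it's the workflow: anything the checker flags gets pulled and verified by a human before the brief goes out the door.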
Wall Street AI Still Can't Write a Client Email
A new benchmark tested the top models, including GPT-5.4 and Claude Opus 4.6, on tasks that junior investment bankers handle every day. The result? Not a single AI output was rated as ready to send to a client.
That's a remarkable finding given how capable these models are on standard benchmarks. The gap between "impressive demo" and "professional output I'd put my name on" is still real, and it's especially real in finance where precision and context matter enormously.
This is actually useful data. It tells you what AI is good at right now, which is drafting and acceleration, not final delivery. Use it to get 80% of the way there fast, then apply your own judgment to close the gap. Don't hand the wheel over entirely.
xAI Drops a Voice Model and Flexes on the Competition
xAI launched a new flagship voice model this week that reportedly outperforms Gemini, GPT Realtime, and its own predecessor across retail, airline, and telecom workflow tests.
Voice AI is the sleeper category right now. Text interfaces get all the press, but if you're running customer support or any phone-based workflow, the quality of voice models matters a lot. xAI clearly thinks this is worth competing hard in.
The voice space is moving fast and the delta between the best and worst models is enormous when real customers are on the line. Results are what count, not benchmark charts.
GPT-5.5 Says Forget Your Old Prompts
OpenAI advised developers this week not to carry over old prompts when moving to GPT-5.5. The recommendation is to start minimal and build from scratch. Explicit role definitions, which many developers had dropped, are apparently making a comeback too.
This is a real headache if you have a production system built on carefully tuned prompts. Each model generation changes the behavior enough that your old instructions can actually work against you.
Treat each major model upgrade like a new hire. You wouldn't hand a new employee a manual written for someone else and call it done. Start with what you need, watch how the model behaves, and build from there.
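The "start minimal, build from there" approach can be sketched as a prompt you grow rule by rule. Everything below is illustrative, not OpenAI's actual guidance verbatim: the idea is that version 1 contains only a role and a task, and each added rule is a response to observed behavior, not a carryover from the old model's prompt.

```python
def build_prompt(role, task, extra_rules=()):
    """Assemble a minimal system prompt; extra_rules grows only as
    the new model's behavior demonstrates a rule is needed."""
    lines = [f"You are {role}.", f"Task: {task}"]
    lines.extend(extra_rules)
    return "\n".join(lines)

# Start bare...
v1 = build_prompt("a support assistant", "answer billing questions")

# ...and add a rule only after watching the model misbehave without it.
v2 = build_prompt("a support assistant", "answer billing questions",
                  extra_rules=["Cite the relevant policy section."])
```

Keeping the prompt as versioned, composable pieces like this also makes the next model migration cheaper: you can drop back to the bare version and re-audit each rule against the new model's behavior.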