
Tags: ai-news, deepseek, ai-healthcare, ai-infrastructure, open-source

DeepSeek Ditched Nvidia and Utah Let AI Write Prescriptions

Robert Hattala · April 6, 2026

DeepSeek just confirmed what a lot of us suspected was coming. Their V4 model is running on Huawei Ascend chips. Not Nvidia. That's a big deal, and I don't think enough people are talking about it.

The Nvidia Monopoly Just Got Its First Real Crack

Every frontier AI model you've used this year runs on Nvidia hardware. GPT-5, Claude, Gemini, all of them. DeepSeek V4 is the first competitive model that doesn't need a single Nvidia GPU. It's a trillion-parameter mixture-of-experts beast with only about 37 billion parameters active per response, running entirely on Huawei's Ascend 950PR silicon.
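That "trillion parameters, 37 billion active" gap is what mixture-of-experts routing buys you: a gate picks a few experts per token, and only those experts' weights run. Here's a minimal sketch of top-k routing; the expert count, dimensions, and gate are toy values, not DeepSeek's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 64      # hypothetical expert count, not DeepSeek's real config
top_k = 2           # experts activated per token
d_model = 128

def route(token_vec, gate_weights):
    """Pick the top-k experts for one token via a learned gate."""
    logits = gate_weights @ token_vec          # one score per expert
    chosen = np.argsort(logits)[-top_k:]       # indices of the top-k experts
    weights = np.exp(logits[chosen])
    return chosen, weights / weights.sum()     # softmax over chosen experts only

gate = rng.normal(size=(n_experts, d_model))
token = rng.normal(size=d_model)

experts, mix = route(token, gate)
# Only top_k / n_experts of the expert weights run for this token -- the
# mechanism (at vastly larger scale) behind "1T total, ~37B active".
```

The routing decision is per token, so different tokens light up different experts while the total compute per token stays a small fraction of the full model.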

Reuters confirmed the Huawei chip detail on April 4th. This matters for two reasons. First, it proves you can train frontier models without Nvidia. Second, it changes the entire calculus around export controls. The US strategy of choking China's AI progress through chip restrictions just hit a wall.

For builders and investors, the takeaway is simple. The GPU monoculture is ending. Competition in AI silicon is about to get very real.

Utah Is Letting AI Renew Your Prescriptions

Utah became the first state to let an AI system autonomously renew prescriptions for chronic conditions. A company called Doctronic built the agent. It covers 192 drugs for things like hypertension, diabetes, and depression.

Before you panic, there are guardrails. Human doctors review the first 250 patients. Complex cases get kicked to clinicians automatically. Patient data can't be repurposed. It's a sandbox, not a free-for-all.
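The interesting design question is the escalation logic: when does the agent keep a renewal, and when does it punt to a human? Here's a hypothetical triage sketch. The rule names, thresholds, and condition list are mine for illustration, not Doctronic's actual system.

```python
# Conditions the hypothetical agent may handle autonomously.
AUTONOMOUS_OK = {"hypertension", "diabetes", "depression"}

def triage(condition: str, months_stable: int, flags: list[str]) -> str:
    """Route a renewal request: routine cases stay with the agent,
    anything complex escalates to a human clinician."""
    if condition not in AUTONOMOUS_OK:
        return "clinician"
    if months_stable < 6 or flags:   # recent dose changes, lab alerts, etc.
        return "clinician"
    return "agent"

routine = triage("hypertension", 12, [])
flagged = triage("diabetes", 2, ["a1c_alert"])
```

The point of a sandbox like Utah's is exactly this shape: a conservative allowlist plus automatic escalation, so the failure mode is "a human looks at it" rather than "the agent guesses."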

But still. An AI agent is now making medical decisions that affect real patients in a real state. This is not a research paper or a demo. It's production healthcare. If this pilot goes well, expect ten more states to follow by year end.

The question isn't whether AI will practice medicine. It's whether the regulatory frameworks can keep up with how fast it's already happening.

Google's TurboQuant Makes AI Remember More With Less

Google dropped a paper at ICLR 2026 called TurboQuant. It's a compression algorithm that shrinks the memory footprint of large language models by at least 6x with zero accuracy loss. No retraining required.

The technical version is that it uses vector quantization to compress the KV cache, which is the thing that lets models remember context during a conversation. The practical version is that models can now hold much longer conversations and process bigger documents without needing more hardware.
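To make "vector quantization of the KV cache" concrete, here's a toy sketch: replace each cached vector with a one-byte index into a small learned codebook. This is plain lossy k-means, not TurboQuant's actual algorithm (which gets its 6x with no accuracy loss through more sophisticated machinery), but it shows where the memory savings come from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "KV cache": 1,024 cached vectors of dimension 64, stored as float32.
kv = rng.normal(size=(1024, 64)).astype(np.float32)

# Learn a 256-entry codebook with a few rounds of plain k-means.
n_codes = 256
codebook = kv[rng.choice(len(kv), n_codes, replace=False)].copy()
for _ in range(10):
    dists = ((kv[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(1)
    for c in range(n_codes):
        members = kv[assign == c]
        if len(members):
            codebook[c] = members.mean(0)

# Compressed cache: one uint8 index per vector, plus the shared codebook.
codes = assign.astype(np.uint8)
ratio = kv.nbytes / (codes.nbytes + codebook.nbytes)
```

Each 256-byte vector collapses to a 1-byte index; the codebook is shared across the whole cache, so its cost amortizes away as the cache grows. That's the basic trade the real work refines: keep the codebook cheap, keep the reconstruction faithful.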

This is the kind of boring infrastructure breakthrough that actually changes what's possible. If you can run the same quality model on a third of the memory, suddenly on-device AI gets way more capable. Edge deployments become realistic. The cost per query drops hard.

Between this and the Caltech compression work from last week, model efficiency is having a real moment right now.

MCP Is Now Officially Open Governance

Anthropic donated the Model Context Protocol to the Linux Foundation's new Agentic AI Foundation. MCP crossed 97 million installs in March. Every major AI provider now ships MCP-compatible tooling.

I've been saying MCP is the USB standard of AI tooling. This move makes that comparison even more apt. Open governance means no single company controls the protocol. That's how you build something the whole industry can trust and build on.

If you're building AI tools or agents and you're not thinking about MCP compatibility, you're going to regret it in about six months.
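If you haven't looked at MCP yet, the core unit is a tool descriptor: a name, a description the model reads, and a JSON Schema for the inputs. Here's a minimal sketch of that shape; the `search_docs` tool itself is hypothetical, and you'd normally register it through an MCP SDK rather than build the dict by hand.

```python
import json

# Sketch of an MCP-style tool descriptor (hypothetical tool).
tool = {
    "name": "search_docs",
    "description": "Search internal documentation and return matching snippets.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search terms"},
            "limit": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

serialized = json.dumps(tool, indent=2)
```

Because every provider consumes the same descriptor format, one tool definition works across clients, which is the whole "USB standard" argument in one data structure.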
