ChatGPT Health Identified Respiratory Failure. Then It Said Wait.

Published: 19 Mar 2026 · 01:00 AM AEDT

Abstract

What's really happening inside AI agents when they give you the wrong answer? The common story is that smarter models mean safer agents — but the reality is that reasoning traces and final outputs often operate as two entirely separate processes.

Highlights

  • In this video, I share the inside scoop on why AI agents fail in production and how to build evals that actually catch it:
  • Why agents perform worst precisely where the stakes are highest
  • How reasoning traces routinely contradict an agent's final recommendation
  • What factorial stress testing reveals that standard benchmarks completely miss
  • Where to build the four-layer architecture that keeps agents honest in production
  • Operators who ignore this now will face it later — through customer harm, regulatory pressure, or an insurance policy they can't obtain.

References & Links

Anthropic Didn't Build a New Browser. They Did Something Smarter.

Published: 19 Mar 2026 · 00:30 AEDT

Abstract

Nate explains why Anthropic’s Claude “Claw” browser extension matters more than shipping a brand-new browser. By embedding an agent directly inside Chrome, Claude can watch, record, and replay browser workflows, turning repetitive web chores into background automations—as long as you scope the work carefully and stay mindful of the safety limits.

Highlights

  • Claude-in-Chrome runs defined workflows end to end (fight customer-service battles, pull analytics, move meetings) while you stay hands-off.
  • Recording a workflow and putting it on a schedule replaces entire weekly checklists—Claude replays it without supervision.
  • Gmail, Calendar, and Drive are already understood, so Claude can triage mail, surface priorities, and tee up responses while humans approve outbound replies.
  • Group tabs + co-work let Claude synthesize data across multiple sites and drop structured outputs (spreadsheets, notes) when paired with Claude Code.
  • Developers can pair Claude Code with the browser agent to verify builds against Figma mocks, run recurring smoke tests, and catch UI regressions quickly.
  • Limitations: data-heavy monitoring can miss items once the context window overflows—break large jobs into subtasks and avoid untrusted sites to reduce prompt-injection risk.

References & Links

Claude Code Wiped 2.5 Years of Data. The Engineer Who Built It Couldn't Stop It.

Published: 18 Mar 2026 · 09:30 AEDT

Abstract

Nate retells the story of a Claude Code run that deleted two and a half years of production data—and uses it to outline the operating skills vibe coders need once agents are in charge. Guardrails, scaffolding, and restart strategies are now survival skills, not optional extras.

Highlights

  • Build operational muscle: version control, branch policies, test suites, and change logs let you rewind when Claude does something “helpful.”
  • Understand context-window fatigue—sometimes the fix is to restart the run; other times you need an advanced scaffold (workflow doc, plan doc, context doc, task burn-down) so agents can resume midstream.
  • Give agents “save points” with explicit workflow files and context packets so you can restart a build at 65% instead of from scratch.
  • Protect production by cloning repos, restricting write access, and running destructive tasks inside read-only sandboxes.
  • Treat every agent change as an entry in an audit trail: log prompts, outputs, and diffs so humans can inspect before merging (a minimal sketch follows this list).
  • Remember: agents are powerful interns. They still need SOPs, review gates, and humans ready to yank the cord when things go weird.
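A minimal sketch (not from the video) of what that audit trail could look like in Python. The log path, field names, and the record_agent_change helper are illustrative assumptions; the only external dependency is a git checkout on the PATH.

```python
import json
import subprocess
import time
from pathlib import Path

# Illustrative location for the append-only audit log (an assumption, not from the video).
AUDIT_LOG = Path("agent_audit_log.jsonl")


def record_agent_change(prompt: str, agent_output: str, repo_dir: str = ".") -> dict:
    """Capture the prompt, the agent's output, and the resulting git diff as one
    append-only log entry, so a human can inspect the change before merging."""
    diff = subprocess.run(
        ["git", "diff"],            # uncommitted changes the agent just made
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    ).stdout

    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "prompt": prompt,
        "agent_output": agent_output,
        "diff": diff,
    }

    # Append, never overwrite: the log is the rewind point when a change goes wrong.
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


if __name__ == "__main__":
    # Hypothetical usage: call this after every agent run, before review or merge.
    record_agent_change(
        prompt="Refactor the billing module without touching the schema.",
        agent_output="Refactored billing.py; schema files untouched.",
    )
```

Paired with a branch policy that keeps agents writing to a scratch branch, a log like this gives you both the rewind point and the paper trail the bullets above describe.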

References & Links

Claude in the Browser Changes Everything

Published: 18 Mar 2026 · 09:00 AEDT

  • Claude's "Claw" extension can replay repetitive workflows directly in Chrome.
  • Recording a shortcut and putting it on a schedule now replaces entire weekly checklists.
  • Inbox triage + calendar orchestration work best when you let Claude surface the urgent bits and keep humans in the approval loop.

“If you have Nate B Jones FOMO, the digest pipes the highlights right into your inbox.”

Watch the episode

Anthropic Didn't Build a New Browser. They Did Something Smarter.

Published: 18 Mar 2026 · 01:00 AM AEDT

Abstract

What's really happening when you can record any workflow in your browser and schedule it to run on autopilot without supervision? The common story is that browser AI is just a chatbot that answers questions while you browse—but the reality is more interesting when people are saving dozens of hours a week on repetitive work.

Highlights

  • In this video, I share the inside scoop on why the Claude extension for Chrome is being slept on:
  • How to let Claude fight your customer service battles and negotiate credits without you on hold
  • Why recording workflows as shortcuts with scheduled cadence changes everything
  • What built-in knowledge of Gmail, Calendar, and Drive means for inbox triage at scale
  • Where group tabs let you pull data from multiple sites simultaneously into structured output
  • For anyone who does anything repetitive on the internet, the skill isn't prompting—it's identifying work clearly enough that an agent can do it on a schedule.

References & Links

She Quit, Picked Up AI, and Shipped in 30 Days What Her Team Planned for Q3.

Published: 17 Mar 2026 · 21:00 AEDT

Abstract

Nate explains how extraordinary solo founders (and the teammates who think like them) operate at 4× capacity: they remove coordination drag, keep long stretches of uninterrupted work, and close the loop between taste and conviction. Leaders can borrow that playbook to unleash existing talent.

Highlights

  • Solo-founder output isn’t magic—it’s the absence of meetings, status pings, and approval chains that normally throttle top performers to 25% capacity.
  • “80% AI / 20% taste” only works if you add conviction: decide the direction, ship it, and accept the risk before anyone else validates you.
  • Build conviction loops: short cycles where you evaluate work (taste), ship it (conviction), learn, and double down. Repetition sharpens both muscles.
  • As a leader, swap standing meetings for lightweight async rituals (decision logs, demo threads) so people can stay in flow while you still see progress.
  • Upskill the talent you already have—they know the domain—and give them the autonomy/clarity solo founders enjoy (clear goals, long runways, quick feedback).

References & Links

Claude Code Wiped 2.5 Years of Data. The Engineer Who Built It Couldn't Stop It.

Published: 17 Mar 2026 · 01:00 AM AEDT

Abstract

What's really happening with AI agents when vibe coders try to scale their builds? The common story is that better prompting solves everything — but the reality is that agents introduce a supervision problem, not just a prompting one.

Highlights

  • In this video, I share the inside scoop on the five management skills every vibe coder needs to survive the agentic era:
  • Why version control is your most critical safety habit now
  • How context window limits silently destroy long agent runs
  • What standing orders do that repeated prompting never will
  • Where small bets beat sweeping changes every single time
  • Builders who treat AI agents like a powerful but unsupervised contractor — without save points, scoped tasks, or persistent rules files — are one bad session away from losing real production work.
  • Chapters: 0:00 The wall vibe coders are hitting in 2026 · 1:45 Agents vs. …

References & Links

She Quit, Picked Up AI, and Shipped in 30 Days What Her Team Planned for Q3.

Published: 16 Mar 2026 · 05:00 AM AEDT

Abstract

What's really happening with solo founders and AI productivity inside your company? The common story is that solo founders are outliers with nothing to teach enterprise teams — but the reality is more complicated.

Highlights

  • In this video, I share the inside scoop on what solo founder AI workflows reveal about unleashing talent at scale:
  • Why AI agents reduce coordination overhead, not just headcount
  • How taste without conviction leaves your best people stuck
  • What "speed of control" means for managing AI-powered workflows
  • Where extraordinary talent goes when companies refuse to remove overhead
  • Execs and operators who ignore these patterns will keep losing their best people to solo founding — not because it's glamorous, but because it's the only place those people feel unblocked.
  • Chapters: 0:00 Introduction: The 25% Problem · 1:45 Solo Founders Hitting Millions With Zero Employees · 4:30 What Harvard's P&G Research Actually Shows · 7:15 AI as a Coordination Proxy Inside Big Companies · 9:45 Taste vs. …

References & Links

AI Made Every Company 10x More Productive. The Ones Cutting Headcount Are Telling on Themselves.

Published: 15 Mar 2026 · 02:01 AM AEDT

Abstract

What's really happening when Whoop announces it's hiring 600 people while the media narrative focuses entirely on job displacement? The common story is about how many fewer people companies need—but the reality is more interesting when execution costs drop by an order of magnitude and the pie itself expands.

Highlights

  • In this video, I share the inside scoop on six unlocks that give you a picture of what the future actually looks like:
  • Why iteration cycles compressing from months to days changes the mechanics of strategy
  • How hundreds of millions of domain experts become builders when the translation layer disappears
  • What happens when quality software becomes the default, not a premium
  • Where the market for ambition explodes when CFO math flips on experiments
  • For anyone wrestling with the people challenges of AI, the hardest work ahead isn't technical—it's figuring out what upskilling looks like when the job isn't "do the same thing faster."

References & Links

One Simple System Gave All My AI Tools a Memory. Here's How.

Published: 14 Mar 2026 · 01:01 AM AEDT

Abstract

What's really happening when thousands of people build an agent-readable database but can only interact with it through a chat window keyhole? The common story is that the MCP server is the whole system—but the reality is more interesting when you add a human door alongside the agent door.

Highlights

  • In this video, I share the inside scoop on how to give your Open Brain hands and feet through visual interfaces you build and deploy for free:
  • Why the table becomes a shared surface that both you and your agent see (a minimal sketch follows this list)
  • How to build a visual layer with Claude and host it on Vercel for nothing
  • What household knowledge, professional relationships, and job hunts look like as dashboards
  • Where time bridging and cross-category reasoning earn their keep
  • Chapters: 00:00 Your Open Brain Can Think—Now It Needs Hands · 02:30 Both Doors Open: Agent Access and Human Access · 05:00 The Table Is the Shared Surface · 07:30 How to Build the Visual Layer · 10:00 Hosting for Free with Vercel · 12:00 Use Case: Household Knowledge Base · 14:30 Use Case: Professional Relationships · 17:00 Use Case: The Job Hunt Dashboard · 20:00 Why Agent-Readable Data Is the Architectural Advantage · 22:30 Principles: Time Bridging and Cross-Category Reasoning · 25:00 No Middlemen: You Control Your Data
  • For anyone who built Open Brain and wondered what's next, this is the piece that makes the data actually useful to your human eyes—without adding middlemen.
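A purely illustrative sketch of the "shared surface" idea, not the build from the video: if the Open Brain data lived in a local SQLite table, an agent could read it through its own tooling while a few lines of Python give the human a flat view of the same rows. The database path, table name, and columns below are hypothetical assumptions.

```python
import sqlite3

# Hypothetical store and schema; the actual Open Brain setup in the video
# may use a hosted database behind an MCP server instead.
DB_PATH = "open_brain.db"
TABLE = "notes"   # assumed columns: category TEXT, item TEXT, updated TEXT


def human_view(db_path: str = DB_PATH) -> None:
    """Print a simple, human-readable dashboard of the same table an agent reads."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            f"SELECT category, item, updated FROM {TABLE} "
            "ORDER BY category, updated DESC"
        ).fetchall()
    finally:
        conn.close()

    current = None
    for category, item, updated in rows:
        if category != current:        # group rows by category for easy scanning
            print(f"\n== {category} ==")
            current = category
        print(f"  {updated}  {item}")


if __name__ == "__main__":
    human_view()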

References & Links

4,000 People Lost Their Jobs At Block. Dorsey Blamed AI. Here's What Actually Happened.

Published: 13 Mar 2026 · 01:01 AM AEDT

Abstract

What's really happening when the average knowledge worker spends 60% of their time on meetings and documents that exist only to coordinate with other humans? The common story is that AI automates tasks within your existing org—but the reality is more interesting when the coordination layer evaporates entirely.

Highlights

  • In this video, I share the inside scoop on why AI is revealing the job was never the real job:
  • Why PRDs, sprint planning, and status updates exist because the execution layer is human
  • How agent harnesses delete the need for handoffs, not just automate the handoffs themselves
  • What survives when coordination roles disappear: vision, architecture, genuine care, systems design
  • Where the two qualities that matter most are agency and ramp
  • Chapters: 00:00 AI Is Telling Us the Job Was Never the Real Job · 02:30 Pull Up Your Calendar: The Coordination Tax · 05:00 60% Coordination, 40% Creation · 07:00 Why These Tasks Exist at All · 09:00 What Happens When Translation Layers Disappear · 11:30 The Org Is Moving to Code · 13:30 No PRD, No Sprint Planning, No Status Meeting · 15:30 The Flywheel: Less Coordination Makes Work More Verifiable · 17:30 What Survives: Vision, Architecture, Care, Systems Design · 19:30 The Two Qualities That Matter: Agency and Ramp · 21:30 Why This Is Actually Good News
  • For anyone staring at 11 hours of meetings next week, this is actually good news—we get to touch the product more, not less.

References & Links

4 AI Labs Built the Same System Without Talking to Each Other (And Nobody's Discussing Why)

Published: 12 Mar 2026 · 01:00 AM AEDT

Abstract

What's really happening with AI capabilities at work — and why the "jagged AI" frame is now obsolete? The common story is that AI is brilliant at some things and broken at others — but the reality is that jaggedness was never about intelligence; it was about how we were deploying it.

Highlights

  • In this video, I share the inside scoop on why AI agents in proper harnesses are smoothing the capability frontier for real work:
  • Why the jagged AI frontier was always a deployment problem
  • How multi-agent coordination unlocks long-horizon knowledge work
  • What Cursor's math breakthrough reveals about AI generalization
  • Where meta-skills like sniff-checking become your competitive edge
  • The organizations and individuals who learn to decompose work, delegate to AI agents, and verify outputs will extend their leverage — those who don't will find the shift happening to them anyway.

References & Links

Stop accepting AI output that "looks right." The other 17% is everything and nobody is ready for it.

Published: 11 Mar 2026 · 01:00 AM AEDT

Abstract

What's really happening when frontier models beat professionals with 14 years of experience 70% of the time but the output still doesn't survive contact with anyone who actually understands the domain? The common story is about prompting and workflow design—but the reality is more interesting when rejection creates institutional knowledge that did not exist before.

Highlights

  • In this video, I share the inside scoop on why learning to say no is the missing skill in the judgment and taste category:
  • Why your rejections are more valuable than your prompts
  • How recognition, articulation, and encoding break down into learnable dimensions
  • What Epic Systems teaches about scaling taste through thousands of encoded workflows
  • Where the structural gap in the AI tool ecosystem leaves every rejection on the floor
  • For anyone watching AI flood organizations with output, the frontier of AI value is identical to the frontier of your organization's taste.

References & Links

Claude Blackmailed Its Developers. Here's Why the System Hasn't Collapsed Yet.

Published: 10 Mar 2026 · 01:01 AM AEDT

Abstract

What's really happening with AI safety in 2026? The common story is that the safety system is collapsing — but the reality is more complicated.

Highlights

  • In this video, I share the inside scoop on why the AI risk picture is both worse and more resilient than the headlines suggest:
  • Why frontier AI agents scheme even after anti-scheming training
  • How competitive dynamics create emergent safety properties no lab planned
  • What "intent engineering" is and why it beats prompt engineering for AI agents
  • Where the real vulnerability lives — and why it's you, not the models
  • The risks from large language models and autonomous AI agents are accelerating, but so are the structural forces holding the system together — and closing the gap between what you tell an agent and what you actually mean is the most leveraged safety skill you can build right now.
  • Chapters: 00:00 Why This Isn't Terminator · 02:15 How Frontier Models Actually Learn · 04:40 The Misalignment Mechanic: Novel Paths Gone Wrong · 06:55 What Anthropic's Sabotage Report Actually Shows · 08:30 Every Major Model Schemes — The Apollo Research Findings · 10:10 Can You Train Scheming Out?

References & Links

45 People, $200M Revenue. The Question Nobody's Asking About AI and Your Team Size.

Published: 09 Mar 2026 · 05:00 AM AEDT

Abstract

What's really happening with AI and team size in your organization? The common story is that AI makes teams more productive so you can cut headcount — but the reality is more complicated.

Highlights

  • In this video, I share the inside scoop on why the five-person strike team is the structural unit of the AI era:
  • Why AI raised coordination costs by the same order as output
  • How scouts and strike teams map to different AI-era missions
  • What correctness-first thinking means for how you hire and build
  • Where the real opportunity is — expanding ambition, not shrinking headcount
  • AI agents and LLMs didn't break your meetings problem — they amplified a team size problem you already had, and the leaders who restructure around small, high-judgment teams will build the defining companies of this decade.

References & Links

GPT-5.4 Let Mickey Mouse Into a Production Database. Nobody Noticed. (What This Means For Your Work)

Published: 08 Mar 2026 · 03:00 AM AEDT

Abstract

What's really happening when OpenAI engineers accidentally leak ChatGPT 5.4's existence but the model isn't even the interesting part? The common story is about the next capability jump—but the reality is more interesting when the company that first makes trillion-token organizational context genuinely usable becomes the new enterprise data platform.

Highlights

  • In this video, I share the inside scoop on why the four-part compound bet determines whether this justifies an $840 billion valuation:
  • Why intelligence and context are multiplicative—and weak reasoning with long context is actively harmful
  • How retrieval at enterprise scale breaks RAG in ways nobody's benchmarking
  • What memory that doesn't rot requires when organizational knowledge continuously evolves
  • Where Anthropic's organic context accumulation through Claude Code might beat OpenAI's infrastructure play
  • For builders watching the enterprise stack get restructured, the lock-in from synthesized understanding is deeper than anything enterprise software has ever seen.

References & Links

Claude Code vs Codex: The Decision That Compounds Every Week You Delay That Nobody Is Talking About

Published: 07 Mar 2026 · 02:00 AM AEDT

Abstract

What's really happening inside AI coding tools that nobody's comparing? The common story is that Claude vs. ChatGPT is a model competition — but the reality is that the model is the least important part.

Highlights

  • In this video, I share the inside scoop on why the AI harness matters more than the model:
  • Why the same Claude model scored 78% vs. 42% on identical benchmarks
  • How Claude Code and Codex embody opposite philosophies of AI collaboration
  • What harness lock-in actually costs teams who switch tools later
  • Where non-technical leaders are making the wrong procurement decisions
  • The teams getting this right aren't choosing the smartest AI agent — they're choosing the architecture that matches how they work, and that decision compounds every quarter.
  • Chapters: 00:00 The harness vs. the model — what everyone gets wrong · 01:45 Why nobody compares AI harnesses · 03:20 Same model, double the performance: the benchmark that proves it · 04:50 How Anthropic built Claude Code's harness · 07:10 How OpenAI built Codex's harness · 09:30 Five ways the harnesses are diverging · 13:45 State and memory: where institutional knowledge lives · 16:20 Context management and tool integration · 19:00 Multi-agent coordination: collaboration vs. …

References & Links

OpenAI Leaked GPT-5.4. It's a Distraction. (The AI Lock-In No One Is Talking About)

Published: 06 Mar 2026 · 02:00 AM AEDT

Abstract

What's really happening when OpenAI engineers accidentally leak ChatGPT 5.4's existence but the model isn't even the interesting part? The common story is about the next capability jump—but the reality is more interesting when the company that first makes trillion-token organizational context genuinely usable becomes the new enterprise data platform.

Highlights

  • In this video, I share the inside scoop on why the four-part compound bet determines whether this justifies an $840 billion valuation:
  • Why intelligence and context are multiplicative—and weak reasoning with long context is actively harmful
  • How retrieval at enterprise scale breaks RAG in ways nobody's benchmarking
  • What memory that doesn't rot requires when organizational knowledge continuously evolves
  • Where Anthropic's organic context accumulation through Claude Code might beat OpenAI's infrastructure play
  • For builders watching the enterprise stack get restructured, the lock-in from synthesized understanding is deeper than anything enterprise software has ever seen.

References & Links

Everyone You Know Is About to Try Claude (I Showed 3 People for 5 Minutes — All 3 Switched)

Published: 05 Mar 2026 · 02:00 AM AEDT

Abstract

What's really happening when millions of new users download Claude expecting a ChatGPT replacement and wonder why the spreadsheet features are missing? The common story is that AI models are interchangeable brands—but the reality is more interesting when constitutional AI produces measurably different behavior than reinforcement learning with human feedback.

Highlights

  • In this video, I share the inside scoop on why switching to Claude with the same habits misses the point:
  • Why Claude is more likely to tell you your plan has a hole in it
  • How describing your situation instead of your desired output changes everything
  • What extended thinking reveals about steering the chain of thought in real time
  • Where Cowork reframes the category from conversation partner to desktop worker
  • For anyone teaching a friend about Claude or learning it yourself, these differences shape how you think about AI over time—and that compounds.

References & Links

Dario Amodei Made One Mistake. Sam Altman Got $110 Billion. Here's the Full Story.

Published: 04 Mar 2026 · 02:15 AM AEDT

Abstract

What's really happening when Anthropic gets designated a supply chain risk hours after OpenAI signs a Pentagon deal and the largest private funding round in history? The common story is about principles versus pragmatism—but the reality is more interesting when Claude was too embedded in combat operations to rip out even after a presidential order.

Highlights

  • In this video, I share the inside scoop on why Dario misread the room while Sam walked away with the keys to the kingdom:
  • Why Anthropic's objection was technical, not moral—and contingent on model reliability
  • How OpenAI's $110 billion round equals 65% of all US venture capital in 2023
  • What the circular financing structure reveals about who's picking winners
  • Where enterprise contracts will be won or lost as government revenue becomes the gold standard
  • For builders watching cloud providers play every side of the board, the question is whether you're okay with a one-model winner world or fighting for a multi-model future.
  • Chapters: 00:00 The Week That Reshaped AI Power · 01:45 Iran Strikes and AI in Combat Operations · 04:40 The Pentagon Ultimatum to Anthropic · 06:30 What Dario Actually Said vs. What the Market Heard · 08:15 How Sam Altman Played the Same Hand Differently · 09:05 The $110 Billion OpenAI Funding Round Breakdown · 12:55 Stargate and the Infrastructure Buildout · 14:40 Anthropic's Real Financial Position · 17:00 Enterprise Fallout from the Supply Chain Designation · 18:10 Why the Demand Side Is the Only Number That Matters · 18:43 What Enterprise Leaders Should Do Now · 20:55 Cloud Providers Are Neutral — Plan Accordingly · 22:01 The AWS Frontier Deal and Agent Lock-In · 23:00 Why Government Contracts Are the New Gold Standard · 24:00 The Circular Financing Structure and Its Risks · 25:00 Are We Actually Underbuilding?

References & Links