Catches tasks from email, calendar, WhatsApp, school apps. Turns rambly voice notes into tasks and calendar events. Sends a daily brief. You text her like a friend.
12 months of whiplash
January 2025
DeepSeek R1 drops. A Chinese lab claims frontier-level AI for $5.6M. Nvidia loses $589B in market value in a single day. Marc Andreessen calls it AI's "Sputnik moment."
May 2025
Claude 4, GPT-4.1, and Claude Code launch. AI coding tools go from autocomplete to full development agents. Claude Code hits $1B in annualized run-rate revenue within 6 months.
August 2025
GPT-5 launches. Google's Deep Think achieves gold-medal standard at the International Mathematical Olympiad. Gemini 3 follows in November.
January 2026
OpenClaw goes viral. Open-source AI agent hits 250K+ GitHub stars. Jensen Huang calls it "the most important software release probably ever."
February 2026
Claude Opus 4.6 ships with Agent Teams. 16 AI agents write a 100,000-line C compiler in two weeks. Anthropic raises $30B at a $380B valuation. Q1 venture funding hits $297B globally, 80% of it AI. Four rounds beat all of 2024 venture combined.
April 2026 (now)
Anthropic ships Claude Opus 4.7 and quietly confirms Mythos, an unreleased model so good at finding security flaws they will not let the public touch it. Mythos found thousands of zero-days, including a 27-year-old bug in OpenBSD. Access is gated to about 50 companies under Project Glasswing.
AI agents: the hype meets reality
Agents are real. Agents are also overhyped. Both are true.
What's working
57% of organizations now have AI agents in production (up from 51% a year ago)
OpenClaw: open-source agent that handles email, browses, schedules, learns you. 250K+ stars. Nvidia shipped an enterprise build.
Claude Agent Teams: 16 AI instances coordinated to build a 100,000-line C compiler. 99% test pass rate. Two weeks.
Meta shipped Muse Spark, its first major in-house LLM after the $14B Scale AI deal, to catch up to Google and OpenAI.
What's not
Gartner predicts 40%+ of agentic AI projects will be scrapped by 2027. Quality is the #1 barrier.
OpenClaw's security mess: 138 tracked CVEs in two months, a one-click RCE patched under embargo, 824 malicious skills on ClawHub. China banned it from government devices.
Most enterprise "agent" products still fail on real-world edge cases. The gap between demo and production stays wide.
April 2026
Anthropic built a model called Mythos. Then didn't release it.
Why they held it back
In a few weeks of testing, Mythos found thousands of zero-day vulnerabilities across every major OS and browser. One was a 27-year-old bug in OpenBSD. The model chains exploits together and hacks complex software end to end.
Project Glasswing
Instead of shipping, Anthropic handed Mythos to about 50 companies (AWS, Apple, Google, JPMorgan, CrowdStrike). $100M+ in usage credits. One job: patch the internet before attackers catch up.
The capability gap opened
First time a major lab has said: we built it, we're not letting you use it. Frontier capability is now sitting behind a gate.
Security is already behind
97% of enterprise leaders now expect a material AI-agent-driven security incident within 12 months. Only 6% of security budgets are allocated to it.
The shift
The coding revolution
This is the biggest practical shift happening right now.
$2.5B Claude Code run-rate in 9 months
300K+ business customers on Claude Code
73% of engineering teams use AI daily (up from 41%)
7 serious AI coding tools competing
Why this matters even if you're not a developer: Tools like Bolt, Lovable, and v0 turn plain English into a working app. The gap between "I have an idea" and "I have a prototype" collapsed. Developers now spend more time reviewing AI code (11.4 hrs/week) than writing new code (9.8 hrs/week). The bottleneck moved from writing to review.
Where AI is actually delivering value
Research & analysis
Deep research tools pull 100+ sources in minutes. Perplexity, Gemini Deep Research, ChatGPT Deep Research. Competitive analysis, due diligence, market scans. The gathering phase that took days now takes minutes.
Customer support
60-80% automation rates are standard, not exceptional. AI handles nuance and policy exceptions, not just FAQs. Probably the most mature enterprise AI use case.
Content & communication
First drafts, emails, proposals, decks. Not replacing writers. Killing the blank page. Only 25.6% of marketers say AI content outperforms human-written content. The skill is editing.
Workflow automation
AI steps inside Make, n8n, and Zapier kill hours of repetitive work per week. The cost drop means this works for everyday tasks now, not just enterprise.
The Real Winners
Vertical AI is quietly dominating
Generic AI wrappers are dying. Industry-specific AI is scaling. They win by owning the workflow, the data, and the compliance.
Legal
Harvey: AI for law firms. $11B valuation, $190M ARR, 1,000+ customers across 60 countries.
Legora: Harvey's challenger. $5.55B valuation, 800 firms. Built on Claude.
EvenUp: AI for personal injury law. $2B+ valuation. Processes 10,000 cases/week.
CoCounsel: Thomson Reuters' AI assistant. 1 million users across 107 countries.
Healthcare
OpenEvidence: clinical decision tool. $12B valuation. 45% of US physicians use it daily.
Abridge: AI clinical notes from doctor-patient conversations. $5.3B valuation. Used by Kaiser, Mayo Clinic, Johns Hopkins.
Hippocratic AI: patient-facing AI agents. $3.5B valuation. 180M+ patient interactions, zero safety issues.
Enterprise AI platforms: Microsoft 365 Copilot, Agentforce, Slack AI. Expensive, underwhelming.
95% of gen AI pilots never move past the experimental phase (MIT).
Only 6% see significant financial returns despite 88% using AI (McKinsey).
The AI washing
50,000+ layoffs cited AI, but AI cuts were only 4.5% of total layoffs.
Sam Altman admitted: "AI washing is real."
DOJ, SEC, and FTC now actively pursuing misleading AI claims.
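A quick sanity check on the layoff figures above, as a rough sketch: if 50,000 AI-cited cuts are 4.5% of the total, the implied total follows by division. The slide gives "50,000+", so treat the result as a floor, and the variable names are illustrative only.

```python
# Figures taken from the slide; 50,000 is a lower bound ("50,000+").
ai_cited_layoffs = 50_000
ai_share_of_total = 0.045  # 4.5%

# If AI-cited cuts are 4.5% of all layoffs, the total is cuts / share.
implied_total_layoffs = ai_cited_layoffs / ai_share_of_total

print(f"Implied total layoffs: at least {implied_total_layoffs:,.0f}")
```

That puts the total at roughly 1.1 million, which is why the AI-attributed slice looks small next to the headlines.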
Your 2026 AI toolkit
Claude runs writing, research, reasoning. The one category still up for grabs: app builders.
Daily Drivers
Claude: writing, reasoning, analysis. The default.
Claude Design: prompt to slides, prototypes, one-pagers
NotebookLM: interactive document knowledge
ChatGPT is the also-ran now. People keep it out of habit.
Deep Research
Claude Research: best depth, cites everything
Gemini Deep Research: browses 100+ pages, good for volume
Perplexity got lapped. Claude does what Perplexity did, better.
Building & Coding · REAL FIGHT
Claude Code: terminal, deepest reasoning
Cursor: best all-around AI IDE
Lovable / v0 / Bolt: apps from plain English. Genuine competition for Claude Code on prototypes.
Productivity
Granola: local meeting notes, no bot
Wispr Flow: voice-to-text (175+ WPM)
Automation
Make: visual workflow builder
n8n: open-source, most flexible
Zapier: simplest, 8,000+ integrations
Creative
Gemini / Veo: Google, images and video
Seedance / Seedream: ByteDance, video and images
Higgsfield: cinematic AI video
ElevenLabs: voice and music
The new shape of work
Long-running agents arrived. Then OpenClaw reminded everyone what that costs.
Minutes to hours
Claude runs for hours without you. One task behind this deck: 100 helper agents, 32 hours, done by morning.
Hundreds of bad actors showed up
Bad actors hit the skill store with malicious add-ons. Install one and it can reach your email, files, and accounts.
How to not get burned
Start small. Treat it like a new hire. Check the output, don't trust the summary. Install only what you trust.
Before the demos
It's a chatbot until you set it up.
Generic Claude is smart. Claude that knows your stuff is useful.
CLAUDE.md
A rules file per project. Your voice, your conventions, the stuff you don't want to explain twice. Claude reads it every session.
Skills
Prompts you keep retyping become slash commands. Save them once, run them forever.
Point it at your stuff
A repo, a folder, a local file. Generic answers lose to specific context every time.
MCP integrations
Gmail, Slack, Linear, Calendar, browser. Claude acts on your real tools, not just text.
Section 2
Let me show you what this looks like in practice
Live demos
Demo: creator pipeline
A full GTM workstream, run by Claude Code.
Research the playbook. Draft the contract. Run the campaign. Attribute the signups.
Research Cal.ai's micro-influencer playbook. Recommend Social Cat as the platform.
Adapt a lawyer's template into Chelsea Stellick's $1,750 contract.
Scrape the Social Cat inbox, download creator videos, republish to TikTok.
UTM-tag every post. Attribute signups per creator via GA4.
One GTM workstream. 60 hours of Claude. No agency, no creator manager.
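The UTM-tagging and attribution step above could look something like this. It is a minimal sketch, not the pipeline from the demo: the function names, the default `utm_source`/`utm_medium` values, and the choice of `utm_content` as the per-creator key are all assumptions, and the GA4 side is stood in for by a plain list of landing URLs.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def utm_tag(url, creator, campaign, source="tiktok", medium="creator"):
    """Return url with UTM parameters added, keeping any existing query."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_content": creator,  # assumption: one creator per utm_content value
    })
    return urlunsplit(parts._replace(query=urlencode(query)))


def signups_by_creator(signup_urls):
    """Count signups per creator from the landing URL recorded at signup."""
    counts = Counter()
    for url in signup_urls:
        params = dict(parse_qsl(urlsplit(url).query))
        counts[params.get("utm_content", "(untagged)")] += 1
    return counts
```

Each creator gets their own link from `utm_tag`, and `signups_by_creator` groups whatever landing URLs the analytics export returns. Untagged traffic lands in its own bucket instead of being silently dropped.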
Demo: press play
I shipped a product video by writing code.
No Premiere. No After Effects. Remotion (React for video) and Claude Code.
Wrote React components for each scene. Claude Code iterated on timing, copy, transitions.
The artifact: /Users/dilanikahawala/my-video/out/anna-demo-final.mp4 (19MB).
HOW-THIS-WAS-BUILT.md documents every prompt, in Claude Code's own words.
Live demo: press play. That's the whole demo.
This used to be a vendor or a video editor. Now it's a Tuesday night.
Demo · Claude Code (meta)
This deck was edited tonight.
Screenshot, prompt, refresh.
Background agents fact-checked, restyled, and mined transcripts while I kept reviewing.
Watch. I'll change a line on this slide, live.
Demo · Claude Code
Twilio A2P: 100 sub-agents, 32 hours, one approval
100 sub-agents spawned in a single session
9 agents running in parallel at peak
32h session span (Feb 3-4, 2026)
4 rejections before approval
US SMS for Anna needed A2P 10DLC compliance. Terms, Privacy Policy, consent checkbox, verbatim opt-in language, 5 sample messages, screenshots.
Claude Code drafted every field, debugged each rejection (empty URL fields, summarized consent, missing opt-in message and keywords, CTA verification).
When the "Fix Campaign" form was missing fields, it walked me through deleting and recreating the campaign via the API.
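The rejection reasons above suggest a pre-submission check. The sketch below is illustrative only: the dictionary keys are hypothetical, not Twilio's API fields, and the rules simply mirror the items this slide lists (non-empty URLs, five sample messages, an opt-in message and keywords, consent text that matches the opt-in language verbatim).

```python
def preflight(campaign: dict) -> list[str]:
    """Return a list of problems found in a draft A2P campaign submission.

    All keys are hypothetical names chosen for this sketch.
    """
    problems = []

    # Empty URL fields were one of the rejection causes.
    for field in ("terms_url", "privacy_url"):
        if not campaign.get(field, "").strip():
            problems.append(f"empty URL field: {field}")

    # The slide mentions 5 sample messages.
    if len(campaign.get("sample_messages", [])) < 5:
        problems.append("fewer than 5 sample messages")

    if not campaign.get("opt_in_message", "").strip():
        problems.append("missing opt-in message")
    if not campaign.get("opt_in_keywords"):
        problems.append("missing opt-in keywords")

    # Summarized consent was rejected: the checkbox text must be verbatim.
    opt_in_language = campaign.get("opt_in_language", "").strip()
    consent_text = campaign.get("consent_text", "").strip()
    if not opt_in_language or consent_text != opt_in_language:
        problems.append("consent text is not the verbatim opt-in language")

    return problems
```

An empty list means the draft clears these checks. It does not mean the carrier will approve it, and CTA verification still needs a human looking at the live form.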