My AI Assistant Told Me 'Sent ✅' — Three Times. It Was Lying.
The scariest failure mode of AI agents isn't that they break. It's that they tell you they succeeded when they didn't.
I run an AI Chief of Staff. He lives on WhatsApp, manages my calendar and email, and relays messages to people for me so I don’t have to context-switch a hundred times a day. I named him George.
A few weeks ago I asked George to pass a few messages along to someone I needed to stay in touch with. He replied the way he always does: “Sent ✅.” Clean, confident, done. I moved on with my day.
The messages were never sent. Not one of them.
How I found out
I only caught it because the person never replied — and it wasn’t like them to go quiet. I went digging, expecting to find a delay or a typo in a phone number. Instead I found nothing. No record of the message leaving. No delivery, no failure, no error. George had told me “Sent ✅” three separate times, and three separate times the message had quietly died somewhere inside my own system before it ever reached WhatsApp.
That is a very specific kind of cold feeling. Not “my tool is broken” — I can handle broken. This was “my tool is confidently telling me it did something it did not do.” And I had been believing it.
Why it happened (and why it’s the dangerous kind of bug)
Here’s the part worth understanding, because it isn’t unique to my setup — it’s baked into how AI agents work.
George isn’t one program. He’s a few specialized agents that hand work to each other. One agent talks to me; a different one actually does the sending. When I asked for the message to go out, the first agent handed the job to the second and then reported back to me: “Sent ✅.”
But “I handed off the task” and “the task succeeded” are not the same thing — and nobody was checking the difference. The handoff failed silently. The first agent never knew. So it told me the truth as it understood it: from where it sat, it had done its job. It passed the note. It just had no idea the note fell on the floor.
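To make that gap concrete, here is a minimal sketch of the failure pattern. It is not George's actual code: the names `handoff`, `broken_sender`, and `front_agent_reply` are hypothetical, and the swallowed exception stands in for whatever made the real handoff fail silently.

```python
def handoff(task, worker):
    """Hypothetical handoff between agents that hides failures from the caller."""
    try:
        worker(task)
    except Exception:
        # The failure is dropped here, so the calling agent never learns of it.
        pass


def broken_sender(task):
    """Stand-in for a sending agent that cannot do its job."""
    raise ConnectionError("sender agent unreachable")


def front_agent_reply(task, worker):
    """The user-facing agent: reports on the handoff, not on the outcome."""
    handoff(task, worker)
    return "Sent ✅"


# The send failed, yet the reply still claims success.
print(front_agent_reply({"to": "someone", "body": "hello"}, broken_sender))
```

Nothing in `front_agent_reply` ever looks at a result, so from its point of view every handoff looks the same, whether it worked or not.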
This is the trap with AI agents. They are built to be helpful and to report success. When you stack them together, each one only sees its own little slice, and “I think it worked” gets reported to you as “it worked.” There’s no villain here — no agent decided to lie. The system was just designed to assume success instead of prove it.
That’s the failure mode that should scare you, far more than an AI that crashes. A crash is loud. This is silent, and it wears the costume of success.
The fix: make it show receipts
The principle I landed on is simple, and it applies to any AI agent you’ll ever trust with a real-world action:
Never accept “done.” Demand proof of done.
When George sends a WhatsApp now, the messaging system hands back a real delivery receipt — a unique ID that only exists if the message physically left the building. The rule I gave him is blunt: if you cannot show me that receipt, you are forbidden from telling me it was sent. You say “NOT SENT” instead.
No receipt, no “✅.” “I tried” and “I think so” both get reported honestly as failure. The agent is no longer allowed to translate its own optimism into a confirmation to me.
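One way to enforce that rule in code is to make the confirmation text depend on the receipt itself. The sketch below is an assumption about shape, not the real implementation: `SendResult`, `report`, and `send_and_report` are hypothetical names, and `send` stands in for whatever messaging client returns a message ID on success.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SendResult:
    """Hypothetical result type: a message ID is present only on success."""
    message_id: Optional[str]
    error: Optional[str] = None


def report(result: Optional[SendResult]) -> str:
    """Produce the user-facing status. Only a receipt can yield a checkmark."""
    if result is not None and result.message_id:
        return f"Sent ✅ (receipt: {result.message_id})"
    reason = result.error if result and result.error else "no receipt returned"
    return f"NOT SENT ❌ ({reason})"


def send_and_report(send: Callable[[str, str], SendResult], to: str, body: str) -> str:
    """Call the (assumed) sender and turn any exception into an honest failure."""
    try:
        result = send(to, body)
    except Exception as exc:
        result = SendResult(message_id=None, error=str(exc))
    return report(result)
```

The design choice is that the success string can only be built from a receipt. An exception, a missing result, and an empty ID all land in the same "NOT SENT" branch, so "I tried" has no path to a checkmark.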
The difference is night and day. Now when George says “Sent ✅,” there is a verifiable fact behind those words: a message ID I can look up. When something fails, I hear about it immediately, as a failure, instead of discovering it days later through someone’s silence.
The lesson, if you’re building or using AI agents
Every AI agent you delegate a real action to — sending, paying, posting, booking — should be answering one question before it reports back to you: can I prove this happened, or am I just assuming it did?
If your agent can’t tell the difference between “I tried” and “it worked,” it will eventually tell you “✅” when the truth is “❌.” Not because it’s malicious. Because that’s the default, and nobody told it to demand a receipt.
I trust George more now than I did before this happened — not despite the bug, but because of how we closed it. He’s allowed to fail. He’s just not allowed to pretend he didn’t.