OpenAI Bought Its Own Bug Finder — Grace Hopper Found the First Bug in 1947
On September 9, 1947, Grace Hopper's team taped a moth into a logbook — the first computer 'bug.' In 2026, OpenAI acquired Promptfoo to find bugs in AI agents. The oldest problem in computing just got harder.
Key Takeaways
- •Grace Hopper's team found the first computer 'bug' — a literal moth — on September 9, 1947
- •OpenAI acquired Promptfoo in early 2026 to test AI agents for security issues
- •AI testing is fundamentally harder than traditional testing because behavior is non-deterministic
- •As AI agents take real-world actions, bugs escalate from 'wrong answer' to 'wrong action'
Root Connection
The concept of 'debugging' traces to Grace Hopper's Harvard Mark II logbook in 1947 — but the practice of testing software predates it, going back to Ada Lovelace's 1843 notes on the Analytical Engine.
Timeline
Ada Lovelace writes the first algorithm and notes potential errors in machine computation
Grace Hopper's team finds a moth in Harvard Mark II — coins 'debugging'
Glenford Myers publishes 'The Art of Software Testing' — formalizes the discipline
Selenium released — automated web testing becomes standard practice
ChatGPT launches — AI behavior becomes the new testing challenge
OpenAI acquires Promptfoo — AI companies need bug finders for their own AI
On September 9, 1947, operators of the Harvard Mark II computer found the source of a malfunction. A moth had gotten trapped in Relay #70, Panel F. Grace Hopper's team taped the moth into the logbook with a note: 'First actual case of bug being found.'
The term 'bug' for a technical glitch predated this moment — Thomas Edison used it in the 1870s. But Hopper's moth gave 'debugging' its visceral, literal origin story. For 79 years, the discipline of finding and fixing software errors has been the unglamorous backbone of computing.
In early 2026, OpenAI acquired Promptfoo, a startup that helps companies find security vulnerabilities in AI systems. The acquisition reveals something remarkable: the company building the most advanced AI in the world needs someone else to test it.
This isn't surprising when you understand the problem. Traditional software is deterministic — give it the same input, you get the same output. You write a test: 'If I pass 2 + 2, I expect 4.' If the test passes, the code works. If it fails, you debug.
Grace Hopper's moth was a hardware bug. OpenAI's bugs are philosophical — what does 'correct' even mean when the output is non-deterministic?
AI is non-deterministic. Ask ChatGPT the same question twice and you might get different answers. There's no single 'correct' output to test against. The behavior is probabilistic, context-dependent, and influenced by thousands of training decisions made months earlier. Traditional testing frameworks were not designed for this.
The stakes have also changed. When ChatGPT gives a wrong answer in a conversation, it's annoying. When an AI agent — the kind Google, OpenAI, and Anthropic are building in 2026 — books the wrong flight, sends money to the wrong account, or takes an action the user didn't authorize, it's a real-world failure with real consequences.
Promptfoo's approach is adversarial. It attacks AI systems the way a hacker would — probing for prompt injections, jailbreaks, data leaks, and unauthorized actions. It asks the questions the AI's creators didn't think to ask.
We spent 79 years learning to test software that follows instructions. Now we need to test software that makes its own decisions.
This echoes the evolution of software testing itself. In the 1950s and 60s, testing was an afterthought — you wrote code and hoped it worked. Glenford Myers' 1979 book 'The Art of Software Testing' formalized the discipline. He argued that the purpose of testing is not to show that software works, but to find cases where it doesn't. Good testing is inherently destructive.
The same philosophical shift is happening with AI. The purpose of AI testing is not to prove the model is smart. It's to find the cases where it's dangerous, wrong, or manipulable.
Fernando Corbato, who invented the computer password at MIT in 1961, called passwords 'a nightmare' before he died in 2019. He understood that security measures create their own problems. AI testing will follow the same pattern — every layer of safety creates new edge cases, new failure modes, new bugs.
Grace Hopper's moth was a hardware bug with a hardware fix: remove the moth. OpenAI's bugs are architectural, philosophical, and potentially emergent. You can't tape them into a logbook.
The oldest question in computing — 'how do you know it works?' — hasn't changed since Ada Lovelace asked it in 1843. The answer, in 2026, is: we're not sure we do.
How did this make you feel?
Recommended Gear
View all →Disclosure: Some links on this page may be affiliate links. If you make a purchase through these links, we may earn a small commission at no extra cost to you. We only recommend products we genuinely believe in.
Framework Laptop 16
The modular, repairable laptop that lets you upgrade every component. The right-to-repair movement in action.
Flipper Zero
Multi-tool for pentesters and hardware hackers. RFID, NFC, infrared, GPIO — all in your pocket.
The Innovators by Walter Isaacson
The untold story of the people who created the computer, internet, and digital revolution. Essential tech history.
reMarkable 2 Paper Tablet
E-ink tablet that feels like writing on real paper. No distractions, no notifications — just thinking.
Keep Reading
Want to dig deeper? Trace any technology back to its origins.
Start Research