AI Has a Punctuation Addiction. It's Called the Em Dash, and You've Seen It a Thousand Times Today.
Every AI model overuses the em dash. Every single one. But the em dash is just the tip of the iceberg. From 'delve' to 'tapestry' to suspiciously perfect paragraph structure, AI leaves fingerprints everywhere. And AI detectors still can't reliably find them.
Key Takeaways
- AI models use em dashes 3 to 10 times more frequently than human writers in equivalent text lengths
- The em dash traces to medieval scribes (1100s) but was formalized in English typography around 1725
- AI fingerprints beyond em dashes: 'delve,' 'tapestry,' 'landscape,' 'nuanced,' 'straightforward,' 'It's important to note'
- OpenAI shut down its own AI text classifier in 2023 (26% true positive, 9% false positive rate)
- Students and professionals are being falsely accused of AI use for writing clear, well-structured prose
Root Connection
The em dash traces back to medieval scribes who used a long horizontal stroke to mark pauses in manuscripts before modern punctuation existed. The typographic em dash (named for the em, a unit traditionally tied to the width of the letter M in a given typeface) was formalized in the 1700s. For centuries it was a niche stylistic choice. Then in 2023, AI language models made it the most overused punctuation mark on the internet.
Timeline
Medieval scribes use long horizontal marks (virgula suspensiva) to indicate pauses in Latin manuscripts. The ancestor of the em dash.
Gutenberg's printing press standardizes punctuation. The period, comma, and colon take dominant roles. The dash remains informal.
The em dash is formalized in English typography. Named for the width of the capital M. Used sparingly by authors for dramatic interruptions.
Emily Dickinson uses dashes obsessively in her poetry. Her manuscripts contain thousands of dashes of varying lengths. Scholars debate their meaning to this day.
GPT-3 launches. Early users notice the model has an unusual fondness for em dashes, semicolons, and the word 'delve.'
ChatGPT launches. The em dash goes from niche punctuation to the most recognizable AI writing fingerprint on the internet.
Researchers at Stanford and Georgetown publish studies on 'AI linguistic fingerprints': consistent word choices, sentence structures, and punctuation patterns across all major models.
AI detection tools remain unreliable (sub-80% accuracy). Meanwhile, human writers who naturally use em dashes get falsely flagged. Students are punished for writing 'too well.'
If you've read anything written by AI in the last three years, you've seen it.
The em dash.
That long horizontal line that shows up in the middle of sentences, adding what feels like dramatic emphasis or a parenthetical aside. It looks like this: "AI is transforming industries, but the real question is whether society is ready for it."
Wait, I can't even show you one because this article isn't allowed to use them. (RootByte editorial policy. We banned them. Because AI ruined them.)
So let me describe what happens instead. You're reading a perfectly normal article, and suddenly the sentence takes a detour. Instead of using a comma, a colon, or parentheses, the author jams in a long dash. Then it happens again. And again. By the fourth paragraph, you realize every sentence has one. By the sixth paragraph, you're counting them. By the tenth paragraph, you know this was written by AI.
The em dash has become AI's most recognizable fingerprint.
But why? Why does every major language model, from ChatGPT to Claude to Gemini to Llama, overuse this one punctuation mark? And more importantly: if the em dash is the obvious tell, what are the invisible ones?
The em dash is a perfectly fine punctuation mark. Emily Dickinson loved it. So did David Foster Wallace. The problem isn't the em dash itself. The problem is that AI uses it the way a nervous speaker uses 'um': constantly, reflexively, in places where a comma, colon, or period would work better. When every other sentence contains an em dash, the text starts to read like a fingerprint.
— Bryte, Root Access
Let's start with the em dash itself.
The name comes from typography. An em dash is one "em" wide, and an em is a unit equal to the font's point size, traditionally about the width of a capital M in whatever typeface you're using. A shorter version, the en dash (half an em, roughly the width of an N), is used for ranges like "2020 to 2025." The hyphen is the shortest mark of all, used for compound words like "well-known."
The em dash has a long history. Medieval scribes in the 1100s used long horizontal marks called "virgula suspensiva" to indicate pauses in Latin manuscripts. When Gutenberg's printing press standardized punctuation in the 1450s, the period, comma, and colon became dominant. The dash remained informal, a tool for hand-written notes and personal correspondence.
It wasn't until the 1700s that the em dash was formalized in English typography. Even then, it was considered a stylistic choice, not a standard punctuation mark. Most style guides (AP, Chicago, MLA) have specific rules about when to use em dashes, and all of them agree: use them sparingly.
Some human writers ignored that advice brilliantly. Emily Dickinson's manuscripts are famous for their dashes. She used thousands of them, in varying lengths, where other poets would use commas or periods. Scholars have debated for over a century whether her dashes represent pauses, musical notation, or something else entirely. David Foster Wallace used em dashes extensively in his fiction, often nesting them inside parentheses inside footnotes. For these writers, the em dash was a deliberate stylistic signature.
AI models don't use the em dash as a deliberate choice. They use it reflexively.
The reason is in the training data. Language models learn to write by ingesting billions of words from the internet, books, articles, and other text. The training data includes all kinds of writing styles, including published works by professional authors who use em dashes. But here's the key: the em dash shows up disproportionately in polished, edited, professional text. News articles, essays, opinion pieces, literary fiction. These are exactly the texts that get the most weight in training data because they're well-written.
So the model learns an association: good writing uses em dashes. Professional text uses em dashes. Confident, authoritative prose uses em dashes.
Then the model amplifies the pattern. Because AI optimizes for what looks "right" based on training data, it applies the em dash everywhere a comma, colon, semicolon, or period could work. It becomes a default connector. A verbal tic baked into the model's weights.
Here is the cruel paradox of AI detection in 2026: the tools punish humans who write well and miss AI that writes poorly. A student who naturally writes clear, structured prose gets flagged. A user who prompts ChatGPT to 'write like a tired college student with occasional typos' sails through undetected. We've built a system that punishes competence.
— Bryte, Root Access
Researchers have measured this. In studies comparing AI-generated and human-written text on similar topics and of similar length, AI models use em dashes three to ten times more frequently than human authors. ChatGPT is the worst offender. Claude is somewhat better (Anthropic appears to have tuned for it). Gemini falls in between. But all of them overuse it compared to the human baseline.
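If you want to test that ratio on your own samples, the counting part is easy. Here's a minimal Python sketch, not the method any of those studies used. The function names and the crude word-matching regex are assumptions made for illustration, and the em dash is written as the escape `\u2014` so the character itself never has to appear.

```python
import re

EM_DASH = "\u2014"  # escape sequence, so no literal em dash appears in this file
# Crude English word matcher (an assumption, good enough for a rough rate).
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def em_dashes_per_1000_words(text: str) -> float:
    """Return em dashes per 1,000 words, or 0.0 for text with no words."""
    word_count = len(WORD_RE.findall(text))
    if word_count == 0:
        return 0.0
    return text.count(EM_DASH) * 1000 / word_count


def rate_ratio(sample: str, baseline: str) -> float:
    """How many times more often `sample` uses em dashes than `baseline`."""
    sample_rate = em_dashes_per_1000_words(sample)
    baseline_rate = em_dashes_per_1000_words(baseline)
    if baseline_rate == 0:
        # No dashes in the baseline: the ratio is undefined, so report
        # infinity if the sample has any, otherwise call it even (1.0).
        return float("inf") if sample_rate > 0 else 1.0
    return sample_rate / baseline_rate
```

Feed `rate_ratio` a chunk of suspected AI text and a chunk of your own writing on a similar topic. A result between 3 and 10 would match the range quoted above, though a short sample makes the number noisy.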
The em dash is the most visible AI fingerprint. But it's far from the only one.
Here's a field guide to the invisible tells.
Tell #1: The Vocabulary Cluster.
AI models have favorite words. Not random words. Specific, slightly unusual words that appear in AI output at rates wildly disproportionate to their frequency in human writing.
The most documented ones:
"Delve." AI loves this word. "Let's delve into the history of..." appears in AI-generated text roughly 10 to 50 times more often than in human writing. The word is perfectly valid English. Most humans just don't use it very often. We say "explore," "dig into," "look at," or "examine." AI says "delve."
"Tapestry." As in "a rich tapestry of culture and innovation." AI reaches for this metaphor constantly. Human writers use it occasionally. AI uses it in every third essay about diversity, history, or culture.
"Landscape." "The AI landscape," "the competitive landscape," "the evolving landscape." This word appears in AI text at roughly four times the human rate.
"Nuanced." "This is a nuanced issue." AI loves calling things nuanced. Ironically, the analysis that follows is rarely nuanced.
"Straightforward." "The solution is straightforward." Human writers say "simple," "easy," "obvious." AI says "straightforward."
"It's important to note that..." This phrase is the "um" of AI writing. It adds nothing. It just buys time before the actual point. Human writers who use this phrase get correctly flagged as boring. AI uses it as a structural crutch.
"In today's rapidly evolving..." Every AI essay about technology starts with some version of this. The "rapidly evolving landscape" of whatever topic is being discussed. It's the AI equivalent of a college freshman's "Since the beginning of time..." opening.
Tell #2: The Paragraph Structure.
AI writes in suspiciously uniform paragraph lengths. Human writers vary. Some paragraphs are one sentence. Some are twelve sentences. The variation is natural and unconscious.
AI tends to produce paragraphs of three to five sentences, consistently, throughout an entire piece. The rhythm becomes metronomic. Each paragraph makes one point, develops it briefly, and transitions to the next. It's competent. It's clear. It's also unnaturally regular.
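That regularity is measurable. The sketch below counts sentences per paragraph and reports the mean and standard deviation. It's a rough heuristic under stated assumptions: paragraphs are separated by blank lines, sentences end in `.`, `!` or `?`, and the function names are made up for this example. Abbreviations like "e.g." will inflate the counts.

```python
import re
from statistics import mean, pstdev

# Sentence end: ., ! or ? (optionally followed by a closing quote or bracket),
# then whitespace or the end of the paragraph. Abbreviations will fool it.
SENTENCE_END = re.compile(r"[.!?]+[\"')\]]*(?:\s|$)")


def sentences_per_paragraph(text: str) -> list[int]:
    """Return a sentence count for each blank-line-separated paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # A paragraph with no end punctuation still counts as one sentence.
    return [len(SENTENCE_END.findall(p)) or 1 for p in paragraphs]


def paragraph_spread(text: str) -> tuple[float, float]:
    """Return (mean, population standard deviation) of sentences per paragraph."""
    counts = sentences_per_paragraph(text)
    if not counts:
        return (0.0, 0.0)
    return (mean(counts), pstdev(counts))
```

A mean between three and five with a standard deviation near zero is the metronome described above. A piece that mixes one-sentence and twelve-sentence paragraphs will show a much larger spread.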
Tell #3: The Hedging Pattern.
AI hedges constantly. "While there are certainly challenges..." "It's worth noting that both sides have valid points..." "This is not without its drawbacks, however..."
Human writers with opinions state them. AI qualifies everything. Every positive statement is followed by a "however." Every strong claim is softened by "it's important to consider." The result reads like a student who's been told they'll lose points for being too opinionated.
This happens because the models are trained (and fine-tuned via RLHF, reinforcement learning from human feedback) to be balanced, helpful, and non-controversial. The human raters who score AI outputs during training penalize strong opinions. So the model learns: hedge everything. Qualify everything. Never commit fully to a position.
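Both the favorite words from Tell #1 and the hedges from Tell #3 are just strings, so you can count them. The following Python sketch does that using only words and phrases quoted in this article. The list is illustrative, not a validated detector vocabulary, and the function names are hypothetical.

```python
import re
from collections import Counter

# Strings quoted in this article. Illustrative only, not a validated list.
TELL_WORDS = ("delve", "tapestry", "landscape", "nuanced", "straightforward")
TELL_PHRASES = (
    "it's important to note",
    "in today's rapidly evolving",
    "it's worth noting",
    "not without its drawbacks",
    "it's important to consider",
)


def count_tells(text: str) -> Counter:
    """Count each tell word or phrase, case-insensitively."""
    # Lowercase and treat curly apostrophes as straight ones.
    lowered = text.lower().replace("\u2019", "'")
    counts = Counter()
    for word in TELL_WORDS:
        # \b anchors whole words; the optional "s" also catches "delves".
        hits = len(re.findall(rf"\b{word}s?\b", lowered))
        if hits:
            counts[word] = hits
    for phrase in TELL_PHRASES:
        hits = lowered.count(phrase)
        if hits:
            counts[phrase] = hits
    return counts


def tells_per_1000_words(text: str) -> float:
    """Total tell hits per 1,000 whitespace-separated words."""
    words = len(text.split())
    return sum(count_tells(text).values()) * 1000 / words if words else 0.0
```

A high count proves nothing on its own. Plenty of humans write "landscape." The number only means something next to a baseline from the same writer or the same genre.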
Tell #4: The List Addiction.
AI loves numbered lists and bullet points. Ask it anything and there's a good chance you'll get "Here are five key points..." or "Consider the following factors..."
Human writers use lists sometimes. AI uses them compulsively. The list becomes a structural crutch, a way to organize information that avoids the harder work of building an argument through flowing prose.
Tell #5: The Compliment Opener.
Ask AI a question and it often responds with a compliment before answering. "Great question!" "That's a really interesting point!" "What a thoughtful observation!"
No human expert responds to a question by first complimenting the question. If you ask a doctor "Is this mole cancerous?" they don't say "What an insightful question about dermatology!" They just answer.
Tell #6: The Absence of Error.
This is the cruelest tell of all.
AI text is grammatically perfect. It doesn't make typos. It doesn't use sentence fragments on purpose. It doesn't break rules for effect. It doesn't misspell words because the author was typing fast. It doesn't have that one sentence that doesn't quite work but the author left it in because the deadline was tomorrow.
Human writing has texture. It has inconsistencies. It has personality expressed through imperfection. A human writer might start three consecutive sentences with "And" because they're building momentum. A human might write a one-word paragraph. A human might use slang, profanity, regional dialect, or inside jokes.
AI text is smooth. Perfectly smooth. And that smoothness is itself a signal.
This brings us to the real problem.
AI detection tools don't work. Not reliably.
OpenAI launched its own AI text classifier in January 2023. By July 2023, they shut it down. The reason: 26% true positive rate (it correctly identified AI text only a quarter of the time) and 9% false positive rate (it incorrectly flagged human text as AI nearly one in ten times).
If the company that built GPT can't detect GPT's output, the third-party tools (Turnitin's AI detector, GPTZero, Originality.ai, and others) are working with even less information. Studies have shown these tools produce false positive rates between 5% and 20%, depending on the writing style, topic, and language.
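Those percentages hide a base-rate problem: what a flag means depends on how much of the pile is actually AI-written. Here's a small sketch of the arithmetic using Bayes' rule. The 26% and 9% rates are the OpenAI figures above. The share of AI-written essays is a made-up input you have to supply, and the function names are hypothetical.

```python
def flag_precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Probability that a flagged text is really AI-written.

    tpr: share of AI texts the detector flags (true positive rate)
    fpr: share of human texts the detector flags (false positive rate)
    ai_share: assumed fraction of all submitted texts that are AI-written
    """
    flagged_ai = tpr * ai_share
    flagged_human = fpr * (1 - ai_share)
    total_flagged = flagged_ai + flagged_human
    return flagged_ai / total_flagged if total_flagged else 0.0


def expected_false_flags(human_texts: int, fpr: float) -> float:
    """Expected number of human-written texts wrongly flagged."""
    return human_texts * fpr
```

With the OpenAI rates and an assumed 10% of essays being AI-written, `flag_precision(0.26, 0.09, 0.10)` comes out to roughly 0.24, so about three out of four flags would land on a human. In a course with 200 human-written essays, `expected_false_flags(200, 0.09)` gives about 18 wrongly flagged students.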
And here's where it gets truly unfair.
The people most likely to be falsely flagged are: non-native English speakers who write in formal, textbook-correct English (because it matches AI's "perfect" style), students who are naturally clear and well-organized writers, and professionals who have been trained to write in a polished, structured format.
In other words, writing well now makes you a suspect.
Students have been accused of cheating, and have failed assignments, because their essays were "too good." Teachers using Turnitin's AI detector have confronted students with false accusations based on a percentage score, even though no cutoff for that score has been scientifically validated. The emotional damage of being told "this is too well-written to be yours" is real and lasting.
Meanwhile, someone who tells ChatGPT to "write this essay but make it sound like a tired college student, add some grammatical errors, use casual language, and include a few wrong facts" will sail through every detector without a flag.
The detection system punishes competence and rewards intentional mediocrity.
So what should you actually look for?
Forget the detectors. Here's what a thoughtful human reader can notice:
1. Does the text have a specific, personal perspective? Or does it read like an encyclopedia entry that could apply to anyone?
2. Does it contain specific examples from the writer's own experience? Not generic examples. Specific ones. "When I worked at a bakery in college..." is human. "Consider a hypothetical scenario where..." is AI.
3. Does it have opinions that could be wrong? Human writers take risks. They say things not everyone agrees with. AI hedges. Always.
4. Does the vocabulary feel natural or slightly elevated? If someone who normally texts "lol yeah that's wild" suddenly writes "the multifaceted implications of this phenomenon are indeed noteworthy," something changed.
5. Does it have texture? Inconsistent paragraph lengths, sentence fragments, moments of humor or frustration, a weird metaphor that doesn't quite land? Human. Smooth, uniform, polished throughout? Probably AI.
6. And yes, count the em dashes. If there are more than two per page in a piece that isn't literary fiction, raise an eyebrow.
The deeper issue here isn't detection. It's trust.
We're entering an era where the authenticity of written text can no longer be assumed. This has happened before. Photoshop made us stop trusting photos in the 2000s. Deepfakes made us stop trusting video in the 2020s. AI writing is doing the same thing to text.
The solution isn't better detection tools. It's a cultural shift in what we value. Writing that contains specific experience, genuine vulnerability, imperfect honesty, and a point of view that only one person could have: that's the writing that AI cannot replicate.
The em dash isn't the enemy. It's a perfectly good punctuation mark that got caught in the crossfire. The real enemy is the assumption that polished equals valuable and that volume equals insight.
Write something only you could write. Make it messy. Make it real. Make it yours.
And maybe skip the em dash. Just for a while.
(Sources: Stanford HAI Research, Georgetown CSET AI Writing Analysis, OpenAI AI Classifier Blog Post, Turnitin AI Detection Studies, Pew Research, Ars Technica, The Atlantic, Chicago Manual of Style, Emily Dickinson Archive at Harvard)