
91 Percent Human: The Shy Girl AI Scandal

When I was nineteen, I wrote a poem about flowers that never blossomed. It was about roots and soil and the things we pull from the ground before they have a chance to grow. It is, to this day, one of my favorite and most personal things I have ever written.

Yesterday, I uploaded it to an AI detection tool called Originality.ai. To my surprise, the overall scan came back 91 percent human. Then I pulled up the individual line scores: words I had written as a teenager, years before a commercial large language model even existed, were flagged at 70 percent probability of being AI-generated.

Somewhere inside an algorithm trained to sort the real from the synthetic, my nineteen-year-old voice did not register as human enough. Nineteen-year-old me probably would have agreed. But nineteen-year-old me wasn’t staking a career on whether an algorithm believed her.

Shy Girl

Mia Ballard wrote a horror novel called Shy Girl and self-published it in February 2025 with no agent and no deal, just the book on Amazon. The horror community found it on TikTok, and it steadily gathered momentum in the way that grassroots literary success still occasionally happens. Almost five thousand readers rated it on Goodreads. Reviewers called it corrosive but addictive, dark and visceral. For a self-published debut, it was doing something rare. It was finding its people.

Hachette, one of the five largest publishers in the world, acquired it. They assigned an editor, designed new covers, scheduled a UK release for November and a US release the following May, sent advance copies to established reviewers, and built an entire marketing campaign around a book that had started as one woman’s solo bet on herself.

Then the internet decided it was written by AI.

A nearly three-hour YouTube video titled I’m pretty sure this book is AI slop accumulated 1.2 million views. A Reddit thread from someone claiming to be a veteran book editor went viral, cataloguing what they identified as hallmarks of AI-generated prose. An AI detection company called Pangram scanned the full text and returned a damning number: 78 percent AI-generated. The New York Times picked up the story. Within 24 hours of the article’s publication, Hachette canceled the book, pulled it from every retailer, and issued a carefully worded statement about their enduring commitment to human creativity.

Ballard sent an email to the Times late on a Thursday night. She said she didn’t use AI. She said an acquaintance she had hired to edit the self-published version had used it without her knowledge. She said she was pursuing legal action and could not say more.

Her name is now permanently attached to this story.

Built on Theft

I went looking for the Pangram report. The CEO, Max Spero, had posted it publicly with a link on X. I started scrolling through the flagged sections, the passages his tool had identified as AI-generated, and I kept noticing the same thing recurring in the text.

A URL was embedded directly in the document over and over: OceanofPDF.com.

If you are not familiar, OceanofPDF is a well-known piracy site that illegally distributes copyrighted books as downloadable PDFs. So let me get this straight. The 78 percent figure that the New York Times reported, the number that appears to have informed the decision to cancel Ballard’s contract and pull her book from every shelf it sat on, was generated from a text that was not the original manuscript and not the editorially vetted Hachette edition. Based on what is visible in the publicly available report, it appears to have been a pirated copy scraped from an illegal file-sharing site.

Nobody has reported this. Not the Times, not the trade press, not a single voice in the discourse that has been raging for days about the sanctity of authentic creative work.

A woman’s career was functionally ended by a test result that, based on the available evidence, appears to have been generated from a stolen copy of her own book. In a story about theft. The evidentiary foundation of one of the most consequential publishing decisions in recent memory appears to have been built on stolen property, and somehow that irony has gone completely unnoticed. Pirated PDFs routinely go through OCR scanning, format conversion, and layers of text processing that can introduce artifacts long before they exist as readable documents. Whether any of that processing altered the text before it was fed into the algorithm is a question that apparently did not occur to anyone involved. The person who generated that number, posted it publicly, and watched it become the centerpiece of a New York Times investigation appears to have been working from someone else’s stolen work.

The entire controversy is about the theft of creative labor. The proof was built on theft of creative labor.

The Drey Dossier was built on non-stolen creative labor. If you want to support, please consider subscribing to the Rough Riders Tier!

Taking Without Asking

Based on publicly available information, Hachette was aware of stolen creative work before any of this happened.

Before the AI allegations ever surfaced, readers discovered that Ballard’s original self-published cover featured an image cropped from a painting called Dreamer by Whyn Lewis, an artist known for emotionally striking portraits of whippets, and the daughter of folk artist Vashti Bunyan. Ballard had reportedly found the image floating around Pinterest and used it without permission. By the time Hachette acquired the book, the theft was already public knowledge.

Their response, according to available accounts, was to commission entirely new covers that carefully recreated the mood and feeling of Lewis’s original painting. Without, as far as anyone can tell, ever involving Lewis herself.

A publisher that publicly positions itself as a fierce defender of creative rights responded to a documented case of artistic theft by working around the artist whose work was stolen. And every entity in this story is operating inside that same collapsed logic, where taking without asking has been normalized so thoroughly that the lines between inspiration and extraction have genuinely blurred. The largest AI companies trained their models on writers who never consented. The institutions publishing those models’ outputs issue statements about human creativity while automating their own acquisitions pipelines. A debut author pulled an image off Pinterest because that is what the internet has taught all of us is normal.

That normalization happened because there are no meaningful consequences at any level.

Or rather, there are consequences. They just land on the artist. Every single time. The companies keep building. The publishers keep publishing. And the person who made the thing, the person at the bottom of the chain with the smallest advance and the least institutional protection, is the one who loses everything.

In this case, that person is a Black woman trying to break into an industry that already has a well-documented history of giving debut authors of color smaller advances, less marketing support, and less of the institutional scaffolding that keeps you safe when something goes wrong. The authors who survive moments like this are the ones the industry was already invested in protecting. Mia Ballard had none of that when this hit. And while that fact is being treated as a footnote in the coverage of this story, it deserves to be closer to the headline.

What Human Sounds Like

There is also a body of research around AI detection and language bias that I think is criminally under-discussed, and most people in this conversation have either never encountered it or don’t realize how directly it applies here.

Researchers at the Max Planck Institute for Human Development recently analyzed hundreds of thousands of video transcriptions and found something unsettling: humans are unconsciously absorbing AI linguistic patterns into their own everyday speech, completely independent of whether they personally use AI to write. The feedback loop is disarmingly simple. Humans write online. AI trains on that writing and develops a recognizable cadence. That cadence saturates the internet through millions of outputs. Humans absorb it into their own voices without realizing it. AI trains on the blended corpus all over again. The styles are converging in both directions simultaneously, which means that detection tools calibrated against an earlier moment in that loop grow less reliable with every passing month, because the very baseline of what human writing sounds like is itself drifting toward what AI output sounds like.

A Stanford study found that AI detection tools flag writing by non-native English speakers as AI-generated at significantly higher rates than writing by native speakers, because the tools were trained on a narrow norm of standard American English. A separate study published in Nature found that the language models underlying these detection tools carry measurable bias against African American English, bias that the researchers described as more negative than any human stereotypes about African Americans ever experimentally recorded.

I am not making a claim about how Mia Ballard writes. I am saying that the tools themselves have demonstrated, peer-reviewed patterns of racial and linguistic bias baked into their architecture. When tools with that track record are used to inform a career-ending decision, and the person on the receiving end is a Black woman who was already navigating an industry with its own well-documented equity problems, the compounding of those failures deserves considerably more scrutiny than it has received. To my knowledge, the methodology has not been publicly disclosed, and there has been no public indication that Ballard was offered a meaningful right of response before the decision was announced. That should bother everyone, regardless of what they believe about the book.

No One Is Coming

On the same day that Ballard’s story broke, the White House released its formal legislative recommendations to Congress for governing artificial intelligence in the United States. I need to talk about this document, because every question this essay has been asking, every failure of accountability, every gap in protection, this document is the answer to all of it. And the answer, built deliberately into its architecture, is: nobody is responsible, nobody will be held accountable, and nobody is coming to help.

The document states that training AI on copyrighted material does not constitute a violation of copyright law, then politely suggests that the courts should be the ones to resolve the question. Let that framing sink in. The writers and artists whose work was scraped to train the models at the center of this entire controversy are told to wait for the courts. In copyright litigation, that means a decade at minimum. The industry gets to keep building. The people it was built on get to keep waiting.

It recommends that Congress should not create any new federal rulemaking body with authority over AI. Aviation has the FAA. Pharmaceuticals have the FDA. Nuclear energy has the NRC. Each of those industries, when it became powerful enough to reshape the lives of ordinary people, was given a dedicated federal body with the authority to oversee it in the public interest. This document asks Congress to ensure, by statute, that AI never receives an equivalent. Not temporarily. Permanently. The fastest-moving and most disruptive technology of our lifetime, and the official recommendation is that no one should be in charge of watching it.

Then there is the preemption provision, which closes every remaining door. States cannot regulate AI development within their borders. Developers cannot be held liable for what third parties do with their models. The federal government cannot act because there is no regulatory body. The states cannot act because they are preempted. There is no floor beneath anyone. Which means California, where Mia Ballard lives, cannot pass a law requiring evidentiary standards before a publisher can cancel a contract based on a detection scan. It cannot require those tools to be independently audited for accuracy or bias. It cannot create a right of appeal for someone whose entire career was ended by an algorithm operating on a pirated PDF. If this framework passes, no state can. That is the point.

This dropped the same day as her story. While the internet was busy debating whether an artist should be punished for allegedly using AI, the legislative architecture ensuring that no artist would ever be meaningfully protected from the institutions that profit from it was being handed to Congress without fanfare. And if you are wondering why I spent this entire essay walking through one woman’s story in this much detail, it is because this is what it looks like when there is no protection. This is the cost, in real time, borne by a real person. And the proposal on Congress’s desk right now is designed to make sure it stays that way.

The Garden

Which brings me back to the diagnosis I received. Ninety-one percent human. Nine percent something else. Nine percent that apparently belongs to a machine that learned how to write by reading people like me, and now gets to turn around and claim that I sound like it.

My poem was about flowers that never blossomed, and soil that held things in place long after they stopped growing. I wrote it before any of this existed. Before the tools, before the discourse, before anyone had to prove that the thing they made came from them. And yet somewhere in that algorithm, the most private and unfinished part of my nineteen-year-old self got sorted into a category that did not even exist when I put those words on paper.

Mia Ballard sent an email to a reporter at the New York Times late on a Thursday night. She said her mental health was at an all-time low. She said her name had been ruined over something she did not do. Whether or not you believe her, consider what it means that we keep ending these conversations at the artist, and that the artist is almost always the person with the least power and the least protection in the entire chain. The crowd does the investigative work that the institution never bothered to do. The institution collects the press release about its commitment to human creativity. And the artist absorbs everything else.

The regulatory window on AI is closing. Once it does, flowers will keep getting diagnosed as weeds. And no one will be coming to replant them.

