AI-Assisted Plagiarism: the ugly elephant in the BG3 fandom
In my last post, I jested about an anomaly and hinted at copious AI use in certain long fics that I (and many others, in fact) noticed in the BG3 fanfic community with the support of data comparison. The post, unfortunately, didn't carry the right message to the target audience. After all, AI-generated text isn't even against AO3 ToS; there's nothing much you can do.
Now normally, I'd take the L. Hide in my pillows and cry over the death of humor like a bitter old hag. Read some cute, comfort fics, maybe…
"her hair… fell across her shoulders like liquid shadow."
Hmm, liquid. Shadow? You mean like, liquid eyeshadow? Wait, …hair? How does it work? Confused, I dig through the drawer for my "A Toilet Paper Guide to AI detection" (see appendix):
Lack of true contextual understanding manifested in senseless metaphors… Not the whimsical ones, but the ones that seem "normal" but will fall into an uncanny valley if you chew on them for more than a second…
Oh sh*t. But the plot (broadly) makes sense, doesn't it? I mean, sure, it has some paragraphs with low information density that are repeating themselves, like you'd see in standard AI-generated content…
Low information density… the prompt input is almost always much shorter than the output, and to bridge this gap, AI will have to repeat the safe default of given anchor points, AKA, the prompt…
Who's the author? Oh, hey! Aren't you one of the 'prodigies' who magically whipped out 150k+ words in two months and love repeating fancy verbs despite Writing 101? So I go read a few more pages of their other finished fic for old time's sake. BANG! Lightning strikes…
Way too many similarities to another fic I know closely. Putting on my tinfoil hat, I search for the other 'prodigies'.
One of them is gone. Deleted along with the account.
Interest piqued, more digging into the rabbit hole: the deleted fic was involved in a plagiarism case some time ago. Coincidence? I skim through the downloaded version again…
It finally explains where the inconsistency comes from.
Toilet Paper Guide: Inconsistencies in writing voice, character portrayal, settings, and overall tone that are not intentional stylistic choicesβ¦
Tinfoil hat off. Time to spell it out loud and clear:
The AI abusers are here. And they are stealing from you.
Plagiarism: the big "P" everyone hates but no one talks about.
Much like AI, plagiarism is another thing every writer vehemently hates but is reluctant to discuss: the stinky, ugly elephant in the room. But as much as I would love to believe in the integrity of human nature and deny its existence altogether, it is a matter of probability, not possibility: in any reasonably sized fandom, plagiarism is bound to happen. By that, I mean more than just accidental plagiarism as an honest mistake.
So why do writers rarely discuss it? Why is calling out plagiarism often shunned by the community instead?
Here's a simple theory: we all harbor, to some extent, an irrational fear of being falsely accused of plagiarism. After all, there are only so many tropes. What if?
To overcome this fear, I urge you to think about another question instead: If you select 15 prompts out of all 30 from Kinktober, how many unique combinations can you make?
I will save you the headache of Probability 101 with the answer straight away:
155 million (155,117,520, or 30C15) unique combinations.
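If you'd rather check that number than take it on faith, here is a minimal sketch. It assumes the Kinktober setup described above (pick 15 of 30 prompts, order ignored, no repeats), so the count is the binomial coefficient 30! / (15! × 15!). The helper name `prompt_combinations` is made up for this illustration.

```python
import math


def prompt_combinations(total: int = 30, chosen: int = 15) -> int:
    """Count the ways to pick `chosen` prompts out of `total`, ignoring order."""
    # math.comb computes the binomial coefficient total! / (chosen! * (total - chosen)!)
    return math.comb(total, chosen)


print(f"{prompt_combinations():,}")  # prints 155,117,520
```

The same function shows how fast the space grows: even a small increase in the number of available prompts multiplies the count many times over.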
Of course, in real life, not all kinks are loved equally, yada yada… But, well, you get the idea. In a long fic, the chances of recurring resemblance in dialogue, scenes, choreographies, and tropes arising by pure coincidence are infinitesimal. What you consider clichés are much more unique than you think.
"But I don't see it!": AI-assisted plagiarism in action.
Then comes the second question: if plagiarism exists, why don't I see it? Why, despite being a competent writer, do I not see the similarity, even with evidence placed side by side?
42. Wait, no, wrong handbook.
The answer is pretty cliché and probably the only competitor to plagiarism itself if there were a most-hated-by-writers contest:
AI-assisted plagiarizers, who chop apart original content and stitch it back together with additional prompts. And they do it in ways so grotesque that the outcome will be barely recognizable.
Before you frown at my suggestion, take a look at the following demonstration:
Snippet A:
I freeze. There he is, mere inches from me, so close that the cool breeze of his breath brushes my skin. My heart drums as his slender fingers reach for me, peeling away dried blood from the madness of last night that still coats my face. And when his sharp, crimson eyes lock onto mine and those pale, tender lips part, I nearly faint at his whisper:
“Oompa Loompa, you are mine.”
Snippet B:
Tav stilled as if the world itself had ceased to move. He was near, too near, his breath brushing her skin like frost, and her heart drummed a muted rhythm she could not master. His fingers found her face with quiet precision, brushing aside the brittle traces of last night’s ruin, as though uncovering something only he had the right to see. It was not kindness, but possession; not mercy, but a claim written in touch.
His eyes, crimson, unrelenting, fastened upon hers, and for a moment she felt caught within their hold, unable to turn away. When his lips parted, pale and careful, the hush between them tightened, and the sound of her own pulse seemed deafening in her ears.
The whisper that followed bound her in place, soft as a vow yet heavy as iron:
“You are… mine.”
Is snippet B a copycat of snippet A? Hmm. It's iffy. There’s some similarity in choreography and word choices, but there are so many differences! It's past tense, not present tense; close-third, not first-person. And there is no... Oompa Loompa.
I wrote snippet A myself. I prompted an AI model to write snippet B by asking it to rewrite A in third-person past tense, using the name 'Tav' and expanding the paragraph in a dark, atmospheric style.
Now, do you see the plagiarism?
To clarify, I am NOT suggesting that writings with snippet B's style are AI, nor claiming that any two snippets with this level of similarity are plagiarism.
My only goal is to explain why you might have stared into an AI-assisted case of plagiarism and said, "I don't see it."
How to spot it? A Toilet Paper Guide to AI Detection. [Appendix]
Things're looking bleak, aren't they? The doomsayer is at it again, first claiming AI-assisted plagiarism exists, and next nullifying your past experiences without offering any solutions.
As obnoxiously 'tHiS iS AI' as it may sound, you actually have a better chance of detecting AIGC than plagiarism as a casual reader, and here is how:
First, take out a piece of toilet paper, and write down every easy AI telltale sign you have learned: the em-dash, the 'delve', the bullet points, the AI purple prose…
Done? Flush it away. You won't need it.
Those methods are not only unreliable and easy to bypass by tweaking prompts, but they also incur excessive false positives that harm genuine writers. All approaches I am about to introduce are grounded in the inherent flaws of LLMs as I understand them, and are nearly impossible to fix by prompt engineering alone. While they are not foolproof and are meant for general reference only, I find them helpful as guides that rarely (if ever) conjure false positives.
Inconsistencies in writing voice, character portrayal, settings, and overall tone that are not intentional stylistic choices. Admittedly, it is hard to identify beyond a "feeling," and it probably won't happen in a fully AI-generated text fiction.
But here's the catch: a complete AIGC fic tastes like regurgitated vomit after an all-night party. And the only way to hide the stench is by either increasing the prompt-output ratio (write more yourself!) or feeding it other people's work for "inspiration":
One minute, Tav is "a timid virgin", the next "an eager sl*t who can't wait to take the D?" One minute, they are on a carpet, and the next it becomes a marble floor with zero explanation? That's an inconsistency coming from AI plagiarism.
Low information density, but note the difference between verbosity as a choice versus outright AI gibberish.
AI-assisted plagiarism is rarely a one-to-one copy, and where the abusers cannot steal, they "write" with AI, by prompt-based content generation. Here comes the pitfall: the input is almost always much shorter than the output, and to bridge this gap, AI will have to repeat the safe default of given anchor points, AKA, the prompt.
Let's face it, AI abusers don't write 500 words themselves for a 600-word output. If you find yourself reading the same paragraph phrased differently for the third time in one chapter, and it's not from a college student struggling to meet minimum word counts, keep an eye out.
Lack of true contextual understanding manifested in senseless metaphors. Even the most avid AI zealot will agree: current AI is far from real intelligence. Rather than understanding words (or tokens, really), it creates an illusion of comprehension through probability prediction.
And in the domain of creative text generation, what you'll notice in the output is a dubious amount of unusual metaphors. Not the whimsical ones (those are surprisingly very human), but the ones that seem "normal" but will fall into an uncanny valley if you chew on them for more than a second.
The words slipped from Astarion’s mouth in a breathless groan, his voice thick with restraint, awe, and starvation.
(What do you mean by 'thick with starvation' being a weird metaphor? I am thick! I … starve! I groan, and I sound ... thick with starvation.)
Overrepresentation of fancy words. Wait a minute, didn't I just say you shouldn't judge based on the presence of certain words?
The difference is between a single occurrence and repeated offenses. It happens in AIGC, as I understand, because of the feedback loop in its reward model training stage, where human labelers' preferences are reinforced repeatedly, leading to a lexical overrepresentation in the final output.
The last point is the lowest-hanging fruit and the easiest to quantify, as I demonstrated in this SATIRE post. So far, 'curl', 'burn', 'tremble', 'whisper', and 'murmur' are the most common culprits. Again, not foolproof, only effective in longer contexts, but good enough as a signal to watch out for, and unlikely to happen in human writing.
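For the curious, here is one way the "easiest to quantify" part could be sketched. This is not the method from the earlier post, which isn't shown here; it is a rough illustration under loud assumptions: the stem list is lifted from the culprits named above, words are matched by prefix (so "curly" would count toward 'curl'), and the function name `rate_per_10k` is invented for this example. No threshold is implied; the numbers only mean something when compared across long texts.

```python
import re
from collections import Counter

# Stems taken from the post's list of common culprits. Prefix matching is a
# crude assumption: it catches "trembled" and "trembling", but also "curly".
STEMS = ("curl", "burn", "trembl", "whisper", "murmur")
PATTERN = re.compile(r"\b(" + "|".join(STEMS) + r")\w*", re.IGNORECASE)


def rate_per_10k(text: str) -> dict[str, float]:
    """Return how often each stem appears per 10,000 words of `text`."""
    total_words = len(re.findall(r"\b\w+\b", text))
    if total_words == 0:
        return {stem: 0.0 for stem in STEMS}
    # Count each match by the stem it starts with, case-insensitively.
    counts = Counter(m.group(1).lower() for m in PATTERN.finditer(text))
    return {stem: counts[stem] * 10_000 / total_words for stem in STEMS}
```

Normalizing per 10,000 words is a design choice so that a 150k-word fic and a 5k-word one-shot can be put on the same scale; a single occurrence in a short text will still produce a large, meaningless rate, which is why this only works in longer contexts.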
I strongly advise against searching for the aforementioned fics based on the filters from my previous post How To Gain Traction On AO3 FAST /s. But if you happen to know or guess correctly, you'll see they check all the boxes. And if you are a writer, you could probably go through your fics with similar tags to the offenders and see if there is overlap. Who knows.
I'll admit, anger prompted me to make this post, as any writer would rightfully be against blatant AI-assisted plagiarism. I don’t seek your sympathy, nor your "picking my side" should a wave of self-victimizing deflection posts flood every BG3 community.
But they will take your taciturnity for ignorance and your tolerance for acquiescence. They could steal from you, me, and anyone in the fandom.
If you have, even just for a tiny moment, echoed with the sentiment, seen through the insanity, or thought "that actually makes sense"…
Privately to your close confidants, to your trusted circles, to anyone equally outraged by AI use and AI-assisted plagiarism. And if you're feeling bold, reblog it. Spread the message. Make it known.
Since channeling energy from big names seems to be what the cool kids are doing these days, I will end the post with a quote: