i am full of [???] and gay @seageweapon - Tumblr Blog

🚨BREAKING: OpenAI published a paper proving that ChatGPT will always make things up.

Not sometimes. Not until the next update. Always. They proved it with math.

Even with perfect training data and unlimited computing power, AI models will still confidently tell you things that are completely false. This isn't a bug they're working on. It's baked into how these systems work at a fundamental level.

And their own numbers are brutal. On PersonQA, OpenAI's own benchmark of questions about people, the o1 reasoning model hallucinates 16% of the time. Their newer o3 model? 33%. Their newest o4-mini? 48%. Nearly half of what their most recent model answers on that test is fabricated. The "smarter" models are actually getting worse at telling the truth.

Here's why it can't be fixed. Language models work by predicting the next word based on probability. When they hit something uncertain, they don't pause. They don't flag it. They guess. And they guess with complete confidence, because that's exactly what they were trained to do.
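A minimal sketch of that "guess with confidence" step, assuming a toy model that scores only three candidate words. The scores and the `pick_next_word` helper are made up for illustration and are not taken from any real model. The point it shows: the top word gets emitted whether its probability is near 1 or barely above its rivals, and nothing in the output says which case it was.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def pick_next_word(logits):
    """Greedy decoding: always return the most probable word.

    Note there is no branch for 'not sure enough'; the function
    answers even when the distribution is almost flat.
    """
    probs = softmax(logits)
    word = max(probs, key=probs.get)
    return word, probs[word]

# Hypothetical scores: one case where the model 'knows', one where it does not.
confident = {"Paris": 9.0, "Lyon": 2.0, "Rome": 1.0}
uncertain = {"1947": 1.1, "1952": 1.0, "1938": 0.9}

for name, logits in (("confident", confident), ("uncertain", uncertain)):
    word, p = pick_next_word(logits)
    print(f"{name}: says {word!r} (probability {p:.2f})")
```

Both calls return a single word. Only the probability, which the reader of the generated text never sees, tells the two situations apart.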

The researchers looked at the 10 biggest AI benchmarks used to measure how good these models are. 9 out of 10 give the same score for saying "I don't know" as for giving a completely wrong answer: zero points. The entire testing system literally punishes honesty and rewards guessing.

So the AI learned the optimal strategy: always guess. Never admit uncertainty. Sound confident even when you're making it up.
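The incentive described above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: 1 point for a correct answer, 0 for "I don't know", and a `wrong_penalty` parameter that is hypothetical (it stands in for any grading scheme that subtracts points for wrong answers, not for a specific benchmark's rule).

```python
def expected_score(p_correct, guess, wrong_penalty=0.0):
    """Expected points for one question.

    p_correct     -- chance the model's guess is right
    guess         -- False means the model answers "I don't know" (0 points)
    wrong_penalty -- points subtracted for a wrong answer (0 = binary grading)
    """
    if not guess:
        return 0.0
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

for p in (0.9, 0.5, 0.1):
    binary = expected_score(p, guess=True)
    penalized = expected_score(p, guess=True, wrong_penalty=1.0)
    abstain = expected_score(p, guess=False)
    print(f"p={p}: guess (binary)={binary:+.2f}, "
          f"guess (penalized)={penalized:+.2f}, abstain={abstain:+.2f}")
```

Under binary grading, guessing never scores below abstaining, even at a 10% chance of being right. With a penalty of 1 point per wrong answer, guessing only pays off when the model is more than 50% sure.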

OpenAI's proposed fix? Have ChatGPT say "I don't know" when it's unsure. Their own math shows this would mean roughly 30% of your questions get no answer. Imagine asking ChatGPT something three times out of ten and getting "I'm not confident enough to respond." Users would leave overnight. So the fix exists, but it would kill the product.

This isn't just OpenAI's problem. DeepMind and Tsinghua University independently reached the same conclusion. Three of the world's top AI labs, working separately, all agree: this is permanent.

Every time ChatGPT gives you an answer, ask yourself: is this real, or is it just a confident guess?

trickstertime

The easiest way to conceptualise why LLMs are always lying is to think of them as autocorrect on steroids. Ya know how you can mash the middle suggested word on your phone keyboard and it sometimes sounds like something you would say? That's all these LLM 'AI' things are at the moment. The one in the autocorrect on your phone has been trained just on how you type on your phone; the LLMs they're talking about have been trained on all the text available online, hence the 'large' in 'large language model'. It's impressive, but it's still just guessing what the next word will be. I wouldn't even go so far as to say it's lying or guessing; it's not thinking or anything. They're not even like a mouse, or an insect. They're just plinko machines. The ball plinks down through a multidimensional model and the pins it hits are words.

They can do fancy tricks of stacking them, so that one guessing machine will look at the outputs of two or more guessing machines and then guess which one will probably be more correct. They call these 'thinking models'. But in the end it's still just autocorrect on steroids.

LLMs are always lying in the way that a plinko machine will sometimes end up with the ball taking a weird route through the pins. It's not that it's wrong sometimes, it's that we notice that the line of pins the ball happens to hit, and the words those pins line up with, doesn't line up with things we know to be true. Then we turn around and say the plinko is lying, like it has any agency.

trickstertime

To further expand on this, the pins in a plinko are actually a pretty good analogy for how LLMs understand what they're doing. It's not words to an LLM, it's points in space. The points are labeled or numbered, and it knows that when it hits point 76362983, point 668827 usually follows; after that it can choose 444449855 or 873893919, and then go to 66588873987. That's it deciding if it says 'it's a safe mushroom' or 'it's an unsafe mushroom'. Obviously it's more complex and nuanced than this, but it shows how an LLM fundamentally cannot know what it's saying to us at a base level. Sure, we can adjust the positions of the pins so that the ball tends to fall in a way that makes the output happen to line up with sentences we know are true, but at its core, it's just a plinko we've made so complex that some of us trick ourselves into thinking it's fully alive.
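Here is a rough sketch of the plinko picture, using the point numbers from the post. Everything else is an assumption made for illustration: the word labels attached to each number, the 60/40 weights at the fork, and the `drop_ball` and `to_text` helpers. A real LLM conditions on the whole preceding text rather than only the last point, so treat this as the analogy in code, not as how any actual model is built.

```python
import random

# Hypothetical labels: the machine below only ever handles the numbers.
LABELS = {
    76362983: "this mushroom",
    668827: "is",
    444449855: "safe",
    873893919: "unsafe",
    66588873987: "to eat",
}

# Each pin lists the pins the ball can fall to next, with made-up weights.
PINS = {
    76362983: [(668827, 1.0)],
    668827: [(444449855, 0.6), (873893919, 0.4)],
    444449855: [(66588873987, 1.0)],
    873893919: [(66588873987, 1.0)],
    66588873987: [],  # bottom of the board
}

def drop_ball(start, rng):
    """Follow weighted random transitions until a pin has no successors."""
    path = [start]
    while PINS[path[-1]]:
        next_pins, weights = zip(*PINS[path[-1]])
        path.append(rng.choices(next_pins, weights=weights, k=1)[0])
    return path

def to_text(path):
    """Only at this step do the numbers get turned into words for humans."""
    return " ".join(LABELS[p] for p in path)

rng = random.Random(0)
for _ in range(3):
    print(to_text(drop_ball(76362983, rng)))
```

Nothing in `drop_ball` checks whether the mushroom is actually safe. Changing the weights at the fork is the "adjusting the pins" step: it changes how often each sentence comes out, not whether the machine knows anything about mushrooms.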

"Whohh, this horse plinko said it's in love with me."

easel-eisel

favourite rpg trope is the merchants in incredibly hostile environments. we are at the evil curse mountain and youre just selling me items normal style

cheeky-orchid

Essential worker during covid

wonderful-emoji

#claw in lovable claw :)

haaarpb

havent drawn a self portrait in a while

c-53

I have a bisexual guppy and its funny as hell to watch because it seems like he’s only bi out of desperation. Like all of the female guppies are unimpressed by him, and dont accept his mating displays, and every time he fails, he goes over to a SPECIFIC male guppy (the prettiest male guppy in the tank) like PLEASE PLEASE PLEASE PLEASE and that male guppy always lets him????

domnika-deactivated20250923

i would read that fanfic ngl

c-53

About my fish??

graceful-not

HELP?????

everyfandom-girl

YEAH THIS IS SCIENTIFICALLY RECORDED

WOMEN LOVE GAYBOY

mr-deep-downer

doumeki

byjove

I do not agree with veganism as a moral standard. If it is your personal moral stance, that is fine. If you think humans eating meat is inherently immoral, I don’t want to deal with you, you’re hopeless. Vegan ideology behaves more like a sect of evangelical Christianity than a dietary choice.

literallycrashingoutrn

Veganism is better for the environment, but claiming that it's a morally superior choice ignores cultural and economic factors that make people eat animal products.

byjove

It is not inherently better for the environment. That is the thing. When you begin trying to explain that local, sustainably sourced animal protein is better for the environment than imported plant proteins that are farmed 3,500 miles away using slave labor, they start tuning you out. Down is better for the environment than polyester stuffing, leather is better for the environment than pleather. We should work on making animal agricultural practices more sustainable instead of trying to shame everyone into eating plant products that are also farmed unethically and unsustainably.

meerschweinchen-archive

your twenties are Also about discovering that you’re not a bad person in all the ways you believed you were but you’re a bad person in completely new and exciting ways

nonetoon

Everyone meet just a normal goose :)

nonetoon

Glad you guys like this totally normal goose!

nonetoon

I am making everyone remember normal goose

nonetoon

Well, I can not find the original separate post of this so I’m just going to tack these on here

elodieunderglass

Thank you @glitterdustcyclops !

K

U

N

G

P

O

W

P

U

silly-gobbledygook-deactivated2

S

living-shifting-oil

S

asu-xu-namir

Y

star-mom-selkie

ouroborosorder

Shoutouts to the time I had a severe fever and took benadryl and wanted to listen to feel good inc but couldn’t remember the name. I think I was crying over this

currentlycryingaboutlancelot

realized i have started texting like mr darcy

plasmalogical

g4yr4t

I love Doris and she is so cute but also she looks so much like Teddy Roosevelt sometimes that I can't help but laugh

g4yr4t

all I'm saying is that no one has ever seen them in the same room together

tesghosterone

it's been said a thousand times but "male loneliness epidemic" is just our entire generation's loneliness PANdemic. it's just the crisis of capitalism. almost all young people these days are lonely and depressed because of the crisis and decay of the system we live under and tbh the centering of "male loneliness" in this discussion is embarrassing but also a really understandable consequence of sexism & this current stage of culture war. "Oh won't someone think of the young men" oh won't someone think about class consciousness lol

viralfrog

dragongirlbunny

gokusclit

im obsessed

gallusrostromegalus

A cat is a machine that turns proteins into violence.

starathsbunker

#Helios was declawed by his former owners so he doesn't just slap things he dislikes like most cats#he really only feels confident in hissing at them#Especially because a lot of the things he doesn't like are bugs and those are sharp sometimes :(#Selene has figured this out and now when she hears him hiss she sprints over to kill the fuck out of the bug#Helios has learned she will do this so he'll hiss at stuff louder and louder until she hears him#A nervous old man and his emotional support homicidal maniac

tags by @gallusrostromegalus

I couldn't reblog without the tags because the context is hilarious

gallusrostromegalus

A Nervous Old Man (right) and his Emotional Support Violence Machine (Left)

Yes, he is more than twice her size. Yes, he is five times her age. Yes, he cries like a big baby until she kills Unacceptable Scary Things (earwigs) for him.
