Chat GPT appears to hallucinate or outright lie about everything

Buttflapper@lemmy.world · 9 months ago

Chat GPT appears to hallucinate or outright lie about everything

snooggums@midwest.social · edit-2 9 months ago

All AI share a central design flaw of being what people think they should return based on weighted averages of ‘what people are saying’ with a little randomization to spice things up. They are not designed to return factual information because they are not actually intelligent so they don’t know fact from fiction.

ChatGPT is designed to ‘chat’ with you like a real person, who happens to be agreeable so you will keep chatting with it. Using it for any kind of fact based searching is the opposite of what it is designed to do.

quafeinum@lemmy.world · 9 months ago

It’s literally just Markov chains with extra steps

JackGreenEarth@lemm.ee · 9 months ago

Not all AIs, since many AIs (maybe even most) are not LLMs. But for LLMs, you’re right. Minor nitpick.

Zerlyna@lemmy.world · 9 months ago

Yes!!! It doesn’t know Trump has been convicted and told me that even when I give it sources, it won’t upload to a central database for privacy reasons. 🤷‍♀️

leftzero@lemmynsfw.com · 9 months ago

LLM models can’t be updated (i.e., learn), they have to be retrained from scratch… and that can’t be done because all sources of new information are polluted enough with AI to cause model collapse.

So they’re stuck with outdated information, or, if they are being retrained, they get dumber and crazier with each iteration due to the amount of LLM generated crap on the training data.

Ogmios@sh.itjust.works · 9 months ago

I wonder if you can get it to day anything bad about any specific person. Might just be that they nuked the ability entirely to avoid lawsuits.

Zerlyna@lemmy.world · edit-2 9 months ago

Once I give it links to what it accepts as “reputable sources” (npr, ap, etc.) it concedes politely. But I’m gonna try it now lol.

amelia@feddit.org · 9 months ago

based on weighted averages of ‘what people are saying’ with a little randomization to spice things up

That is massively oversimplified and not really how neural networks work. Training a neural network is not just calculating averages. It adjusts a very complex network of nodes in such a way that certain input generates certain output. It is entirely possible that during that training process, abstract mechanisms like logic get trained into the system as well, because a good NN can produce meaningful output even on input that is unlike anything it has ever seen before. Arguably that is the case with ChatGPT as well. It has been proven to be able to solve maths/calculating tasks it has never seen before in its training data. Give it a poem that you wrote yourself and have it write an analysis and interpretation - it will do it and it will probably be very good. I really don’t subscribe to this “statistical parrot” narrative that many people seem to believe. Just because it’s not good at the same tasks that humans are good at doesn’t mean it’s not intelligent. Of course it is different from a human brain, so differences in capabilities are to be expected. It has no idea of the physical world, it is not trained to tell truth from lies. Of course it’s not good at these things. That doesn’t mean it’s crap or “not intelligent”. You don’t call a person “not intelligent” just because they’re bad at specific tasks or don’t know some facts. There’s certainly room for improvement with these LLMs, but they’ve only been around in a really usable state for like 2 years or so. Have some patience and in the meantime use it for all the wonderful stuff it’s capable of.

SlopppyEngineer@lemmy.world · 9 months ago

It does remind me of that recent Joe Scott video about the split brain. One part of the brain would do something and the other part of the brain that didn’t get the info because of the split just makes up some semi-plausible answer. It’s like one part of the brain does work at least partially like an LLM.

It’s more like our brain is like a corporation, with a spokesperson, a president and vice president and a number of departments that with semi-independently. Having an LLM is like having only the spokesperson and not the rest of the work force in that building that makes up an AGI.

snooggums@midwest.social · 9 months ago

An LLM is like having the receptionist provide detailed information from what they have heard other people talk about in the lobby.

snooggums@midwest.social · 9 months ago

An LLM is like having the receptionist provide detailed information from what they have heard other people talk about in the lobby.

SuperSleuth@lemm.ee · 9 months ago

There’s no way they used Gemini and decided it’s better than GPT.

I asked Gemini: “Why can great apes eat raw meat but it’s not advised for humans?”. It said because they have a “stronger stomach acid”. I then asked “what stomach acid is stronger than HCL and which ones do apes use?”. And was met with the response: “Apes do not produce or utilize acids in the way humans do for chemical processes.”.

So I did some research and apes actually have almost neutral stomach acid and mainly rely on enzymes. Absolutely not trustworthy.

Daemon Silverstein@thelemmy.club · 9 months ago

use

I guess Gemini took the word “use” literally. Maybe if the word “have” would be used, it’d change the output (or, even better, “and which ones do apes’ stomachs have?” as “have” could imply ownership when “apes” are the subject for the verb).

gravitas_deficiency@sh.itjust.works · 9 months ago

The “i” in LLM stands for intelligence

Kazumara@discuss.tchncs.de · 9 months ago

It did not simply analyze the best type of graphics card for the situation.

Yes it certainly didn’t: It’s a large language model, not some sort of knowledge engine. It can’t analyze anything, it only generates likely text strings. I think this is still fundamentally misunderstood widely.

leftzero@lemmynsfw.com · 9 months ago

I think this is still fundamentally misunderstood widely.

The fact that it’s being sold as artificial intelligence instead of autocomplete doesn’t help.

Or Google and Microsoft trying to sell it as a replacement for search engines.

It’s malicious misinformation all the way down.

Christer Enfors@lemm.ee · 9 months ago

Agreed. As far as I know, there is no actual artificial intelligence yet, only simulated intelligence.

Oka@sopuli.xyz · 9 months ago

If I narrow down the scope, or ask the same question a different way, there’s a good chance I reach the answer I’m looking for.

https://chatgpt.com/share/ca367284-2e67-40bd-bff5-2e1e629fd3c0

aberrate_junior_beatnik@lemmy.world · 9 months ago

ChatGPT does not “hallucinate” or “lie”. It does not perceive, so it can’t hallucinate. It has no intent, so it can’t lie. It generates text without any regard to whether said text is true or false.

GetOffMyLan@programming.dev · 9 months ago

Hallucinating is the term for when ai generate incorrect information.

aberrate_junior_beatnik@lemmy.world · 9 months ago

I know, but it’s a ridiculous term. It’s so bad it must have been invented or chosen to mislead and make people think it has a mind, which seems to have been successful, as evidenced by the OP

GetOffMyLan@programming.dev · edit-2 9 months ago

At no point does OP imply it can actually think and as far as I can see they only use the term once and use it correctly.

If you are talking about the use of “lie” that’s just a simplification of explaining it creates false information.

From the context there is nothing that implies OP thinks it has a real mind.

You’re essentially arguing semantics even though it’s perfectly clear what they mean.

aberrate_junior_beatnik@lemmy.world · 9 months ago

OP clearly expects LLMs to exhibit mind-like behaviors. Lying absolutely implies agency, but even if you don’t agree, OP is confused that

It did not simply analyze the best type of graphics card for the situation

The whole point of the post is that OP is upset that LLMs are generating falsehoods and parroting input back into its output. No one with a basic understanding of LLMs would be surprised by this. If someone said their phone’s autocorrect was “lying”, you’d be correct in assuming they didn’t understand the basics of what autocorrect is, and would be completely justified in pointing out that that’s nonsense.

cheddar@programming.dev · 9 months ago

It’s incorrect to ask chatgpt such questions in the first place. I thought we’ve figured that out 18 or so months ago.

ABCDE@lemmy.world · 9 months ago

Why? It actually answered the question properly, just not to the OP’s satisfaction.

ramirezmike@programming.dev · 9 months ago

because it could have just as easily confidentiality said something incorrect. You only know it’s correct by going through the process of verifying it yourself, which is why it doesn’t make sense to ask it anything like this in the first place.

ABCDE@lemmy.world · 9 months ago

I mean… I guess? But the question was answered correctly, I was playing Beat Saber on my 1060 with my Vive and Quest 2.

ramirezmike@programming.dev · 9 months ago

It doesn’t matter that it was correct. There isn’t anything that verifies what it’s saying, which is why it’s not recommended to ask it questions like that. You’re taking a risk if you’re counting on the information it gives you.

ABCDE@lemmy.world · 9 months ago

Yes and no. 1060 is fine for basic VR stuff. I used my Vive and Quest 2 on one.

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 🏆@yiffit.net · edit-2 9 months ago

Imagine text gen AI as just a big hat filled with slips of paper and when you ask it for something, it’s just grabbing random shit out of the hat and arranging it so it looks like a normal sentence.

Even if you filled it with only good information, it will still cross those things together to form an entirely new and novel response, which would invariably be wrong as it mixes info about multiple subjects together even if all the information individually was technically accurate.

They are not intelligent. They aren’t even better than similar systems that existed before LLMs!

WolfLink@sh.itjust.works · 9 months ago

It’s actually not really wrong. There are many VR games you can get away with low specs for.

Yes when you suggested a 3070 it just took that and rolled with it.

It’s basically advanced autocomplete, so when you suggest a 3070 it thinks the best answer should probably use a 3070. It’s not good at knowing when to say “no”.

Interesting it did know to come up with a newer AMD card to match the 3070, as well as increasing the other specs to more modern values.

Petter1@lemm.ee · 9 months ago

For such questions you need to use a LLM that can search the web and summarise the top results in good quality and shows what sources are used for which parts of the answer. Something like copilot in bing.

emmy67@lemmy.world · 9 months ago

Or, the words “i don’t know” would work

SomeGuy69@lemmy.world · 9 months ago

People would move to the competition LLM that does always provide a solution, even if it’s wrong more often. People are often not as logical and smart as you wish.

Petter1@lemm.ee · 9 months ago

I don’t think LLM can do that very well, since there are very little people on the internet admitting that they don’t know about anything 🥸😂

Funny thing is, that the part of the brain used for talking makes things up on the fly as well 😁 there is great video from Joe about this topic, where he shows experiments done to people where the two brain sides were split.

https://youtu.be/_TYuTid9a6k?si=PylqvQ24QHWw_6PN

emmy67@lemmy.world · 9 months ago

Funny thing is, that the part of the brain used for talking makes things up on the fly as well 😁 there is great video from Joe about this topic, where he shows experiments done to people where the two brain sides were split.

Having watched the video. I can confidently say you’re wrong about this and so is Joe. If you want an explanation though let me know.

Petter1@lemm.ee · edit-2 9 months ago

Yes please! Hope you commented that on Joe‘s Video so he can correct himself in a coming video

r_se_random@sh.itjust.works · 9 months ago

The copilot app doesn’t seem to be any better.

r_se_random@sh.itjust.works · 9 months ago

Petter1@lemm.ee · 9 months ago

At least it gives you links to validate the info it serves you I’d say. LLM can do nothing about bad search results, the search algorithm works a bit different and is its own machine learning process.

But I just recognised, that chatGPT as well can search the web, if you prompt in the right way, and then it will give you the sources as well

r_se_random@sh.itjust.works · 9 months ago

But that also discredits me from ever asking an LLM a question which I don’t already know the answer to. If I have to go through the links to get my info, we already have search engines for it.

The entire point of LLM with Web search was to summarise the info correctly which I have seen them fail at, continuously and hilariously.

Petter1@lemm.ee · 9 months ago

Yea, but I prefer just writing what I am thinking instead of keywords. And more often than not, it feels like I get to answer more quickly as if I just used a search engine. But of course, I bet there are multiple people, that find stuff faster on web search engines, than me with LLM, it is just for me the faster way to find what I search.

elxeno@lemm.ee · 9 months ago

Did you try putting “do not hallucinate” in your prompts? Apparently it works.

webghost0101@sopuli.xyz · 9 months ago

This is an issue with all models, also the paid ones and its actually much worse then in the example where you at least expressed not being happy with the initial result.

My biggest road block with AI is that i ask a minor clarifying question. “Why did you do this in that way?” Expecting a genuine answer and being met with “i am so sorry here is some rubbish instead. “

My guess is this has to do with the fact that llms cannot actually reason so they also cannot provide honest clarification about their own steps, at best they can observe there own output and generate a possible explanation to it. That would actually be good enough for me but instead it collapses into a pattern where any questioning is labeled as critique with logical follow up for its assistant program is to apologize and try again.

Tellore@lemmy.world · 9 months ago

I’ve also had similar problem, but the trick is if you ask it for clarifications without it sounding like you imply them wrong, they might actually try to explain the reasoning without trying to change the answer.

webghost0101@sopuli.xyz · 9 months ago

I have tried to be more blunt with an underwhelming succes.

It has highlighted some of my everyday struggles i have with neurotypicals being neurodivergent. There are lots of cases where people assume i am criticizing while i was just expressing curiosity.

finitebanjo@lemmy.world · 9 months ago

For me it is stupid to expect these machines to work any other way. They’re literally designed such that they’re just guessing words that make sense in a context, the whole statement then assembled from these valid tokens sometimes checked again by… another machine…

It’s always going to be and always has been a bullshit generator.

QuentinQuiver@slrpnk.net · 9 months ago

You can use the RAG tactic to make it more useful. That involves starting with reputable sources as input, which creates an AI character that’s essentially supposed to be an expert in a certain topic.

The normal AI system is a scammer who tries to convince others to act like them… just like me and other internet trolls or crazy people. It needs some snark to act like a real person does, but pure snark is quite useless.

Essentially: nonsense in, nonsense out Or science books and journals in, sci fi speculation out

zbyte64@awful.systems · 9 months ago

RAG is a search engine that sometimes summarizes incorrectly and uses 10x the energy. Such a dumb product.

finitebanjo@lemmy.world · 9 months ago

No, again, because each word is a token which together makes a phrase and each phrase is a token that makes a statement. Since these Tokens are generated individually, it will never have any real underlying logic. It’s just sentence probability. Even if your sample data is free of nonsense, the LLM will still generate nonsense.

ipkpjersi@lemmy.ml · edit-2 9 months ago

I could have said anything, and then it would have agreed with me

Nope, I’ve had it argue with me, and I kept arguing my point but it kept disagreeing, then I realized I was wrong. I felt stupid but I learned from it.

It doesn’t “know” anything but that doesn’t mean that it can’t be right.