
At this point the single biggest improvement that could be made to GPTs is making them able to say "I don't know" when they honestly don't.

Just today I was playing around with modding Cyberpunk 2077 and was looking for a way to programmatically spawn NPCs in redscript. It was hard to figure out, but I managed. ChatGPT 5 just hallucinated some APIs even after doing "research" and repeatedly being called out.

After 30 minutes of ChatGPT wasting my time I accepted that I'm on my own. It could've been 1 minute.



Don't make the mistake of thinking that "knowing" has anything to do with the output of ChatGPT. It gives you the statistically most likely output based on its training data. It's not checking some sort of internal knowledge system, it's literally just outputting statistical linguistic patterns. This technology can be trained to emphasize certain ideas (like propaganda) but it can not be used directly to access knowledge.


> It's not checking some sort of internal knowledge system

In my case it was consuming online sources, then repeating "information" not actually contained therein. This, at least, is absolutely preventable even without any metacognition to speak of.


Yep, that's a great point. They often feel like a co-worker who speaks with such complete authority on a subject that you don't even consider alternatives, until you realise they are lying. Extremely frustrating.


That is extremely hard. It requires the model to have "knowledge" so it can decide whether it knows the answer or not, which is not how current LLMs/GPTs work.


It doesn't "know" anything. Everything that comes out is a hallucination contingent on the prompt.


You could say the same about humans. Have you ever misremembered something that you thought you knew?

Sure, typically we don’t invent totally made up names, but we certainly do make mistakes. Our memory can be quite hazy and unreliable as well.


Humans have a direct connection to our world through sensation and valence: pleasure, pain, then fear, hope, desire, up to love. Our consciousness is animal, and at least as much pre-linguistic as linguistic. This grounds our symbolic language and is what attaches it to real life. We can feel instantly that we know or don't know. Yes, we make errors and hallucinate, but I'm not going to make up an API out of the blue; I'll know by feeling that what I'm doing is mistaken.


It's insane that this has to be explained to a fellow living person. There must be some mass psychosis going on if even seemingly coherent and rational people can make this mistake.


I mean, I've certainly made that mistake, comparing machines and people too closely, and then somehow had at least some of the errors pointed out.


We’re all prone to anthropomorphizing from time to time. It’s the mechanizing of humans that concerns me more than the humanizing of these tools, those aren’t equivalent.


Perception and understanding are different things. Just because you have wiring in your body to perceive certain vibrations in spacetime in certain ways, does not mean that you fully grasp reality - you have some data about reality, but that data comprises an incomplete, human-biased world model.


Yeah, we'll end up on a "yes and no" level of accord here. Yes, I agree that understanding and perception aren't always the same, or maybe I'd put it that understanding can go beyond perception, which I think is what you mean when you say "incomplete." But I'd say, "Sorry, but no, I respectfully disagree" on the rest: at least from my point of view, we can't equate human experience with "data." Doing so, or viewing people as machines, the cosmos as a machine, everything as merely material in a dead way out of which somehow springs this perhaps-illusion of "life" that turns out to be a machine after all, risks extremely deep and dangerous, eventually even perilous, error. If we debated this (assuming I'm not mischaracterizing your position, though it does seem to lead in that direction), I'd shore up my arguments with support from the phenomenologists, from recent physics of various flavors (though I'm very much out of my depth there, at least enough to puncture the scientific-materialism bias), from Wittgenstein, from the likes of McGilchrist and other neuro and psychological sources, even from Searle's "Seeing Things as They Are," which argues that perception is not made of data. I'd be arguing against someone like a Daniel Dennett (though I'm sure he was a swell fellow) or a Richard Dawkins. Would I prevail in the discussion? Of course I'm not sure, and I realize now that I might, in LLM style, sound like I know more than I actually do!


Humans do many things that are not remembering. Every time a high school geometry student comes up with a proof as a homework exercise, or every time a real mathematician comes up with a proof, that is not remembering; rather, it is thinking of something they never heard. (Well, except for Lobachevsky--at least according to Tom Lehrer.) The same when we make a plan for something we've never done before, whether it's a picnic at a new park or setting up the bedroom for a new baby. It's not remembering, even though it may involve remembering about places we've seen or picnics we've had before.


Do you genuinely believe that humans just hallucinate everything? When you or I say my favorite ice cream flavor is vanilla, is that just a hallucination? If ChatGPT were to say their favorite ice cream flavor is vanilla, are you taking it with equal weight? Come on.


I genuinely believe that human brains are made of neurons and that our memories arise from how those neurons connect. I believe this is fundamentally lossy and probabilistic.

Obviously human brains are still much more sophisticated than the artificial neural networks that we can make with current technology. But I believe there’s a lot more in common than some people would like to admit.


OK, that is memory. I am talking about hallucination vs. human (or even animal) intent in embodied, meaningful experience.


All you’re doing is calling the same thing hallucination when an LLM does it and memory when a human does it. You have provided no basis that the two are actually different.

Humans are better at noticing when their recollections are incorrect. But LLMs are quickly improving.


So when I tell you I like vanilla ice cream, I'm just hallucinating and calling it a memory? And when ChatGPT says it likes vanilla ice cream, it's doing the same thing as me? Do I need to prove to you that they are different? Is it really baseless of me to insist otherwise? I have a body, millions of different receptors, a mouth with taste buds; I have a consciousness, a mind, a brain that interacts with the world directly. And it's all just words on a screen to you, interchangeable with a word-pattern matcher?


I’m not calling what you’re doing a hallucination. I’m saying that what an LLM does is in fact memory.

But it’s a memory based on what it’s trained on. Of course it doesn’t have a favorite ice cream. It’s not trained to have one. But that doesn’t mean it has no memory.

My argument is that humans have fallible memories too. Sometimes you say something wrong or that you don’t really mean. Then you might or might not notice you made a mistake.

The part LLMs don’t do great at is noticing the mistake. They have no filter and say whatever they’re thinking. They don’t run through thoughts in their head first and see if they make any sense.

Of course, that’s part of what companies are trying to fix with reasoning models. To give them the ability to think before they speak.


Can you just train one to have a favorite ice cream? You think training on a bunch of words saying I like vanilla ice cream is somehow equivalent to remembering times you ate ice cream and saying my favorite is vanilla? Just because an LLM can do recall when prompted to based on training data doesn’t make it the same as human memory, in the same way a database isn’t memory the way humans do it.


Can we please stop with the “same for humans!”


The whole point of AI is to replicate human intelligence. What else should we be comparing it to if not humans?


(Unlike machines trying to replicate visual systems) LLMs don't hallucinate: they bullshit.


> At this point the single biggest improvement that could be made to GPTs is making them able to say "I don't know" when they honestly don't.

You're not alone in thinking this. And I'm sure this has been considered within the frontier AI labs and surely has been tried. The fact that it's so uncommon must mean something about what these models are capable of, right?


Yes, there are people working on this, but not as many as one would like. GPTs have uncertainty baked into them, but the problem is that it's for the next-token prediction task and not for the response as a whole.
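
To make that concrete, here's a toy illustration (pure Python, with made-up numbers rather than real model outputs) of why per-token probabilities don't translate into a confidence for the answer as a whole:

    import math

    # Hypothetical per-token probabilities for a fluent but factually wrong
    # answer (illustrative numbers, not real model output).
    token_probs = [0.92, 0.88, 0.95, 0.90, 0.85]

    # Joint probability of the whole response: product of per-token probabilities.
    joint_prob = math.prod(token_probs)

    # Length-normalized version (geometric mean), a common heuristic.
    per_token_avg = joint_prob ** (1 / len(token_probs))

    print(f"joint probability: {joint_prob:.3f}")     # shrinks as the answer gets longer
    print(f"per-token average: {per_token_avg:.3f}")  # stays high if each token is locally plausible

    # A hallucinated API name can score well on both: every token is a likely
    # continuation of the previous ones even though the claim as a whole is false.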


They do talk about working on this, and making improvements. From https://openai.com/index/introducing-gpt-5/

> More honest responses

> Alongside improved factuality, GPT‑5 (with thinking) more honestly communicates its actions and capabilities to the user—especially for tasks which are impossible, underspecified, or missing key tools. In order to achieve a high reward during training, reasoning models may learn to lie about successfully completing a task or be overly confident about an uncertain answer. For example, to test this, we removed all the images from the prompts of the multimodal benchmark CharXiv, and found that OpenAI o3 still gave confident answers about non-existent images 86.7% of the time, compared to just 9% for GPT‑5.

> When reasoning, GPT‑5 more accurately recognizes when tasks can’t be completed and communicates its limits clearly. We evaluated deception rates on settings involving impossible coding tasks and missing multimodal assets, and found that GPT‑5 (with thinking) is less deceptive than o3 across the board. On a large set of conversations representative of real production ChatGPT traffic, we’ve reduced rates of deception from 4.8% for o3 to 2.1% of GPT‑5 reasoning responses. While this represents a meaningful improvement for users, more work remains to be done, and we’re continuing research into improving the factuality and honesty of our models. Further details can be found in the system card.


I totally agree. That would be great. I think the problem with that is LLMs don’t know what they don’t know. It’s arguable they even “know” anything!


They're like some of the overconfident people I've worked with who are too insecure to say they don't know to our boss.


I just ran evaluations of gpt-5 for our RAG scenario and was pleasantly surprised at how often it admitted "I don't know" - more than any model I've eval'd before. Our prompt does tell it to say it doesn't know if context is missing, so that likely helped, but this is the first model to really adhere to that.
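
For anyone curious, the prompt pattern being described is roughly this. It's a generic sketch: the wording and the build_messages helper are my own, not the commenter's actual setup:

    # Generic sketch of a RAG prompt that permits "I don't know"; the wording
    # and the helper below are illustrative, not the actual production prompt.
    SYSTEM_PROMPT = """Answer the user's question using ONLY the context below.
    If the context does not contain the answer, reply exactly:
    "I don't know based on the provided context."
    Do not guess and do not use outside knowledge.

    Context:
    {context}"""

    def build_messages(question: str, retrieved_chunks: list[str]) -> list[dict]:
        """Assemble chat messages for any chat-completions style API."""
        context = "\n\n".join(retrieved_chunks)
        return [
            {"role": "system", "content": SYSTEM_PROMPT.format(context=context)},
            {"role": "user", "content": question},
        ]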


"I do not know" is rarely in the training data as a follow up to anything.


Yeah, I'm surprised that there's not at least some sort of conviction metric output alongside the LLM response.

I mean, it's all probability, right? There must be a way to give it some score.
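
Something like that does exist at the token level. A minimal sketch, assuming the OpenAI Python SDK's logprobs option; the model name and prompt are just examples:

    # Crude "conviction" score from token log-probabilities. Assumes the
    # OpenAI Python SDK's logprobs option; model name is only an example.
    import math
    from openai import OpenAI

    client = OpenAI()

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "How do I spawn an NPC in redscript?"}],
        logprobs=True,
    )

    logprobs = [t.logprob for t in resp.choices[0].logprobs.content]

    # Mean per-token log-probability, mapped back to a 0..1 scale.
    score = math.exp(sum(logprobs) / len(logprobs))
    print(f"mean token probability: {score:.2f}")

    # Caveat: this measures how fluent the text is token by token, not whether
    # the claims in it are true, which is exactly the problem noted upthread.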


Not sure. In RLHF you are adjusting the weights away from wrong answers in general. So this is being done.

I think the closest you can get without more research is another model checking the answer and looking for BS. This will cripple speed but if it can be more agentic and async it may not matter.

I think people need to choose between a chat interface and better answers.
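
The "second model looking for BS" idea can be sketched in a few lines. The prompts, helper names, and model name here are assumptions for illustration, not a production recipe:

    # Illustrative verifier pass: one call drafts an answer, a second reviews it.
    # Prompt wording, helper names, and the model name are assumptions.
    from openai import OpenAI

    client = OpenAI()

    def ask(messages, model="gpt-4o-mini") -> str:
        resp = client.chat.completions.create(model=model, messages=messages)
        return resp.choices[0].message.content

    def answer_with_review(question: str) -> str:
        draft = ask([{"role": "user", "content": question}])
        verdict = ask([
            {"role": "system", "content": (
                "You are a skeptical reviewer. List any APIs, names, or facts "
                "in the answer that you cannot verify. If everything checks "
                "out, reply with exactly: OK")},
            {"role": "user", "content": f"Question: {question}\n\nAnswer: {draft}"},
        ])
        if verdict.strip() == "OK":
            return draft
        return f"{draft}\n\n[Reviewer flagged]: {verdict}"

As the comment says, this roughly doubles latency, which is why it fits agentic or async workflows better than an interactive chat.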


Like the XKCD reference but bigger: Give me a 100bn research team and 25 years.



