
Anybody who has ever done arithmetic knows that human logic is more than just statistical pattern prediction. Humans can learn and execute algorithms in precise, deterministic ways, which current LLMs fundamentally cannot do.


That's funny. So being intelligent is executing algorithms deterministically, and being a machine is making mistakes and being non-deterministic?

How the tables have turned


I’m not sure that’s true. Humans make mistakes performing math all the time; our ability to accurately perform multiplication is very much a statistical phenomenon. You could dismiss those mistakes as different from the sort an LLM makes when following a rigid set of instructions, but to me they seem different in degree, not in kind.


Why can't LLMs fundamentally execute in a deterministic way? It's a computer running a computation on some fixed data. Without randomization parameters like temperature, wouldn't it be pretty deterministic?

My understanding is "tech enthusiast" level, so happy to learn.
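The determinism point can be sketched in a few lines. This is a toy illustration, not a real model: temperature 0 (greedy decoding) always picks the highest-scoring token, while any positive temperature samples from a softmax distribution, which is where the apparent randomness comes from.

```python
import math
import random

def softmax(logits, temperature):
    # Scale logits by temperature before normalizing; lower
    # temperature sharpens the distribution toward the argmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(logits, temperature):
    # Temperature 0 means greedy decoding: always take the highest
    # logit, so the same input reproduces the same output token.
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Any temperature > 0 samples from the distribution instead.
    probs = softmax(logits, temperature)
    return random.choices(range(len(logits)), weights=probs)[0]

logits = [1.0, 3.5, 2.0]
assert all(pick_token(logits, 0) == 1 for _ in range(100))  # fully reproducible
```

So yes, with temperature 0 the computation is deterministic in principle. In practice, serving-side details (GPU parallelism, batching, floating-point ordering) can still cause slight output variation even at temperature 0.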


So for LLMs like ChatGPT, one issue with doing arithmetic is that the input is tokenised, so it doesn't "see" the individual digits in numbers. That will make it harder for it to learn addition, multiplication etc. You can see what the inputs to the model might look like here: https://platform.openai.com/tokenizer

So for example, the text "123456789" is tokenised as "123", "45", "67", "89", and the actual input to the model would be the token IDs: [10163, 2231, 3134, 4531]. Whereas the text "1234" is tokenised as "12", "34" with IDs [1065, 2682]. So learning how these relate in terms of individual digits is pretty hard, as it never gets to see the individual digits.
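The digit-chunking effect can be mimicked with a toy greedy longest-match tokenizer. This is only an illustration: real BPE tokenizers apply learned merge rules rather than longest-match, and the vocabulary below is just the handful of chunks and IDs quoted from the tokenizer page above.

```python
def tokenize(text, vocab):
    # Greedy longest-match segmentation: at each position, take the
    # longest vocabulary entry that matches. Real BPE differs in
    # detail, but produces the same kind of multi-digit chunks.
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

# Illustrative vocabulary using the chunks/IDs from the comment above.
VOCAB = {"123": 10163, "45": 2231, "67": 3134, "89": 4531}

chunks = tokenize("123456789", VOCAB)
ids = [VOCAB[t] for t in chunks]
print(chunks, ids)  # ['123', '45', '67', '89'] [10163, 2231, 3134, 4531]
```

The model only ever sees the ID sequence on the last line, so the fact that "123" and "45" share digit-level structure is invisible to it.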


I think, to extend the question, the fundamental answer is "there is nothing stopping the LLM from containing an embedding of all basic math", with the proviso that tokenization makes it vanishingly unlikely to be learned (at least in the current generation, or within reasonable resource limits).

I see it as analogous to asking a human why they don't just "learn all the answers to simple arithmetic involving integers below 10,000" - you possibly could, it would just be a huge waste of time when you can instead learn the algorithm directly. Of course, LLMs are inherently a layer on top of an existing system that already solves those problems quite well, so it would be somewhat silly there too.
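The memorize-vs-algorithm tradeoff is easy to make concrete. A sketch under obvious assumptions: a lookup table of all sums grows quadratically with the range of numbers covered, while the algorithm (`a + b`) is constant-size and covers every case.

```python
def build_sum_table(n):
    # "Memorize all the answers": one entry per ordered pair (a, b),
    # so the table grows as n**2 with the range of numbers.
    return {(a, b): a + b for a in range(n) for b in range(n)}

table = build_sum_table(100)
assert len(table) == 100 * 100          # 10,000 entries just for 0..99
assert table[(37, 58)] == 37 + 58       # same answer the algorithm gives directly
```

Covering all integers below 10,000 would take 10^8 entries for addition alone, which is why learning the algorithm beats memorizing the table.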




