
Yes, the AI effect is real. As soon as computers can do a thing it’s no longer “AI”.

But I don’t think this is a nitpick at all. GPT models hallucinate information. They are right surprisingly often, but they’re also wrong quite often. And the problem is that they are just as confident in either case.

This is a fundamental, irreconcilable issue with statistical language models. They have no grounding in auditable facts. They can memorize and generate in very plausible ways but they don’t seem to have a concrete model of the world.

Ask ChatGPT to play chess. It can generate a text based board and prompt you for moves, but it can’t reliably update its board correctly or even find legal moves. Note that I don’t expect it to play good moves, but the fact that it can’t even play legal moves should tell us something about its internal state.
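For anyone who wants to reproduce this, here's a minimal sketch of the test, using the python-chess library to do the legality bookkeeping. `ask_model` is a hypothetical stand-in for however you prompt ChatGPT, not a real API:

    # Track the real board state ourselves and check whether the
    # model's suggested move (in standard algebraic notation) is legal.
    import chess

    def ask_model(fen: str) -> str:
        """Hypothetical stand-in for prompting ChatGPT with the position."""
        return "Nf7"  # whatever SAN move the model suggested

    board = chess.Board()
    board.push_san("e4")  # play out the moves so far
    board.push_san("e5")

    move = ask_model(board.fen())
    try:
        board.push_san(move)  # raises ValueError if the move is illegal
        print(f"{move} is legal here")
    except ValueError:
        print(f"{move} is not a legal move in this position")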

Now that GPT-3 has trained on the whole internet, we may have reached a practical limit on how far you can get simply by training on more data with one or two orders of magnitude more parameters. There’s only so far you can get by memorizing the textbook.

At a more practical level, for most professions “pretty good” isn’t good enough. It’s not good enough to have code that’s right 90% of the time but broken (or worse, has subtle bugs) the rest of the time.



Humans hallucinate information and often get things wrong in ways that have no grounding in auditable facts either.

The fact that a textbox can do so many diverse tasks _well_ should give everyone pause.

Here are a few things it was able to do when I tested it:

- generate working code in multiple programming languages (C++, Rust, TypeScript, Python)

- rewrite Terraform config (.tf) into equivalent Kubernetes YAML

- accurately describe esoteric knowledge related to medical imaging

- find and suggest improvements in code written by senior programmers

- rewrite and improve the copy of a website

- create a decent presentation outline for a VC investor pitch

- suggest valid improvements to sample startup mission and vision statements

- expand bullet points into a proper email that I could send out to third parties without any questions raised

How many people are there in the world that can do all or even some of the above at a decent level of expertise?


> How many people are there in the world that can do all or even some of the above at a decent level of expertise?

If you tell them what to do and then correct all the things they get wrong, a lot of people can do all of those, as long as they have access to Google.

And then once those people have done that for a while, they will be able to continue doing those things without your feedback. But ChatGPT can't. This makes it fundamentally different from any human.


If I am understanding correctly, your main point of differentiation is that the language model doesn't learn from its conversations.

Compared to the initial training of the model, this is a trivial amount of engineering effort and is likely something we will see within a year or less.


> this is a trivial amount of engineering effort

I disagree, and I have worked on Google search ranking: making models that learn is ridiculously hard. This model is impressive, but it still hasn't solved this part, and until it is solved the blocker isn't engineering effort but research effort, with unknown timeframes.

When researchers say a model "learns", all they mean is that they put the new data into the model; the model is still as stupid as before. That doesn't solve the real kind of learning humans do, which is what the model would need in order to be useful here.
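To be concrete: "putting new data into the model" is mechanically just another gradient step. A toy sketch of that, with a stand-in linear model rather than anything GPT-sized:

    # What "learning" usually means for a model: one gradient step on
    # new data. The tiny linear model here is a toy stand-in.
    import torch

    model = torch.nn.Linear(4, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    x, y = torch.randn(8, 4), torch.randn(8, 1)  # the "new data"
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()  # weights nudged toward the new data; nothing like human insight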


After a few days playing with this and using it for real work in some cases (having it bang out some PowerShell based on a description and follow-up modifications), I'm not sure that "the real kind of learning humans do" is even a necessary goal anymore.

Here is a language model that doesn't "know" anything, it doesn't "understand" anything, it has no idea what an AST is or what the code it is producing does… But does it really matter? If that prompt "generate a PowerShell script that does X Y and Z" results in accurate code that meets the stated requirement, how it got there is an implementation detail.
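Taking that framing seriously, the generated script is a black box and the acceptance test is the spec. A rough sketch of the idea; the script name and expected output below are made up for illustration:

    # Validate generated code against the stated requirement only,
    # ignoring how the model arrived at it. "generated.ps1" and the
    # expected output are hypothetical.
    import subprocess

    result = subprocess.run(
        ["pwsh", "-File", "generated.ps1"],  # PowerShell Core CLI
        capture_output=True, text=True, timeout=60,
    )
    assert result.returncode == 0, result.stderr
    assert "expected output" in result.stdout  # the requirement, as a check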

Give me what exists today, give it an ongoing knowledge of the things I am conversing with it on, take off the stupid guardrails and this is something I would gladly pay a significant amount of money every month for.


From my rather limited understanding, "learning from the conversation" is already an existing feature that is simply limited to a "thread" session for users of the current interface. I guess feeding those threads back into the model is ultimately the goal of the current beta test, though; the marketing material hints at it, at least.
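If that's right, the thread-scoped "memory" may be nothing deeper than replaying the transcript into the prompt on every turn. A sketch of that assumption, where `complete` is a hypothetical stand-in for a text-completion call, not a real API:

    # Conversation "memory" as prompt replay: the model never changes,
    # only the prompt it sees grows each turn.
    history: list[str] = []

    def complete(prompt: str) -> str:
        """Hypothetical stand-in for a text-completion endpoint."""
        return "(model reply)"

    def chat(user_msg: str) -> str:
        history.append(f"User: {user_msg}")
        reply = complete("\n".join(history) + "\nAssistant:")
        history.append(f"Assistant: {reply}")
        return reply  # the weights never update; only the replayed prompt grows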


That’s the rub, though. The bar for most tasks isn’t a “decent” level of expertise. We want genuine expertise. It doesn’t matter if your Rust developer Jerry also knows how to write Italian operas about SpongeBob. He needs to write code that is bug-free, or be able to address bugs as they come up. As long as SOTA models are only “decent”, Jerry keeps his job.

If it sounds like I’m moving the goalposts, I’m not. I acknowledge that this is impressive in the abstract. It’s fun to play around with. But I’m also predicting that we’re at a local maximum: there are diminishing returns to the architectures we’ve developed so far. Throwing more data and compute at them won’t solve the problems we have.


> Ask ChatGPT to play chess. It can generate a text based board and prompt you for moves, but it can’t reliably update its board correctly or even find legal moves. Note that I don’t expect it to play good moves, but the fact that it can’t even play legal moves should tell us something about its internal state.

Incidentally, I tried handing it a few partial games in algebraic notation and asking it to suggest the next move, and it generally suggested legal moves, though with tactical explanations that ranged from plausible to nonsensical. It refused to actually play chess with me though and I guess I just didn't have the right prompt.


> There’s only so far you can get by memorizing the textbook.

If a person does that, they know they're memorising a textbook; they give it different weight to a pyramid marketing scheme's (no less sincere, in some cases) monologue about how a crystal can cure all your ailments.

Does ChatGPT know to weigh the [fallacious!] authority of sources? chess.com is a better source than r/anarchychess, but even so a game between two novices on chess.com wouldn't be a good training guide, et cetera.

A lot of web content is subtly wrong; that's always the challenge when searching ...

Now, 90% sounds pretty good compared to humans ... ?! (Not sure if I'm being sarcastic there or not!)



