Every attempt to formally define "general intelligence" for humans has been a shitshow. IQ tests were literally designed to justify excluding immigrants and sterilizing the "feeble-minded." Modern psychometrics can't agree on whether intelligence is one thing (g factor) or many things, whether it's measurable across cultures, or whether the tests measure aptitude or just familiarity with test-taking and middle-class cultural norms.
Now we're trying to define AGI - artificial general intelligence - when we can't even define the G, much less the I. Is it "general" because it works across domains? Okay, how many domains? Is it "general" because it can learn new tasks? How quickly? With how much training data?
The goalposts have already moved a dozen times. GPT-2 couldn't do X, so X was clearly a requirement for AGI. Now models can do X, so actually X was never that important, real AGI needs Y. It's a vibes-based marketing term - like "artificial intelligence" was (per John McCarthy himself) - not a coherent technical definition.
I think you are overthinking this. The ARC benchmark for fluid abstract reasoning was introduced in 2019 and it still hasn't been 'solved'. So the goalposts aren't moving as much as you think they are.
LLMs, and neural nets in general, have never been good at out-of-distribution tasks.
You've got to look at how it scales. LLMs have already stopped growing in parameter count because scaling them up no longer makes them better. New ideas are needed.
The Turing Test was a thought experiment, not a real benchmark for intelligence. If you read the paper the idea originated from, it is largely philosophical.
As for abstract reasoning, look at ARC-2: current models are barely capable on it, though at least some progress has been made on the ARC-1 benchmark.
I wasn't claiming the Turing Test was a benchmark for intelligence, but the ability to fool a human into thinking a machine is intelligent in conversation is still a significant milestone. I should have said "some abstract reasoning". ARC-2 looks promising.
>I wasn't claiming the Turing Test was a benchmark for intelligence, but the ability to fool a human into thinking a machine is intelligent in conversation is still a significant milestone.
The Turing Test is about whether a machine can fool a human into thinking they are talking to another human, not to an intelligent machine. And ironically, this is becoming less true over time as people get better at spotting the writing tendencies LLMs have, such as their frequent use of dashes or "it's not just X, it's Y" type statements.
This is not how LLMs work. You aren't 'unlocking' the "Truth", because the model doesn't know what the "Truth" is. It is just pattern matching to words that fit the style you are asking for. That may happen to be more accurate for you in some cases, but it is not a "Truth" instruction set, because no such thing exists.
Addendum: the ground truth for an LLM is its training dataset, whereas the ground truth for a human is their own experience/qualia of acting in the world. You may argue that only a few of us are willing to engage with the world, and that we take most things as told, just like the LLMs. Fair enough. But we still have the option to engage with the world, and LLMs don't.
The LLMs we get to use have been prompt-engineered and post-trained so much that I doubt the training data is their main influence anymore. If it were, you couldn't change their entire behaviour by adding a few sentences to the personalisation section.
I'm just an ignorant bystander, but is the training dataset the ground truth?
Kind of feels like calling the fruit you put into the blender the ground truth, but the meaning of the apple is kinda lost in the soup.
Now, I'm not a hater by any means. I am just not sure this is the correct way to describe the structured "meaning" (for lack of a better word) that we see come out of LLM complexity. Training is, I thought, a very lossy operation, so the structure of the inputs may or (more likely) may not yield a like-structured output.
I see. You are correct. Being wrong is merely feedback about something outside of us, not a value judgment of us as people. But education and other systems need us to believe the latter at some low level so they can retain authority.
As a web developer whose first paid web site was in 1998 when I was 10 years old, my favorite thing to do in my spare time is build web frameworks that I will never use.
- I've built CSS frameworks that replicate most of the Bootstrap features I actually use.
- I've made client-side reactive web components (kind of) that almost replaced the parts of React that I like (rough sketch after this list).
- I've built bespoke HTTP servers countless times since the VB6 days.
- And I've written my own MVC engines probably a half dozen times, just to learn a new language or library.
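To give a feel for the "reactive web components (kind of)" bit: something like the sketch below is all it really takes. This is purely illustrative, not the actual code from any of those projects; ReactiveElement and CounterEl are made-up names, and the only trick is a Proxy that re-runs render() whenever state is written.

```typescript
// Hypothetical minimal base class: state writes trigger a re-render.
class ReactiveElement extends HTMLElement {
  // Wrap plain state in a Proxy so any assignment calls render().
  protected state<T extends object>(initial: T): T {
    return new Proxy(initial, {
      set: (target, key, value) => {
        (target as any)[key] = value;
        this.render();
        return true;
      },
    });
  }

  connectedCallback(): void {
    this.render();
  }

  // Subclasses override this with their own template logic.
  protected render(): void {}
}

// Usage: a tiny counter component, no virtual DOM, no build step.
class CounterEl extends ReactiveElement {
  private data = this.state({ count: 0 });

  connectedCallback(): void {
    this.addEventListener("click", () => this.data.count++);
    super.connectedCallback();
  }

  protected render(): void {
    this.textContent = `Clicked ${this.data.count} times`;
  }
}

customElements.define("counter-el", CounterEl);
```

Obviously a real version needs templating, cleanup on disconnect, and batching, but it covers a surprising amount of what I reach for React for.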
All of that to say: it isn't web devs who are threatened; it is developers who don't want to learn the underlying technologies that power the libraries and frameworks they use.
I actually see no fault in being that way. I've known tons of decent-to-good developers who have no desire to understand HTTP or vanilla JavaScript, and they still do great work tying systems together. It's all about the kind of learner you are. Do you want depth, breadth, or a mixture of both (but always lacking in both, aka me)?
Increasingly bloated and complicated frameworks with intangible benefits, used for webpages that are now just training data for LLMs, are much more important.
Very fair! I only came back to edit it because right after leaving that comment I went to see if Best Buy had something I needed locally, clicked into search, typed, hit enter, and it fucking broke. Seemingly entirely: even the search button didn't work, so cmd+A, cmd+C, cmd+R, click in again, paste, enter, and that worked.
I just fucking loathe how common this experience is now. Amazon seems to be the only one that doesn't do it, but I've experienced this exact issue on Best Buy, Target, Etsy, Mercari, and eBay, and it just DRIVES ME UP THE WALL.