Hacker News | viccis's comments

It seems a lot like PR. Much like their posts about "AI welfare" experts who have been hired to make sure their models' welfare isn't harmed by abusive users. I think that, by doing this, they encourage people to anthropomorphize more than they already do and to view Anthropic as industry leaders in this general feel-good "responsibility" type of values.

Vector spaces and bag of words models are not specifically related to LLMs, so I think that's irrelevant to this topic. It's not about "knowledge", just the ability to represent words in such a way that similarities between them take on useful computational characteristics.

Well, pretty much all of the LLMs are based on the decoder-only version of the Transformer architecture (in fact it's the T in GPT).

And in the Transformer architecture you’re working with embeddings, which are exactly what this article is about, the vector representation of words.
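To make that concrete, here is a minimal sketch (mine, not from the article; shapes and values are made up) of the embedding lookup that forms the first step of a decoder-only model, turning token ids into the vectors the attention layers then work on:

    # Toy embedding lookup -- the first step of a decoder-only transformer.
    # The matrix here is random; a real model learns it during training.
    import numpy as np

    vocab = {"cash": 0, "is": 1, "king": 2}
    d_model = 4                                   # embedding dimension (tiny for the demo)
    rng = np.random.default_rng(0)
    embedding_matrix = rng.normal(size=(len(vocab), d_model))

    tokens = ["cash", "is", "king"]
    token_ids = [vocab[t] for t in tokens]
    x = embedding_matrix[token_ids]               # shape (3, 4): one vector per token
    print(x.shape)                                # these vectors feed the attention layers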


I really think you should actually read the article. None of what you are saying has to do with the content of it, and it will explain how you can do arithmetic with these words.

From the guidelines https://news.ycombinator.com/newsguidelines.html

> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that".

Besides which, this is totally a valid question based on the article. (The temptation to ask if you read it is almost overwhelming!) It talks about how to do arithmetic but not what the result of that will necessarily be, so I don't see that any part of it answers the question of "cash is king" + "female" - "male".
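As a purely illustrative sketch of what that arithmetic looks like mechanically (the vocabulary and vectors below are invented; a real model would learn them from data, and nothing here says what "cash is king" + "female" - "male" would actually return):

    # Toy word-vector arithmetic: king - man + woman ~= queen.
    # Vectors are hand-made for illustration, not learned embeddings.
    import numpy as np

    embeddings = {
        "king":   np.array([0.9, 0.8, 0.1]),
        "queen":  np.array([0.9, 0.1, 0.8]),
        "man":    np.array([0.1, 0.9, 0.0]),
        "woman":  np.array([0.1, 0.0, 0.9]),
        "prince": np.array([0.8, 0.75, 0.05]),
        "apple":  np.array([0.0, 0.2, 0.2]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    target = embeddings["king"] - embeddings["man"] + embeddings["woman"]
    best = max(
        (w for w in embeddings if w not in ("king", "man", "woman")),
        key=lambda w: cosine(embeddings[w], target),
    )
    print(best)  # with these toy vectors: queen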


>which is not a social network, but I’m tired of arguing with people online about it

I know this was a throwaway parenthetical, but I agree 100%. I don't know when the meaning of "social media" went from "internet based medium for socializing with people you know IRL" to a catchall for any online forum like reddit, but one result of this semantic shift is that it takes attention away from the fact that the former type is all but obliterated now.


> the former type is all but obliterated now.

Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.

While it stinks that it is controlled by one big company, it's quite nice that its communities are invite-only by default and largely moderated by actual flesh-and-blood users. There's no single public shared social space, which means there's no one shared social feed to get hooked on.

Pretty much all of my former IRC/Forum buddies have migrated to Discord, and when the site goes south (not if, it's going to go public eventually, we all know how this story plays out), we expect that we'll be using an alternative that is shaped very much like it, such as Matrix.


> Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.

The "former type" had to do with online socializing with people you know IRL.

I have never seen anything on Discord that matches this description.


I'm in multiple Discord servers with people I know IRL.

In fact, I'd say it's probably the easiest way to bootstrap a community around a friend-group.


Is this a generational thing? All my groups of this type are on WhatsApp (unfortunately).

Yes. Whatsapp requires a phone number and Discord does not. The tweens who do not have a phone yet can join Discord with their siblings / friends.

The other part of this is that Discord has official University hubs, so the college kids are all in there. You need an email address from that University to join: https://support.discord.com/hc/en-us/articles/4406046651927-...


Are you using the word tweens in some sense other than its usual definition of pre-teen? My understanding is that discord, like most online services, requires registered users to be 13 years old.

Nope, that's exactly what I meant. That requirement just means that they have to check a box which says that they're 13 or older. Surely no child would ever break the rules, right?

Having been out of university since before Discord was much of a thing, that's news to me. It also is eerily reminiscent of Facebook's beginning sign up requirements.

I guess that depends on the University and whether or not you get to keep your email address after you graduate. From what I understand from my college-aged kids, most people get kicked out of the hub after they graduate.

It's similar to Apple's strategy of trying to get Macintosh into the classrooms (in the 80s/90s), and student discounts on Adobe products.

I am not a huge fan of Discord, although I do use it. It's very good at what it does, and the communities it houses are well moderated, at least the ones that I have joined. I dislike that they've taken over communities and walled them off from the "searchable" internet.


That is actually something I quite like about Discord. Whatever I write and post, while not "private", is not indexed or searchable by anyone other than those people that have been vetted (invited) by the respective community. Note that I'm mostly on small friend-group Discords with 10 - 100 members.

Right, and they're blending two things together -- group chats and public forums. I'm sad about losing the public forums.

Discord's initial core demographic was online gaming. From there it has radiated outwards due to being the best group messaging (and voice chat) solution out there. The more overlap your friend group has with gaming and adjacent groups, the more likely they are to use Discord.

When Bloomberg’s podcasts have a Discord channel (eg: Odd Lots), you know it has broken free of its gaming origins.

Definitely. My friend group consists of gen-z and millennials and we met IRL but have a shared interest of gaming that gets us together on weekends.

What Discord does so well is that you start using it for gaming, but then it also becomes the space for all kinds of things. We discuss news there, music/pop culture, organize events, etc.

Whatsapp is more for the "formal" stuff and when it's time critical since not everybody has Discord notifications enabled.

I'd say Discord is definitely more popular among gen-z (or even younger) and gamers but it's kinda become reddit 2.0 where every niche has its discord.


I want my community to have:

- bots (like we had on IRC)
- first-class clients on all platforms (mobile, tablet, desktop, browser)
- voice chat
- video chat

Telegram and Discord are the only ones that satisfy all these.

And of these, Telegram is just one channel; on Discord we can separate subjects into channels in seconds. If I see a message on #general, I go check what it is. On #memes I know it's not urgent.

Matrix, if you want to play IT support in your free time.


Might be a regional thing instead; I don't know many Americans with WhatsApp -- all of my friends are on Discord.

Maybe, but at least in my circles it's a structure thing: until the group can actually be organised sanely in a single chat, something else will be used; but as soon as multiple chats are required, the thing is moved to Discord.

You're essentially saying you haven't seen anyone's private chats.

I'm in a friend Discord server. It's naturally invisible unless someone sends you an invite.


Yeah, same as sibling comments, I'm in multiple Discord servers for IRL friend groups. I personally run one with ~50 people that sees hundreds of messages a day. By far my most used form of social media. Also, as OP said, I'll be migrating to Matrix (probably) when they IPO; we've already started an archival project just in case.

And you won't. I will NOT invite anyone from "social media" to any of the 3 very-private, yet outrageously active, servers, and that's why they have less than 40 users collectively. They're basically for playing games and re-streaming movies among people on a first-name basis or close to it. And I know those 40 people have others of their own, and I know I'll never ever have access to them either. Because I don't know those other people in them.

And I know servers like these are in the top tier of engagement for Discord on the whole because they keep being picked for A/B testing new features. Like, we had activities some half a year early. We actually had the voice modifiers on two of them, and most people don't even know that was a thing.


The split where social networking is mostly for people you “know” and social media is… some other thing, mostly for scrolling videos, definitely is significant.

But, the “know IRL” split is a bit artificial I think. For example my discord is full of people I knew in college: I knew them IRL for four years, and then we all moved around and now we’ve known each other online for decades. Or childhood friends. By now, my childhood friend and college friend circles are partially merged on discord, and they absolutely know each other (unfortunately there’s no way for you to evaluate this but I know them all quite well and it would be absurd to me, to consider them anything other than friends).

The internet is part of the real world now. People socialize on it. I can definitely see a distinction between actually knowing somebody, and just being in a discord channel with them. But it is a fuzzy social thing I think, hard to nail down exactly where the transition is (also worth noting that we have acquaintances that we don’t really “know” offline, the cashier at our favorite shops for example).


While it's also used to socialize with people you don't know IRL, most of my experience with Discord (mostly in uni) was to aggregate people IRL together. We had discords for clubs, classes, groups of friends, etc. The only reason I use discord now is for the same reason. Private space for a group of people to interact asynchronously in a way that's more structured than a text group chat.

Idk most of the people I "met" on the internet happened originally on IRC. I didn't know them till a decade or more later.

I'm sorry but what?! 'Socializing with people you know IRL' is almost exclusively what I've seen Discord used for, and almost solely what I personally use it for. There are vastly more Discord servers set up among IRL friend groups (or among classmates, as another popular use case) than there are Discord servers for fandoms of people who have never met IRL.

I'd say WhatsApp is a better example

WhatsApp really feels to me more like group chat. Not really breaking barrier of social media. But then again I am not in any mass chats.

Discord is many things. Private chat groups, medium communities and then larger communities with tens of thousands of users.


> WhatsApp really feels to me more like group chat.

So what's wrong with that?


900 lb. Gorillas don't weigh 9000 lbs

> "internet based medium for socializing with people you know IRL"

"Social media" never meant that. We've forgotten already, but the original term was "social network" and the way sites worked back then is that everyone was contributing more or less original content. It would then be shared automatically to your network of friends. It was like texting but automatically broadcast to your contact list.

Then Facebook and others pivoted towards "resharing" content and it became less "what are my friends doing" and more "I want to watch random media" and your friends sharing it just became an input into the popularity algorithm. At that point, it became "social media".

HN is neither since there's no way to friend people or broadcast comments. It's just a forum where most threads are links, like Reddit.


I think most people only recall becoming aware of Facebook when it was already so widespread that people talked about it as "the site you go to to find out what extended family members and people you haven't spoken to in years are up to".

Let's remember that the original idea was to connect with people in your college/university. I faintly recall this time period because I tried to sign up for it only to find out that while there had been an announcement that it was opened up internationally, it still only let you sign up with a dot EDU email address, which none of the universities in my country had.

In the early years "social media" was a lot more about having a place to express yourself or share your ideas and opinions so other people you know could check up on them. Many remember the GIF anarchy and crimes against HTML of Geocities but that aesthetic also carried over to MySpace while sites like Live Journal or Tumblr more heavily emphasized prose. This was all also in the context of a more open "blogosphere" where (mostly) tech nerds would run their own blogs and connect intentionally much like "webrings" did in the earlier days for private homepages and such before search engine indexing mostly obliterated their main use.

Facebook pretty much created modern "social media" by creating the global "timeline", forcing users to compete with each other (and corporate brands) for each other's attention while also focusing the experience more on consumption and "reaction" than creation and self-expression. This in turn resulted in more "engagement" which eventually led to algorithmic timelines trying to optimize for engagement and ad placement / "suggested content".

HN actually follows the "link aggregator" or "news aggregator" lineage of sites like Reddit, Digg, Fark, etc. (there were also "bookmark aggregators" like StumbleUpon, but most of those died out too). In terms of social interactions it's more like e.g. the Slashdot comment section, even though the "feed" is somewhat "engagement driven" like on social media sites. But as you said, it lacks all the features that would normally be expected, like the ability to "curate" your timeline (or, in fact, having a personalized view of the timeline at all) or being able to "follow" specific people. You can't even block people.


It's even worse than that, TikTok & Instagram are labeled "social media" despite, I'd wager, most users never actually posting anything anymore. Nobody really socializes on short form video platforms any more than they do YouTube. It's just media. At least forums are social, sort of.

I'll come clean and say I've still never tried Discord and I feel like I must not be understanding the concept. It really looks like it's IRC but hosted by some commercial company and requiring their client to use and with extremely tenuous privacy guarantees. I figure I must be missing something because I can't understand why that's so popular when IRC is still there.

IRC has many many usability problems which I'm sure you're about to give a "quite trivial curlftpfs" explanation for why they're unimportant - missing messages if you're offline, inconsistent standards for user accounts/authentication, no consensus on how even basic rich text should work much less sending images, inconsistent standards for voice calls that tend to break in the presence of NAT, same thing for file transfers...

It is IRC, but with modern features and no channel splits. It also adds voice chat and video sharing. The trade-off is privacy and being on a commercial platform. On the other hand, it is very much simpler to use. IRC is really a mess of usability. Discord has a much better user experience for new users.

> Discord has much better user experience for new users.

Until you join a server that gives you a whole essay of what you can and cannot do, plus extra verification. This then requires you to post in some random channel and wait for a moderator to see your message.

You're then forced to assign roles to yourself to please a bot that will continue to spam you with notifications announcing to the community you've leveled up for every second sentence. Finally, everyone glaring at you in channel or leaving you on read because you're a newbie with a leaf above your username. Each to their own, I guess.

/server irc.someserver.net

/join #hello

/me says Hello

I think I'll stick with that.

At least Discord and IRC are interchangeable for the sake of idling.


This only happens if the server is BIG _or_ if the admin is a grifter who's 100% sure their server will hit it big and has 120 channels and 40 bots and 9 users total.

I was a heavy IRC user in 2015 before Discord and even though I personally prefer using IRC, it was obvious it would take over the communities I was in, for a few reasons:

1. People don't understand or want to set up a client that isn't just loading some page in their browser.
2. People want to post images and see the images they posted without clicking through a link; in some communities images might be shared more than text.
3. People want a persistent chat history they can easily access from multiple devices, with notifications etc.
4. Voice chat; many IRC communities would run a tandem Mumble server too.

All of these are solvable for a tech-savvy enough IRC user, but Discord gets you all of this out of the box with barely more than an email account.

There are probably more, but these are the biggest reasons why it felt like within a year I was idling in channels by myself. You might not want Discord, but the friction vs IRC was so low that the network effect pretty much killed most of IRC.


Discord has a better client and doesn't require an always-on connection to stay connected.

Yes IRCv3 exists and can do the backlog filling thing. Nobody uses that.

If "privacy" was an issue for you, you do know that IRC server admins could and still can see every single message on their servers unless everyone is using an encryption plugin on the channel?

The thing that Discord replicates with IRC is limited audience. Unless the server is public, I know I can be more open because I trust the people in there not to be assholes and share private stuff with others.


Because it's the equivalent to running a private irc server plus logging with forum features, voice comms, image hosting, authentication and bouncers for all your users. With a working client on multiple platforms (unlike IRC and jabber that never really took off on mobile).

It's very easy to make a friend server that has all you basically need: sending messages, images/files, and being able to talk in voice channels.

You can also invite a music bot, or host your own, that will join the voice channel with a song that you requested.


Right.... how is that different from IRC other than being controlled by a big company with no exit ability and (again) extremely tenuous privacy promises?

IRC doesn't offer voice/video, which is unimaginable for Discord alternative.

When we get to alternative proposals with functioning calls, I'd say having them as voice channels that just exist 24/7 is a big thing too. It's a tiny thing from a technical perspective, but it makes something like Teams an unsuitable alternative to Discord.

In Teams you start a call and everyone's phone rings, and you distract everyone from whatever they were doing -- you'd better have a good reason for doing so.

In Discord you just join an empty voice channel (on your private server with friends) w/o any particular reason and go on with your day. Maybe someone sees that you're there and joins, maybe not. No need to think about anyone's schedule, and you don't annoy people who don't have time right now.


For the text chat, it's different in the way that it lets one make their own 'servers' without having to run the actual hardware server 24/7, free of charge, no need to battle with NATs and weird nonstandard ways of sending images, etc.

The big thing is the voice/videoconferencing channels, which are actually optimized insanely well; Discord calls work fine even on crappy connections that Teams and Zoom struggle with.

Simply put it's Skype x MSN Messenger with a global user directory, but with gamers in mind.


> I don't know when the meaning of "social media" went from "internet based medium for socializing with people you know IRL" to a catchall

5 minutes after the first social network became famous. It never really was just about knowing people IRL; that was only in the beginning, until people started connecting with everyone and their mother.

Now it's about people connecting and socializing. If there are people involved, then it's social. HN has profiles where you can "follow" people, so it's social on a minimal level. Though we could dispute whether it's just media or a mature network, because there obviously are notable differences in social features between HN and Facebook.


You know Meta, the "social media company", came out and said their users spend less than 10% of their time interacting with people they actually know?

"Social Media" had become a euphemism for 'scrolling entertainment, ragebait and cats' and has nothing to do 'being social'. There is NO difference between modern reddit and facebook in that sense. (Less than 5% of users are on old.reddit, the majority is subject to the algorithm.)


It would be nice if the social media company actually showed their posts in the feed

I'm hopeful that one day engineers at Meta will crack the chronological sort code. It's a tough algorithm but I bet Llama can help 'em out.

Better back button handling and fixing the location bugs in event creation may well be entirely beyond Llama, sadly.


I don't go to Meta properties to interact with anyone anymore.

Our extended family has a WhatsApp group, but that's about it. I'm not convincing a bunch of 60+ year olds to switch to a new platform they're not used to.

Instagram has been read-only for me for half a decade, I still poke my head on FB now and then to spy on people I used to know.


The social networks have all added public media and algorithms. I read an explanation that friends don't produce enough content to keep people engaged, so they added public feeds. I'm disappointed that there isn't a private Bluesky/Mastodon. I also want an algorithm that shows the best of what the people I follow posted since I last checked, so I can keep up.

The first blog post or two is great, but it eventually goes down a really annoying grindset pop psychology route.

>It's insane to me that so many people need these to get off the processed foods killing them in the US.

If you understood how super stimulants work, then you wouldn't have found it "insane."


LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.

I am a deep LLM skeptic.

But I think there are also some questions about the role of language in human thought that leave the door just slightly ajar on the issue of whether or not manipulating the tokens of language might be more central to human cognition than we've tended to think.

If it turned out that this was true, then it is possible that "a model predicting tokens" has more power than that description would suggest.

I doubt it, and I doubt it quite a lot. But I don't think it is impossible that something at least a little bit along these lines turns out to be true.


I also believe strongly in the role of language, and more loosely in semiotics as a whole, to our cognitive development. To the extent that I think there are some meaningful ideas within the mountain of gibberish from Lacan, who was the first to really tie our conception of ourselves with our symbolic understanding of the world.

Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating that into language. It is imitating what it looks like to read people doing so and nothing more. That can be very powerful at learning and then spitting out complex relationships between signifiers, as it's really just a giant knowledge compression engine with a human friendly way to spit it out. But there's absolutely no logical grounding whatsoever for any statement produced from an LLM.

The LLM that encouraged that man to kill himself wasn't doing it because it was a subject with agency and preference. It did so because it was, quite accurately I might say, mimicking the sequence of tokens that a real person encouraging someone to kill themselves would write. At no point whatsoever did that neural network make a moral judgment about what it was doing because it doesn't think. It simply performed inference after inference in which it scanned through a lengthy discussion between a suicidal man and an assistant that had been encouraging him and then decided that after "Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s " the most accurate token would be "clar" and then "ity."
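To make "inference after inference" concrete, here is a minimal sketch of the greedy decoding loop being described; the scoring function is a hard-coded stand-in, not any real model:

    # Conceptual sketch of next-token prediction with greedy decoding.
    # score_next_token is a made-up stand-in; a real LLM computes these
    # logits from the whole context with a deep network.
    import numpy as np

    vocab = ["clar", "ity", "fear", "."]

    def score_next_token(context):
        if context[-1] == "clar":
            return np.array([0.0, 3.0, 0.2, 0.1])   # "ity" scores highest
        return np.array([2.0, 1.5, 0.3, 0.1])       # "clar" scores highest

    def greedy_decode(context, steps=2):
        out = list(context)
        for _ in range(steps):
            logits = score_next_token(out)
            probs = np.exp(logits) / np.exp(logits).sum()   # softmax over the vocabulary
            out.append(vocab[int(np.argmax(probs))])        # pick the most probable token
        return out

    print(greedy_decode(["That's"]))  # -> ["That's", "clar", "ity"]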


The problem with all this is that we don't actually know what human cognition is doing either.

We know what our experience is - thinking about concepts and then translating that into language - but we really don't know with much confidence what is actually going on.

I lean strongly toward the idea that humans are doing something quite different than LLMs, particularly when reasoning. But I want to leave the door open to the idea that we've not understood human cognition, mostly because our primary evidence there comes from our own subjective experience, which may (or may not) provide a reliable guide to what is actually happening.


>The problem with all this is that we don't actually know what human cognition is doing either.

We do know what it's not doing, and that is operating only through reproducing linguistic patterns. There's no more cause to think LLMs approximate our thought (thought being something they are incapable of) than that Naive-Bayes spam filter models approximate our thought.
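For comparison, this is roughly all a Naive Bayes spam filter does (the probabilities below are made up for illustration): pure word statistics, with nothing resembling thought anywhere in the loop.

    # Toy Naive Bayes spam scoring: add up log-likelihood ratios per word.
    # Probabilities are invented for illustration.
    import math

    p_word_given_spam = {"free": 0.30, "meeting": 0.02, "viagra": 0.20}
    p_word_given_ham  = {"free": 0.05, "meeting": 0.25, "viagra": 0.001}
    p_spam, p_ham = 0.4, 0.6

    def spam_log_odds(words):
        score = math.log(p_spam / p_ham)
        for w in words:
            if w in p_word_given_spam:
                score += math.log(p_word_given_spam[w] / p_word_given_ham[w])
        return score

    print(spam_log_odds(["free", "viagra"]) > 0)   # True  -> spam
    print(spam_log_odds(["meeting"]) > 0)          # False -> ham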


My point is that we know very little about the sort of "thought" that we are capable of either. I agree that LLMs cannot do what we typically refer to as "thought", but I think it is possible that we do a LOT less of that than we think when we are "thinking" (or more precisely, having the experience of thinking).

How does this worldview reconcile the fact that thought demonstrably exists independent of either language or vision/audio sense?

I don't see a need to reconcile them.

Which is why it's incoherent!

I'm not clear that it has to be coherent at this point in the history of our understanding of cognition. We barely know what we're even talking about most of the time ...

>Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating that into language. It is imitating what it looks like to read people doing so and nothing more.

'Language' is only the initial and final layers of a Large Language Model. Manipulating concepts is exactly what they do, and it's unfortunate the most obstinate seem to be the most ignorant.


They do not manipulate concepts. There is no representation of a concept for them to manipulate.

It may, however, turn out that in doing what they do, they are effectively manipulating concepts, and this is what I was alluding to: by building the model, even though your approach was through tokenization and whatever term you want to use for the network, you end up accidentally building something that implicitly manipulates concepts. Moreover, it might turn out that we ourselves do more of this than we perhaps like to think.

Nevertheless "manipulating concepts is exactly what they do" seems almost willfully ignorant of how these systems work, unless you believe that "find the next most probable sequence of tokens of some length" is all there is to "manipulating concepts".


>They do not manipulate concepts. There is no representation of a concept for them to manipulate.

Yes, they do. And of course there is. And there's plenty of research on the matter.

>It may, however, turn out that in doing what they do, they are effectively manipulating concepts

There is no effectively here. Text is what goes in and what comes out, but it's by no means what they manipulate internally.

>Nevertheless "manipulating concepts is exactly what they do" seems almost willfully ignorant of how these systems work, unless you believe that "find the next most probable sequence of tokens of some length" is all there is to "manipulating concepts".

"Find the next probable token" is the goal, not the process. It is what models are tasked to do yes, but it says nothing about what they do internally to achieve it.


Please pass on a link to a solid research paper that supports the idea that to "find the next probable token", LLMs manipulate concepts ... just one will do.

Revealing emergent human-like conceptual representations from language prediction - https://www.pnas.org/doi/10.1073/pnas.2512514122

Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task - https://openreview.net/forum?id=DeG07_TcZvT

On the Biology of a Large Language Model - https://transformer-circuits.pub/2025/attribution-graphs/bio...

Emergent Introspective Awareness in Large Language Models - https://transformer-circuits.pub/2025/introspection/index.ht...


Thanks for that. I've read the two Lindsey papers before. I think these are all interesting, but they are also what used to be called "just-so stories". That is, they describe a way of understanding what the LLM is doing, but do not actually describe what the LLM is doing.

And this is OK and still quite interesting - we do it to ourselves all the time. Often it's the only way we have of understanding the world (or ourselves).

However, in the case of LLMs, which are tools that we have created from scratch, I think we can require a higher standard.

I don't personally think that any of these papers suggest that LLMs manipulate concepts. They do suggest that the internal representation after training is highly complex (superposition, in particular), and that when inputs are presented, it isn't unreasonable to talk about the observable behavior as if it involved represented concepts. It is a useful stance to take, similar to Dennett's intentional stance.

However, while this may turn out to be how a lot of human cognition works, I don't think it is the significant part of what is happening when we actively reason. Nor do I think it corresponds to what most people mean by "manipulate concepts".

The LLM, despite the presence of "features" that may correspond to human concepts, is relentlessly forward-driving: given these inputs, what is my output? Look at the description in the 3rd paper of the arithmetic example. This is not "manipulating concepts" - it's a trick that often gets to the right answer (just like many human tricks used for arithmetic, only somewhat less reliable). It is extremely different, however, from "rigorous" arithmetic - the stuff you learned when you were somewhere between age 5 and 12 perhaps - that always gives the right answer and involves no pattern matching, no inference, no approximations. The same thing can be said, I think, about every other example in all 4 papers, to some degree or another.

What I do think is true (and very interesting) is that it seems somewhere between possible and likely that a lot more human cognition than we've previously suspected uses similar mechanisms as these papers are uncovering/describing.


>That is, they describe a way of understanding what the LLM is doing, but do not actually describe what the LLM is doing.

I’m not sure what distinction you’re drawing here. A lot of mechanistic interpretability work is explicitly trying to describe what the model is doing in the most literal sense we have access to: identifying internal features/circuits and showing that intervening on them predictably changes behavior. That’s not “as-if” gloss; it’s a causal claim about internals.

If your standard is higher than “we can locate internal variables that track X and show they causally affect outputs in X-consistent ways,” what would count as “actually describing what it’s doing”?

>However, in the case of LLMs, which are tools that we have created from scratch, I think we can require a higher standard.

This is backwards. We don’t “create them from scratch” in the sense relevant to interpretability. We specify an architecture template and a training objective, then we let gradient descent discover a huge, distributed program. The “program” is not something we wrote or understand. In that sense, we’re in a similar epistemic position as neuroscience: we can observe behavior, probe internals, and build causal/mechanistic models, without having full transparency.

So what does “higher standard” mean here, concretely? If you mean “we should be able to fully enumerate a clean symbolic algorithm,” that’s not a standard we can meet even for many human cognitive skills, and it’s not obvious why that should be the bar for “concept manipulation.”

>I don't personally think that any of these papers suggest that LLMs manipulate concepts. They do suggest that the internal representation after training is highly complex (superposition, in particular), and that when inputs are presented, it isn't unreasonable to talk about the observable behavior as if it involved represented concepts. It is useful stance to take, similar to Dennett's intentional stance.

You start with “there is no representation of a concept,” but then concede “features that may correspond to human concepts.” If those features are (a) reliably present across contexts, (b) abstract over surface tokens, and (c) participate causally in producing downstream behavior, then that is a representation in the sense most people mean in cognitive science. One of the most frustrating things about these sorts of discussions is the meaningless semantic games and goalpost shifting.

>The LLM, despite the prescence of "features" that may correspond to human concepts, is relentlessly forward-driving: given these inputs, what is my output?

Again, that’s a description of the objective, not the internal computation. The fact that the training loss is next-token prediction doesn’t imply the internal machinery is only “token-ish.” Models can and do learn latent structure that’s useful for prediction: compressed variables, abstractions, world regularities, etc. Saying “it’s just next-token prediction” is like saying “humans are just maximizing inclusive genetic fitness,” therefore no real concepts. Goal ≠ mechanism.

> Look at the description in the 3rd paper of the arithmetic example. This is not "manipulating concepts" - it's a trick that often gets to the right answer

Two issues:

1. “Heuristic / approximate” doesn’t mean “not conceptual.” Humans use heuristics constantly, including in arithmetic. Concept manipulation doesn’t require perfect guarantees; it requires that internal variables encode and transform abstractions in ways that generalize.

2. Even if a model is using a “trick,” it can still be doing so by operating over internal representations that correspond to quantities, relations, carry-like states, etc. “Not a clean grade-school algorithm” is not the same as “no concepts.”

>Rigorous arithmetic… always gives the right answer and involves no pattern matching, no inference…

“Rigorous arithmetic” is a great example of a reliable procedure, but reliability doesn’t define “concept manipulation.” It’s perfectly possible to manipulate concepts using approximate, distributed representations, and it’s also possible to follow a rigid procedure with near-zero understanding (e.g., executing steps mechanically without grasping place value).

So if the claim is “LLMs don’t manipulate concepts because they don’t implement the grade-school algorithm,” that’s just conflating one particular human-taught algorithm with the broader notion of representing and transforming abstractions.


> You start with “there is no representation of a concept,” but then concede “features that may correspond to human concepts.” If those features are (a) reliably present across contexts, (b) abstract over surface tokens, and (c) participate causally in producing downstream behavior, then that is a representation in the sense most people mean in cognitive science. One of the most frustrating things about these sorts of discussions is the meaningless semantic games and goalpost shifting.

I'll see if I can try to explain what I mean here, because I absolutely don't believe this is shifting the goal posts.

There are a couple of levels of human cognition that are particularly interesting in this context. One is the question of just how the brain does anything at all, whether that's homeostasis, neuromuscular control or speech generation. Another is how humans engage in conscious, reasoned thought that leads to (or appears to lead to) novel concepts. The first one is a huge area, better understood than the second though still characterized more by what we don't know than what we do. Nevertheless, it is there that the most obvious parallels with e.g. the Lindsey papers can be found. Neural networks, activation networks and waves, signalling etc. etc. The brain receives (lots of) inputs, generates responses including but not limited to speech generation. It seems entirely reasonable to suggest that maybe our brains, given a somewhat analogous architecture at some physical level to the one used for LLMs, might use similar mechanisms as the latter.

However, nobody would say that most of what the brain does involves manipulating concepts. When you run from danger, when you reach up to grab something from a shelf, when you do almost anything except actual conscious reasoning, most of the accounts of how that behavior arises from brain activity do not involve manipulating concepts. Instead, we have explanations more similar to those being offered for LLMs - linked patterns of activations across time and space.

Nobody serious is going to argue that conscious reasoning is not built on the same substrate as unconscious behavior, but I think that most people tend to feel that it doesn't make sense to try to shoehorn it into the same category. Just as it doesn't make much sense to talk about what a text editor is doing in terms of P and N semiconductor gates, or even just logic circuits, it doesn't make much sense to talk about conscious reasoning in terms of patterns of neuronal activation, despite the fact that in both cases, one set of behavior is absolutely predicated on the other.

My claim/belief is that there is nothing inside an LLM that corresponds even a tiny bit to what happens when you are asked "What is 297 x 1345?" or "will the moon be visible at 8pm tonight?" or "how does writer X tackle subject Y differently than writer Z?". They can produce answers, certainly. Sometimes the answers even make significant sense or better. But when they do, we have an understanding of how that is happening that does not require any sense of the LLM engaging in reasoning or manipulating concepts. And because of that, I consider attempts like Lindsey's to justify the idea that LLMs are manipulating concepts to be misplaced - the structures Lindsey et al. are describing are much more similar to the ones that let you navigate, move, touch, lift without much if any conscious thought. They are not, I believe, similar to what is going on in the brain when you are asked "do you think this poem would have been better if it was a haiku?" and whatever that thing is, that is what I mean by manipulating concepts.

> Saying “it’s just next-token prediction” is like saying “humans are just maximizing inclusive genetic fitness,” therefore no real concepts. Goal ≠ mechanism.

No. There's a huge difference between behavior and design. Humans are likely just maximizing genetic fitness (even though that's really a concept, but that detail is not worth arguing about here), but that describes, as you note, a goal not a mechanism. Along the way, they manifest huge numbers of sub-goal directed behaviors (or, one could argue quite convincingly, goal-agnostic behaviors) that are, broadly speaking, not governed by the top level goal. LLMs don't do this. If you want to posit that the inner mechanisms contain all sorts of "behavior" that isn't directly linked to the externally visible behavior, be my guest, but I just don't see this as equivalent. What humans visibly, mechanistically do covers a huge range of things; LLMs do token prediction.


>Nobody would say that most of what the brain does involves manipulating concepts. When you run from danger, when you reach up grab something from a shelf, when you do almost anything except actual conscious reasoning, most of the accounts of how that behavior arises from brain activity does not involve manipulating concepts.

This framing assumes "concept manipulation" requires conscious, deliberate reasoning. But that's not how cognitive science typically uses the term. When you reach for a shelf, your brain absolutely manipulates concepts - spatial relationships, object permanence, distance estimation, tool affordances. These are abstract representations that generalize across contexts. The fact that they're unconscious doesn't make them less conceptual

>My claim/belief is that there is nothing inside an LLM that corresponds even a tiny bit to what happens when you are asked "What is 297 x 1345?" or "will the moon be visible at 8pm tonight?"

This is precisely what the mechanistic interpretability work challenges. When you ask "will the moon be visible tonight," the model demonstrably activates internal features corresponding to: time, celestial mechanics, geographic location, lunar phases, etc. It combines these representations to generate an answer.

>But when they do, we have an understanding of how that is happening that does not require any sense of the LLM engaging in reasoning or manipulating concepts.

Do we? The whole point of the interpretability research is that we don't have a complete understanding. We're discovering that these models build rich internal world models, causal representations, and abstract features that weren't explicitly programmed. If your claim is "we can in principle reduce it to matrix multiplications," sure, but we can in principle reduce human cognition to neuronal firing patterns too.

>They are not, I believe, similar to what is going on in the brain when you are asked "do you think this poem would have been better if it was a haiku?" and whatever that thing is, that is what I mean by manipulating concepts.

Here's my core objection: you're defining "manipulating concepts" as "whatever special thing happens during conscious human reasoning that feels different from 'pattern matching.'" But this is circular and unfalsifiable. How would we ever know if an LLM (or another human, for that matter) is doing this "special thing"? You've defined it purely in terms of subjective experience rather than functional or mechanistic criteria.

>Humans are likely just maximizing genetic fitness... but that describes, as you note, a goal not a mechanism. Along the way, they manifest huge numbers of sub-goal directed behaviors... that are, broadly speaking, not governed by the top level goal. LLMs don't do this.

LLMs absolutely do this, it's exactly what the interpretability research reveals. LLMs trained on "token prediction" develop huge numbers of sub-goal directed internal behaviors (spatial reasoning, causal modeling, logical inference) that are instrumentally useful but not explicitly specified, precisely the phenomenon you claim only humans exhibit. And 'token prediction' is not about text. The most significant advances in robotics in decades are off the back of LLM transformers. 'Token prediction' is just the goal, and I'm tired of saying this for the thousandth time.

https://www.skild.ai/blogs/omni-bodied


HN comment threads are really not the right place for discussions like this.

> Here's my core objection: you're defining "manipulating concepts" as "whatever special thing happens during conscious human reasoning that feels different from 'pattern matching.'" But this is circular and unfalsifiable. How would we ever know if an LLM (or another human, for that matter) is doing this "special thing"? You've defined it purely in terms of subjective experience rather than functional or mechanistic criteria.

I think your core objection is well aligned to my own POV. I am not claiming that the subjective experience is the critical element here, but I am claiming that whatever is going on when we have the subjective experience of "reasoning" is likely to be different (or more specifically, more usefully described in different ways) than what is happening in LLMs and our minds when doing something else.

How would we ever know? Well the obvious answer is more research into what is happening in human brains when we reason and comparing that to brain behavior at other times.

I don't think it's likely to be productive to continue this exchange on HN, but if you would like to continue, my email address is in my profile.


@PaulDavisThe1st I'd love to hear your take on these papers.

Provided above.


The Universe (which others call the Golden Gate Bridge) is composed of an indefinite and perhaps infinite series of spans.

If anything, I feel that the current breed of multimodal LLMs demonstrates that language is not fundamental - tokens are, or rather their mutual association in high-dimensional latent space. Language as we recognize it, sequences of characters and words, is just a special case. Multimodal models manage to turn audio, video and text into tokens in the same space - they do not route through text when consuming or generating images.

> manipulating the tokens of language might be more central to human cognition than we've tended to think

I'm convinced of this. I think it's because we've always looked at the most advanced forms of human languaging (like philosophy) to understand ourselves. But human language must have evolved from forms of communication found in other species, especially highly intelligent ones. It's to be expected that its building blocks are based on things like imitation, playful variation, pattern-matching, harnessing capabilities brains have been developing long before language, only now in the emerging world of sounds, calls, vocalizations.

Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.


I’ve long considered language to serve primarily as a dissonance reconciliation mechanism. Our behavior is largely shaped by our circumstances and language serves to attribute logic to our behavior after the fact.

>Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.

Are you familiar with the major works in epistemology that were written, even before the 20th century, on this exact topic?


You realize parent said "This would be an interesting way to test proposition X" and you responded with "X is false because I say so", right?

Yes. That is correct. If I told you I planned on going outside this evening to test whether the sun sets in the east, the best response would be to let me know ahead of time that my hypothesis is wrong.

So, based on the source of "Trust me bro.", we'll decide this open question about new technology and the nature of cognition is solved. Seems unproductive.

In addition to what I have posted elsewhere in here, I would point to the fact that this is not indeed an "open question", as LLMs have not produced an entirely new and more advanced model of physics. So there is no reason to suppose they could have done so for QM.

What if making progress today is harder than it was then?

The problem is that it hasn't really made any significant new concepts in physics. I'm not even asking for quantum mechanics 2.0, I'm just asking for a novel concept that, much like QM and a lot of post-classical physics research, formulates a novel way of interpreting the structure of the universe.

"Proposition X" does not need testing. We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense which merely uses an anthromorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user). In the same way that we can reason about the correctness of an IsEven program without writing a unit test that inputs every possible int32 to "prove" it, we can reason about the fundamental principles of an LLM's programming without coming up with ridiculous tests. In fact the proposed test itself is less eminently verifiable than reasoning about correctness; it could be easily corrupted by, for instance, incorrectly labelled data in the training dataset, which could only be determined by meticulously reviewing the entirety of the dataset.

The only people who are serious about suggesting that LLMs could possibly 'think' are the people who are committing fraud on the scale of hundreds of billions of dollars (good for them on finding the all-time grift!) and people who don't understand how they're programmed, and thusly are the target of the grift. Granted, given that the vast majority of humanity are not programmers, and even fewer are programmers educated on the intricacies of ML, the grift target pool numbers in the billions.


> We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense which merely uses an anthropomorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user).

Could you elucidate me on the process of human thought, and point out the differences between that and a probabilistic prediction engine?

I see this argument all over the place, but "how do humans think" is never described. It is always left as a black box with something magical (presumably a soul or some other metaphysical substance) inside.


There is no need to involve souls or magic. I am not making the argument that it is impossible to create a machine that is capable of doing the same computations as the brain. The argument is that whether or not such a machine is possible, an LLM is not such a machine. If you'd like to think of our brains as squishy computers, then the principle is simple: we run code that is more complex than a token prediction engine. The fact that our code is more complex than a token prediction engine is easily verified by our capability to address problems that a token prediction engine cannot. This is because our brain-code is capable of reasoning from deterministic logical principles rather than only probabilities. We also likely have something akin to token prediction code, but that is not the only thing our brain is programmed to do, whereas it is the only thing LLMs are programmed to do.

Kant's model of epistemology, with humans schematizing conceptual understanding of objects through apperception of manifold impressions from our sensibility, and then reasoning about these objects using transcendental application of the categories, is a reasonable enough model of thought. It was (and is I think) a satisfactory answer for the question of how humans can produce synthetic a priori knowledge, something that LLMs are incapable of (don't take my word on that though, ChatGPT is more than happy to discuss [1])

1: https://chatgpt.com/share/6965653e-b514-8011-b233-79d8c25d33...


The US generally doesn't build big things like this any more.

Not true, we just only do it if it's part of a DoD budget.

big things seem to require a blank check.

Yep.

Totally disingenuous to say, "Hey! Look at the richest region in the US! They have electric trains. Why can't we?"

Well, make my state the fourth largest economy in the world and we'd have electric trains too.


Those rich regions are setting examples of building transit at way too high a cost for poorer regions to afford.

If we could use other parts of the world as an example, my region is richer than them and could afford it. But those places somehow are never used as the example.


I think this is much better as a relative pitch training tool for people with a very basic background in piano and music in general. I would have loved something like this back in high school to use for practicing over and over.

I think "teach" is a high bar, but I do think it's a good practice tool.

My one and only complaint is that sometimes the melodies it generates are tough to play back because they don't really sound like a real melody and I have to fight my brain telling me to play back the one that would actually sound good. Sort of like having to memorize a random string of words vs memorizing a normal sentence.


Ah I remember playing this one. My first MUD/MUSH was Elendor. Looks like it went down in the last year or so. RIP

Aardwolf was fun, though I remember my big gripe was that it had a bunch of weird little themed zones like the Star Trek and Wizard of Oz ones that felt a little hokey to have in there. These days it wins solely by virtue of being one of the few that's still populated and free.

When I tried to go back to the Simutronics ones (GemStone IV and DragonRealms), not only was it a ghost town, which made random interactions almost stressful because you feel like two people walking past each other in a ghost town, but they have doubled, tripled, and quadrupled down on squeezing their whales for every cent. A lot of people playing those games are paying $50-100 a month or more, and even normal players have to cough up more than the base $15 a month subscription if they want more than 1 character (!!!). Looks like their website has been stripped of all its cool character too. Shame.

These games in their heyday were truly a one of a kind experience. All of the weird online socializing you see people getting on platforms like Discord, but all wrapped up around a fun RPG game that felt so much more flexible and imaginative than other online games at the time.


Elendor is up? telnet mush.elendor.net 1893

https://discord.gg/H8Xr3UF - is a discord server people migrated to.


Thanks for pointing that out. I saw reports that the telnet endpoint was down and didn't think to check.

I spent far far too much time on Elendor in the 90's and early 00's.

Wow, I'd visit play.net from time to time to get that nostalgic dopamine hit. Sad to see that's gone.

LLMs could really make this genre incredible. Too bad they probably don't have the funding to do something with it.


No, they could not actually.

The important thing is to relate to other humans and to be sure of what kind of human you're interacting with: the creators of the game or other players.

The staleness that actually "shipping once"[0] gives is precisely the space in which human player creativity grows and thrives.

---- [0] I understand you can get similar results and better base games if you patch things occasionally, but constant patches[1] hide the jank and repetitiveness with novelty.

[1] And dynamically creating "content" with LLMs is like a constant stream of patches.


How would they? LLM can generate content, sure, but do MUDs need more content?

Yes.

A new MUD needs a way to build several thousand rooms, mobs, items, etc. LLMs can help with that process, though I wouldn’t trust them alone with things like balance.

Similarly, existing MUDs adding new areas need hundreds of rooms, mobs, items, etc. In my experience MUDs tend to stagnate when there’s no new content for long time players.


Some of the coolest MUDs I played in had effectively only two useful rooms, and no mobs or items to really speak of. They were barely more than a couple of IRC chat rooms, but with the ANSI colors support and complex script languages a MUD Engine directly over telnet could provide to a good MUD client.

There were far more genres of MUDs than just the Diku-style ("EverQuest-like", to use as analogy the graphic MMO that took a lot from the Diku-style of MUD) that needed to be "endless" content farms of mobs and items and new areas full of more mobs and items.

But also many of the fan favorite Diku-style MUDs were procedurally generated, and no one was actually building all those thousands of rooms/mobs/items by hand even then. In theory you could use an LLM as part of a procedural generation process, but that's not the kind of content I would have wanted from a good MUD at the time I was heaviest playing MUDs. (But then I also didn't play many Diku-style/Diku-inspired MUDs, either. I was more on the Socializer side of things at the time.)
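As a toy sketch of that kind of procedural generation (templates and word lists invented for the example; real area builders were of course richer, and no LLM is involved):

    # Toy procedural room generator in the spirit of old Diku-style area builders.
    import random

    ADJECTIVES = ["dusty", "moss-covered", "torchlit", "collapsed"]
    PLACES     = ["corridor", "antechamber", "crypt", "storeroom"]
    DETAILS    = ["Water drips from the ceiling.", "Old bones litter the floor.",
                  "A faint draft stirs the dust.", "Scratch marks cover one wall."]

    def generate_room(rng):
        name = f"{rng.choice(ADJECTIVES).title()} {rng.choice(PLACES)}"
        desc = f"You are in a {rng.choice(ADJECTIVES)} {rng.choice(PLACES)}. {rng.choice(DETAILS)}"
        exits = rng.sample(["north", "south", "east", "west"], k=rng.randint(1, 3))
        return {"name": name, "description": desc, "exits": exits}

    rng = random.Random(42)
    for _ in range(3):
        room = generate_room(rng)
        print(room["name"], "->", room["exits"])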


I mostly played “Everquest-like” ones.

I’ll admit YMMV and my comment should’ve been better scoped — but it sounds like you’re not disagreeing that for those, LLMs are useful in the way I suggested.


Keep your AI slop out of my human-crafted adventures, please.

If you want an LLM-created text adventure, by all means, go and enjoy that yourself. I want no part of it.


