robrenaud's comments

They use a lightweight adapter to silently degrade performance. Usually these adapters are made to improve performance on a given domain or task.

3Blue1Brown has a great visual introduction to transformers, the heart of LLMs.

It's chapter 5. Start at chapter 1 if you want more background on neural nets and backprop.

https://youtu.be/wjZofJX0v4M?si=HFXbrB-5cArprGaU


"The reasoning is the weights."

The reasoning is in a process that uses the weights.

Sorting algorithms are just bytes. Those bytes don't sort by themselves. They do instruct a computer on how to sort though.
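A minimal Python sketch of that distinction, assuming CPython (where a function's compiled body is exposed as `__code__.co_code`). The name `insertion_sort` is only illustrative. The bytes sit there inertly until the interpreter runs them on an input:

```python
def insertion_sort(items):
    """Return a sorted copy of items using insertion sort."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot right to make room for current.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


# The compiled function body is an inert bytes object (CPython detail).
code_bytes = insertion_sort.__code__.co_code

# Nothing gets sorted until the interpreter executes those bytes on an input.
sorted_items = insertion_sort([3, 1, 2])
```

In the analogy, `code_bytes` plays the role of the weights, and the call on the last line is the process that uses them.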


There is some recent work on modularizing knowledge in LLMs.

https://arxiv.org/html/2605.06663v1

It might be possible to train a big generalist that is a composition of modules, some of which can be dropped dynamically at inference time, depending on the prompt.


Cool. Thanks for sharing. I am thinking about creating a series of smaller models for specific purposes and then orchestrating them so that they mirror the human brain, which is a bunch of subsystems that give multiple opinions about the same stimulus.


Interesting direction. I’ve also been thinking about modular / subsystem-based approaches for specialized tasks in small AI systems.


Is every American taxpayer morally compromised?


Yes ;)

I agree with the intent of your rhetorical question, so I'm jesting with you. I'm justifying my "yes" with the hopefully humorous distraction that every person, including American taxpayers, has at some point made a nonsustainable/selfish (my definition of immoral) decision.


My big gripe with unions is the unwavering protection of their worst performing members.

E.g., they necessitated so-called "rubber rooms" like these in the NYC public schools, where teachers got paid to do nothing while waiting on arbitration.

https://en.wikipedia.org/wiki/Reassignment_center


I doubt you'll find many people in favor of how bad cops get protected by police unions either. At least in the US, I'd much rather have a broad social safety net, so that my health care and retirement weren't so directly tied to my job, than a union specific to my trade.


The flat earthers are why I hate astronomy.

Afaict, the grandparent poster is just very wrong. You do want to cause acute stresses to your heart (cardiovascular exercise) to get it to work better.


It’s not really about this particular claim. It’s that I can read a comment that has a reasonable chain of logic and I don’t know if it’s true. This topic is just not easily studied and theories are hard to falsify.


Claims about flat earth are falsifiable with an at-home experiment.


Yeah, it's different. Anthropic profits when it delivers tokens. Hosting providers pay when Anthropic scrapes them.


Yeah, my big problem with the paper is that it might just be an artifact of Qwen's training process.


In all fairness, most of the unique stuff I can do is probably an artifact of my training process, so it seems unfair to deny an LLM the same accommodation.


How much did your training cost society?


This got me thinking, and it might actually be a comparable amount. Let's estimate that 12 years of schooling run at a minimum of $100,000 per student, at least in the US [1]. Add to that whatever else you do afterwards, i.e. a bunch more money for paid (college) or "unpaid" (self-taught skills and improvements) education. Then add what is likely the biggest portion for white-collar workers, though it is hard to quantify: the experience and "value" that professional work equips one with.

Now divide the average SOTA LLM's training cost (or a guess, since these numbers aren't always published as far as I'm aware) by the number of users, or if you wanted to be more strict, the number of people it's proven to be useful for (what else would training be for), and it might not be so far off anymore?

Of course, whether it makes sense to divide and spread out the LLMs' costs across users in order to calculate an "average utility" is debatable.

[1] https://www.publicschoolreview.com/average-spending-student-...
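The division above can be sketched in a few lines of Python. Only the $100,000 schooling figure comes from the comment. The training cost and user count below are hypothetical placeholders, not published numbers, so treat the outputs as an illustration of the arithmetic rather than an estimate:

```python
# USD; the comment's minimum estimate for 12 years of US schooling.
SCHOOLING_COST_PER_STUDENT = 100_000


def cost_per_user(training_cost, users):
    """Spread a one-off training cost evenly across users."""
    if users <= 0:
        raise ValueError("users must be positive")
    return training_cost / users


def break_even_users(training_cost, per_person_cost=SCHOOLING_COST_PER_STUDENT):
    """Number of users at which per-user cost equals the per-person cost."""
    return training_cost / per_person_cost


# HYPOTHETICAL placeholders, chosen only to make the arithmetic concrete.
HYPOTHETICAL_TRAINING_COST = 100_000_000  # USD
HYPOTHETICAL_USERS = 1_000_000

per_user = cost_per_user(HYPOTHETICAL_TRAINING_COST, HYPOTHETICAL_USERS)
users_needed = break_even_users(HYPOTHETICAL_TRAINING_COST)
```

With these placeholder inputs, `per_user` is 100.0 and `users_needed` is 1000.0. Whether an even split is the right way to attribute the cost is the debatable part noted above.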


Was AlphaGo's move 37 original?

In the last step of training LLMs, reinforcement learning with verifiable rewards (RLVR), LLMs are trained to maximize the probability of solving problems using their own output, guided by a reward signal akin to winning in Go. It's not just imitating human-written text.
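A toy sketch of that idea in Python, under heavy assumptions: the "policy" is just a softmax over four candidate answers, the `verifier` is a hypothetical checker for one arithmetic problem, and the update is plain REINFORCE. Real pipelines operate on token sequences with more elaborate policy-gradient methods, so this only shows the shape of the loop (sample your own output, check it, reinforce what verified):

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def verifier(answer):
    """Hypothetical checkable reward: 1.0 if answer solves 6 * 7, else 0.0."""
    return 1.0 if answer == 42 else 0.0


def train(candidates, steps=500, lr=0.5, seed=0):
    """Run REINFORCE on a softmax policy over candidate answers."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample from the policy's own output distribution, not from human text.
        idx = rng.choices(range(len(candidates)), weights=probs)[0]
        reward = verifier(candidates[idx])
        # Gradient of log p(idx) w.r.t. logit k is (1 if k == idx else 0) - p_k.
        for k in range(len(logits)):
            grad = (1.0 if k == idx else 0.0) - probs[k]
            logits[k] += lr * reward * grad
    return softmax(logits)


candidates = [40, 41, 42, 43]
final_probs = train(candidates)
```

No example of a correct answer is ever shown to the policy. The probability mass moves toward 42 only because sampled outputs that pass the check get reinforced.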

Fwiw, I agree that world models and some kind of learning from interacting with physical reality, rather than from massive amounts of digitized gym environments, are likely necessary for a breakthrough toward AGI.

