More

goldemerald · 2025-11-19T20:41:52 1763584912

Algorithmically served short form videos is clearly the smoking of our time. I cannot stand the conservative view of "well we don't know the videos cause mental health decline, or if it's simply those with a genetic inclination who seek out short form content.", exactly mirroring the skeptics about smoking causing cancer. I'm hopeful that in 5-10 years (but more likely 20) people will view this AI served, maximally engaging, content in the same way we view smoking now: disgusting and horrible, but adults should be allowed to do what they want. I can easily imagine kids/teens sharing their illicit access to shorts much in the same way they share vapes/cigarettes, which would be a much more preferable situation than the unlimited use we see today.

ge96 · 2025-11-19T20:45:49 1763585149

Oh man, they take random people's clips, stitch them together with a voice over and include false information eg. an incorrect fact about an animal

nxor · 2025-11-19T21:29:21 1763587761

I hope you don't mean conservative in the usual political sense. I know more conservatives worried about this than not.

goldemerald · 2025-06-20T21:06:13 1750453573

No discussion with Schmidhuber is complete without the infamous debate at NIPS 2016 https://youtu.be/HGYYEUSm-0Q?t=3780 . One of my goals as a ML researcher is to publish something and have Schmidhuber claim he's already done it.

But more seriously, I'm not a fan of Schmidhuber because even if he truly did invent all this stuff early in the 90s, he's inability to see its application to modern compute held the field back by years. In principle, we could have had GANs and self-supervised models' years earlier if he had "revisited his early work". It's clear to me no one read his early paper's when developing GANs/self-supervision/transformers.

Vetch · 2025-06-21T00:18:28 1750465108

> he's inability to see its application to modern compute held the field back by years.

I find Schmidhuber's claim on GANs to be tenuous at best, but his claim to have anticipated modern LLMs is very strong, especially if we are going to be awarding nobel prizes for Boltzmann Machines. In https://people.idsia.ch/%7Ejuergen/FKI-147-91ocr.pdf, he really does concretely describe a model that unambiguously anticipated modern attention (technically, either an early form of hypernetworks or a more general form of linear attention, depending on which of its proposed update rules you use).

I also strongly disagree with the idea that his inability to practically apply his ideas held anything back. In the first place, it is uncommon for a discoverer or inventor to immediately grasp all the implications of and applications of their work. Secondly, the key limiter was parallel processing power; it's not a coincidence ANNs took off around the same time GPUs were transitioning away from fixed function pipelines (and Schmidhuber's lab were pioneers there too).

In the interim, when most derided Neural networks, his lab was one of the few that kept research on Neural networks and their application to sequence learning going. Without their contributions, I'm confident Transformers would have happened later.

> It's clear to me no one read his early paper's when developing GANs

This is likely true.

> self-supervision/transformers.

This is not true. Transformers came after lots of research on sequence learners, meta-learning, generalizing RNNs and adaptive alignment. For example, Alex Graves' work on sequence transduction with RNNs eventually led to the direct precursor of modern attention. Graves' work was itself influenced by work with and by Schmidhuber.

andy99 · 2025-06-20T21:29:05 1750454945

It's very common in science for people to have had results they didn't understand the significance of that later were popularized by someone else.

There is the whole thing with Damadian claiming to have invented MRI (he didn't) when the Nobel prize went to Mansfield and Lauterbur (see the Nobel prize part of the article). https://en.m.wikipedia.org/wiki/Paul_Lauterbur

And I've seen other less prominent examples.

It's a lot like the difference between ideas and execution and people claiming someone "stole" their idea because they made a successful business from it.

godelski · 2025-06-20T22:25:35 1750458335

  > if he had "revisited his early work".

Given that you're a researcher yourself I'm surprised by this comment. Have you not yourself experienced the harsh rejection of "not novel"? That sounds like a great way to get stuck in review hell. (I know I've experienced this even when doing novel things just by too closely relating it to other methodologies when explaining "oh, it's just ____").

The other part seems weird too. Who isn't upset when their work doesn't get recognized and someone else gets credit. Are we not all human?

nextos · 2025-06-20T22:08:08 1750457288

I think he did understand both the significance of his work and the importance of hardware. His group pioneered porting models to GPUs.

But personal circumstances matter a lot. He was stuck at IDSIA in Lugano, i.e. relatively small and not-so-well funded academia.

He could have done much better in industry, with access to lots of funding, a bigger headcount, and serious infrastructure.

Ultimately, models matter much less than infrastructure. Transformers are not that important, other architectures such as deep SSMs or xLSTM are able to achieve comparable results.

cma · 2025-06-20T23:04:30 1750460670

His group actually used GPUs early (earlier) on and won a competition but didn't get the same press.

chermi · 2025-06-21T00:51:26 1750467086

I don't understand how he's at fault for the field being behind where it maybe could've been, especially the language "held back"? Did he actively discourage people in against trying his ideas as compute grew?

goldemerald · 2025-05-20T18:17:00 1747765020

This is an interesting line of research but missing a key aspect: there's (almost) no references to the linear representation hypothesis. Much work on neural network interpretability lately has shown individual neurons are polysemantic, and therefore practically useless for explainability. My hypothesis is fitting linear probes (or a sparse autoencoder) would reveal linearly semantic attributes.

It is unfortunate because they briefly mention Neel Nanda's Othello experiments, but not the wide array of experiments like the NeurIPS Oral "Linear Representation Hypothesis in Language Models" or even golden gate Claude.

akarshkumar0101 · 2025-05-20T18:56:51 1747767411

We mention this issue exactly in the fourth paragraph in Section 4 and in Appendix F!

goldemerald · 2025-05-20T19:07:39 1747768059

That is addressing the incomprehensibility of PCA and applying a transformation to the entire latent space. I've never found PCA to be meaningful for deep learning. As far as I can tell, polysemous issue with neurons cannot be addressed with a single linear transformation. There is no sparse analysis (via linear probes or SAEs) and hence the unaddressed issue.

ipunchghosts · 2025-05-20T18:40:48 1747766448

Is what your saying imply that there is a rotation matrix you can apply to each activation output to make it less entangled?

goldemerald · 2025-05-20T18:47:25 1747766845

Not quite. For an underlying semantic concept (e.g., smiling face), you can go from a basis vector [0,1,0,...,0] to the original latent space via a single rotation. You could then induce said concept by manipulating the original latent point by traversing along that linear direction.

ipunchghosts · 2025-05-20T18:52:48 1747767168

I think we are saying the same thing. Please correct me though where I am wrong. You could look at the maps in some way but instead of the basis being one hot dimensions (the standard basis), it could be rotated.

akarshkumar0101 · 2025-05-20T18:57:07 1747767427

We mention this issue exactly in the fourth paragraph in Section 4 and in Appendix F!

goldemerald · 2025-03-12T04:20:54 1741753254

I've been lightly following this type of research for a few years. I immediately recognized the broad idea as stemming from the lab of the ridiculously prolific Stefano Ermon. He's always taken a unique angle for generative models since the before times of GenAI. I was fortunate to get lunch with him in grad school after a talk he gave. Seeing the work from his lab in these modern days is compelling, I always figured his style of research would break out into the mainstream eventually. I'm hopeful the the future of ML improvements come from clever test-time algorithms like this article shows. I'm looking forward to when you can train a high quality generative model without needing a super cluster or webscale data.

imjonse · 2025-03-12T06:26:19 1741760779

Some of their research had already broken out into mainstream, DDIM at least was their paper and probably others too in the diffusion domain.

goldemerald · on Jan 14, 2025

DinoV2 is an unsupervised model. It learns both a high quality global image representation and local representations with no labels. It's becoming strikingly clear that foundation models are the go to choice for common data types of natural images, text, video, and audio. The labels are effectively free, the hard part now is extracting quality from massive datasets.

goldemerald · on Dec 28, 2024

While I love XAI and am always happy to see more work in this area, I wonder if other people use the same heuristics as me when judging a random arxiv link. This paper has one author, was not written in latex, and no comment referencing a peer reviewed venue. Do other people in this field look at these same signals and pre-judge the paper negatively?

I did attempt to check my bias and skim the paper, it does seem well written and takes a decent shot towards understanding LLMs. However, I am not a fan of black-box explanations, so I didn't read much (I really like Sparse autoencoders). Has anyone else read the paper? How is the quality?

mnky9800n · on Dec 28, 2024

I think that we should not accept peer review as some kind of gold standard anymore for several reasons. These are my opinions based on my experience as a scientist for the last 11 years.

- its unpaid work and often you are asked to do it too much and therefore may not give your best effort

- editors want to have high profile papers and minimise review times so glossy journals like nature or science often reject things that require effort on the review

- the peers doing a review are often anything but. I have seen self professed machine learning “experts” not know the difference between regression and classification yet proudly sign their names to their review. I’ve seen reviewers ask you to write prompts that are mean and cruel to an LLM to see if it would classify test data the same (text data from geologists writing about rocks). As an editor I have had to explain to adult tenured professor that she cannot write in her review that the authors were “stupid” and “should never be allowed to publish again”.

chongli · on Dec 28, 2024

A further issue is peer review quid pro quo corruption. The reviewer loves your paper but requests one small change: cite some of his papers and he’ll approve your paper.

I don’t know how prevalent this sort of corruption is (I haven’t read any statistical investigations) but I have heard of researchers complaining about it. In all likelihood it’s extremely prevalent in less reputable journals but for all we know it could be happening at the big ones.

The whole issue of citations functioning like a currency recalls Goodhart’s Law [1]:

”When a measure becomes a target, it ceases to be a good measure.”

[1] https://en.wikipedia.org/wiki/Goodhart's_law

mnky9800n · on Dec 28, 2024

Tbh I used to have an issue with that but these days it really is a small issue in the grand scheme of things. You can say No but also, there are larger systemic problems in science.

chongli · on Dec 28, 2024

You’re right. It’s more of a symptom of the systemic problems than the main problem itself. But it still contributes to my distrust in science.

3abiton · on Dec 28, 2024

Scientific peer review is another facit of civilization that its current design does not allow it to scale well. More and more people are being involved in the process, but the qualityis forever going down.

mnky9800n · on Dec 28, 2024

Yes that’s right. It’s a scaling problem and there isn’t a clear answer. It’s easy to complain about it though haha. I think what is happening is science is atomitizing. People are publishing smaller amounts or simply creating ideas from nothing (like that science advances paper on hacker news a couple days ago that created a hydrogen rich crust from thin air).

cauliflower2718 · on Dec 28, 2024

It looks like it's written in latex to me. Standard formatting varies across departments, and the author is in the business school at CMU.

In some fields, single author papers are more common. Also, outside of ML conference culture, the journal publication process can be pretty slow.

Based on the above (which is separate from an actual evaluation of the paper), there are no immediate red flags.

Source: I am a PhD student and read papers across stats/CS/OR.

ersiees · on Dec 28, 2024

Another clue: there is no way to download the latex, while you can if someone uploaded the latex on arxiv.

cauliflower2718 · on Dec 28, 2024

There's a lazy way to submit to arxiv, which is to submit just the PDF, even if you did it in latex. Sometimes it can be annoying to organize the tex files to submit to arxiv. It's uncommon, but the font and math rendering are the standard latex font.

woolion · on Dec 28, 2024

The Latex feel comes in good part from the respect for typographical standards that is encoded as default behaviour. In this document, so many spacings are just flat-out wrong, first paragraph indents, etc. If it's indeed Latex (it kinda looks like it), someone worked hard to make it look bad.

The weirdest thing is that copy-paste doesn't work; if I copy the "3.1" of the corresponding equation, I get " . "

refulgentis · on Dec 28, 2024

> I wonder if other people use the same heuristics as me when judging a random arxiv link.

My prior after the header was the same as yours. The fight and interesting part is in the work past the initial reaction.

i.e. if I react with my first order, least effort, reaction, your comment leaves the reader with a brief, shocked, laugh at you seemingly doing performance art. A seemingly bland assessment and overly broad question...only to conclude with "Has anyone else read the paper? Do you like it?"

But that's not what you meant. You're geniunely curious if its a long tail, inappropriate, reaction to have that initial assessment based on pattern matching. And you didn't mean "did anyone else read it", you meant "Humbly, I'm admitting I'm skimmed, but I wasn't blown away for reasons X, Y, and Z. What do you all think? :)"

The paper is superb and one of the best I recall reading in recent memory.

It's a much whiter box than Spare Autoencoders. Handwaving what a bag of floats might do in general is much less interesting or helpful than being able to statistically quantify the behavior of the systems we're building.

The author is a PhD candidate at the Carnegie Mellon School of Business, and I was quite taken with their ability to hop across fields to get a rather simple and important way to systematically and statistically review the systems we're building.

apstroll · on Dec 28, 2024

This paper is doing exactly that though, handwaving with a couple of floats. The paper is just a collection of observations about what their implementation of shapley value analysis gives for a few variations of a prompt.

refulgentis · on Dec 28, 2024

You have an excellent point. Bear with me.

I realized when writing this up that saying SAE isn't helpful but this is comes across as perhaps devils advocating. But I came across this in a stream of consciousness while writing, so I had to take a step back and think through it before editing it out.

Here is that thinking:

If I had a model completely mapped using SAE, at most, I can say "we believe altering this neuron will make it 'think' about the golden gate bridge more when it talks" ---- that's really cool for mutating behavior, don't get me wrong, it's what my mind is drawn to as an engineer.

However, as a developer of LLMs, through writing the comment, I realized SAE isn't helpful for qualifying my outputs.

For context's sake, I've been laboring on a LLM client for a year with a doctor cofounder. I'm picking these examples because it feels natural, not to make them sound fancy or important

Anyways, let's say he texts me one day with "I noticed something weird...every time I say 'the patient presents with these symptoms:' it writes more accurate analyses"

With this technique, I can quantify that observation. I can pull 20 USMLE questions and see how it changes under the two prompts.

With SAE, I don't really have anything at all.

There's a trivial interpretation of that: ex. professionals are using paid LLMs, and we can't get SAE maps.

But there's a stronger interpetation too: if I waved a magic wand and my cofounder was running Llama-7-2000B on their phone, and I had a complete SAE map of the model, I still wouldn't be able to make any particular statement at all about the system under test, other than "that phrase seems to activate these neurons" -- which would sound useless / off-topic / engineer masturbatory to my cofounder.

But to my engineering mind, SAE is more appealing because it reveals how it works fundamentally. However, I am overlooking that it still doesn't say how it works, just a unquantifiable correlation between words in a prompt and what floats get used. To my users, the output is how it works.

johndough · on Dec 28, 2024

Two more heuristics:

1. The figures are not vectorized (text in figures can not be selected). All it takes is to replace "png" in `plt.savefig("figure.png")` with "pdf", so this is a very easy fix. Yet the author did not bother, which shows that he either did not care or did not know.

2. The equations lack punctuation.

Of course you can still write insightful papers with low quality figures and unusual punctuation. This is just a heuristic after all.

chongli · on Dec 28, 2024

I didn’t even read the paper, I just read the abstract. I was really impressed by the idea of using Shapley values to investigate how each token in a prompt affects the output, including order-based effects.

Even if the paper itself is rubbish I think this approach to studying LLMs at least warrants a second look by another team of researchers.

goldemerald · on Dec 23, 2024

Why not actually release the weights on huggingface? The popular SAE_lens repo has a direct way to upload the weights and there are already hundreds publicly available. The lack of training details/dataset used makes me hesitant to run any study on this API.

Are images included in the training?

What kind of SAE is being used? There have been some nice improvements in SAE architecture this last year, and it would be nice to know which one (if any) is provided.

tMcGrath · on Dec 23, 2024

We're planning to release the weights once we do a moderation pass. Our SAE was trained on LMSys (you can see this in our accompanying post: https://www.goodfire.ai/papers/mapping-latent-spaces-llama/).

No images in training - 3.3 70B is a text-only model so it wouldn't have made sense. We're exploring other modalities currently though.

SAE is a basic ReLU one. This might seem a little backwards, but I've been concerned by some of the high-frequency features in TopK and JumpReLU SAEs and the recent SAE (https://arxiv.org/abs/2407.14435, Figure 14), and the recent SAEBench results (https://www.neuronpedia.org/sae-bench/info) show quite a lot of feature absorption in more recent variants (though this could be confounded by a number of things). This isn't to say they're definitely bad - I think it's quite likely that TopK/JumpReLU are an improvement, but rather that we need to evaluate them in more detail before pushing them live. Overall I'm very optimistic about the potential for improvements in SAE variants, which we talk a bit about at the bottom of the post. We're going to be pushing SAE quality a ton now we have a stable platform to deploy them to.

goldemerald · on May 29, 2024

"Ready to -dive- delve in?" is an amazingly hilarious reference. For those who don't know, LLMs (especially ChatGPT) use the word delve significantly more often than human created content. It's a primary tell-tale sign that someone used an LLM to write the text. Keep an eye out for delving, and you'll see it everywhere.

threeseed · on May 29, 2024

I believe this was debunked as just a US-centric view.

In places that grew up learning UK English we use delve not that dissimilar to ChatGPT.

7d7n · on May 29, 2024

haha I'm glad you noticed!

it's originally "Ready to ~~delve~~ dive in?" but something got lost in translation

goldemerald · on May 17, 2024

Solution 4 is so hilariously bad I am shocked it was suggested. Building a 2d landscape where the time dimension seems to take a random walk made laugh a lot. Ignoring the standard convention of "independent variable on x-axis" and instead embedding it as datapoints is a particularly clever way to obfuscate the data and confuse the reader.

msm_ · on May 17, 2024

I don't agree. It's a great way to visualise data when you want to focus on a trend. It makes it very obvious which "direction" is the data heading. But of course it is not very often used, is not a great fit for every use case (in particular, bad for the data in OP) and may be confusing when seen first time.

sixthlime · on May 17, 2024

I thought so at first too, but if you look at the link they included [1] it seems like it can actually be quite clear for some datasets

[1] https://archive.nytimes.com/www.nytimes.com/imagepages/2010/...

goldemerald · on May 8, 2024

Great work! I'd love to start using the language model variant of your work. Do you know when/if it will be open sourced? I'd start using it today if it were that soon.