Hacker News: lpapez's comments

Very cool research and wonderfully written.

I was expecting an ad for their product somewhere towards the end, but it wasn't there!

I do wonder though: why would this company report this vulnerability to Mozilla if their product is fingerprinting?

Isn't it better for the business (albeit unethical) to keep the vulnerability private, to differentiate from the competitors? For example, I don't see many threat actors burning their zero days through responsible disclosure!


We don't use vulnerabilities in our products.

I don't understand what you mean. What separates this from other fingerprinting techniques your company monetizes?

No software wants to be fingerprinted. If it did, it would offer an API with a stable identifier. All fingerprinting is exploiting unintended behavior of the target software or hardware.


It makes sense to me; they're likely not trying to actually fingerprint Tor users. Those users will likely ignore ads, have JS disabled, etc. The real audience is people on the web using normal tooling.

Uhh okay, so they do exploit vulnerabilities, they just try to target victims who can be served ads? What a weird distinction.

Painting fingerprinting as a vulnerability exploit is your own very biased and very out-of-norm framing.

Well presumably they want to make money.

Side channels that enable intended behavior, versus a flat-out bug like the above, though the line can often be muddied by perspective.

An example that comes to mind is an anonymous app that allows blocking users: you can programmatically block users, query all posts, and diff the sets to identify stable identities. However, the ability to block users is desired by the app developers; they just may not have intended this behavior, and there's no immediate solution to it. This is different from 'user_id' simply being returned in the API for no reason, which is a vulnerability. Then there's maybe a case of the user_id being returned in the API for some reason that MIGHT be important too, but that could be implemented more sensibly another way; this leans more towards vulnerability.
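The block-and-diff idea above can be sketched in a few lines of shell. Everything here is hypothetical: the post IDs are made up, and a real attack would script the app's block API. The point is only that a set difference of visible posts, taken before and after blocking one account, reveals which posts that account wrote.

```shell
# Hypothetical data: post IDs visible in the feed before and after
# programmatically blocking one anonymous account.
printf '%s\n' p1 p2 p3 p4 p5 | sort > posts_before
printf '%s\n' p1 p3 p5       | sort > posts_after

# Posts present before but gone after the block were authored by the
# blocked account -- the set difference ties them to one identity.
comm -23 posts_before posts_after   # prints: p2 and p4
```

Repeating this per account deanonymizes the whole feed, which is why this intended feature doubles as a side channel.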

Ultimately most fingerprinting technologies use features that are intended behavior; Canvas/font rendering is useful for some web features (and the web target means you have to support a LOT of use cases), IP address/cookies/useragent obviously are useful, etc (though there's some case to be made about Google's pushing for these features as an advertising company!).


Iffy vs grossly unethical.

The real reason is that fingerprint.com's selling point is tracking over longer periods (months, their website claims), and this doesn't help them with that.

So it's the criminal that convinced themselves they're the good guys; I didn't expect that one. You are a malware company, get a grip.

Would you prefer that they kept this for themselves instead of disclosing it?

I get criticizing their business and what they do wrong, but it doesn't seem right to criticize them for doing the right thing.


It means they are suspect. I think it's right to be wary of motives if they are involved in the very thing they aim to bring awareness to. Questions arise in my mind as to why they would do something like this in the first place.

It's been my experience that the general public doesn't seem to follow patterns and instead focuses on which switch is toggled at any given moment for a company's ethical practices. This is the main reason why we are constantly gamed by orgs that have a big-picture view of crowd psychology.


They probably are not relying on it and disclosure means others can't either.

Recommendations from past workplaces and networking. Honestly never heard of anyone else being hired as a solo contributor outside those channels.

>hype-driven commercial product

>single-handedly keeping PHP relevant

While architecture astronauts are clutching pearls, I've built multiple profitable products with Laravel without caring the slightest about the internals, both before and after AI.

PHP was always all about just building stuff while ignoring code quality. Laravel is a natural extension of that approach. Let us live.


No, Symfony is singlehandedly keeping PHP relevant, to the point that every other framework depends on its packages, Laravel included.

Most people like you who don't care about code quality and want to "just build" another B2B SaaS unmaintainable pile of spaghetti are now purely relying on AI and not writing any code themselves anymore, so why use PHP at all instead of JS like all the other vibe coders?


> so why use PHP at all instead of JS like all the other vibe coders?

Because there is nothing remotely close to Laravel for JS. I don't want to think about auth, job queues, mailing, cache layers, auditing etc. I want an opinionated default from my framework that is thoroughly documented and part of the AI training corpus. Laravel gives that to me.


> Agile just finally embraced that specs are incomplete and can even be wrong because the writer of the spec does not yet really know or understand what they want. So they need working software to show the spec in action and then we can iterate on the results.

I agree, but what you describe is agile, not Agile (capital A).

Agile (capital A) is Scrum (capital S) where you have Backlog Grooming (patent pending) where the team clears any ambiguity to define a spec (ticket).

Deviating from said spec is seen as Scope Creep (gasp) and might lead to complaints during Sprint Review (trademark).

So yes, agile prefers working software over detailed spec. But typical manifestations of Agile (capital A) are exactly the opposite.


The US public discourse is so dehumanized today that anyone who is not "with them" is literally not a human anymore. Even within the country itself "the leftards" are considered an obstacle which can be removed if only enough force is applied.

Sending armed agents at protesters is seen as being the same thing as sending pest control to clear out beaver dams on the creek. Nobody cares what the beavers think, they are not human, they do not have feelings. They are simply a menace to be dealt with.


The supporters of imperialism are all about nonviolent protest and democratic principles if it seems feasible it could bring about US foreign policy goals: https://news.ycombinator.com/item?id=47111067

Or, if an anonymous and uncorroborated source claims tens of thousands of said protestors were allegedly massacred.

If it doesn't, and the strategy now involves blowing up desalinization plants ( https://apnews.com/article/trump-iran-threat-desalination-pl... ) and invoking a humanitarian crisis on the level of a nuclear catastrophe, well... then they're a bit less concerned about human rights.


[flagged]


It took 8 years the last time.


[flagged]


> The conservatives, when they protest (Tea Parties) leave public spaces in fine shape

We're just skipping Charlottesville and the Capitol? We have idiots on both of our fringes. But only one of them is in power right now.


Jan 6th.

Even the example you gave is incorrect. Lol. It's so obvious when conservatives cherry pick information to placate their views.


Pelosi's Folly? Give the other leg a pull too, please.


> The conservatives, when they protest (Tea Parties) leave public spaces in fine shape.

As long as you ignore the feces smeared on the walls and the injured police.


You can simply add a shell alias with whatever name you like and move on.


True, but easier said than done, because one often needs to work in more shells than just the one on their local machine.


This is a nonstandard tool. If you can't customize your machine, you already don't have it.


But it could be one day.


Do something like this to fall back to plain grep. You will somehow have to share these configurations across machines though.

    alias g=grep
    command -v rg >/dev/null 2>&1 && alias g=rg


Sixty-six words about capitalizing things properly and you think this is something that learning about shell aliases can solve. Impressive.


You can't in most corporate env machines.

You may be able to download ripgrep, and execute it (!), but god forbid you can create an alias in your shell in a persistent manner.


> You can't in most corporate env machines.

Really? "most" even? What CAN you do if you can't edit files in your own $HOME?


`[citation needed]`


huh? If you can download and execute files, you can alias it. Either in your .bashrc file, or by making a symlink.
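Both approaches in one sketch. The fake `rg` script below is just a stand-in for a downloaded ripgrep binary, and the scratch directory stands in for a directory on your `$PATH`; adapt the paths to your setup.

```shell
# Stand-in for a downloaded ripgrep binary, placed in a scratch dir.
bindir=$(mktemp -d)
printf '#!/bin/sh\necho rg-ran\n' > "$bindir/rg"
chmod +x "$bindir/rg"

# Option 1: an alias. Append the same line to ~/.bashrc to persist it.
alias g="$bindir/rg"

# Option 2: a symlink under the short name, in a directory on $PATH.
ln -s "$bindir/rg" "$bindir/g"
"$bindir/g"   # prints: rg-ran
```

The symlink has the advantage of working from scripts and other shells, since aliases are only expanded in interactive sessions by default.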


I daily drive Linux, but I hop from client to client and have probably served about 200 different organizations so far.

Most corporate machines are Windows boxes with PowerShell and cmd.exe heavily restricted, no admin rights, and anti-malware software surveilling I/O like a hawk.

You might get Git Bash if you are lucky, but it's usually so slow it's completely unusable.

At one client I once tried to sneak in Clink. It was flagged instantly by security and reported to HR.

It's easy to forget that life outside the HN bubble is still stuck there.


How can you possibly get development work done in an environment where you can't even create a Microsoft.PowerShell_profile.ps1?


You will be even more horrified to learn that installing the entire list of deps of a project that would take a few seconds on my home laptop may take up to 20 minutes at some clients because many FS calls do a network round-trip.

We are not talking about exceptions either. This is pretty standard stuff when you work outside of the IT-literate companies.

At one client, they provided me with a part-time tester but neglected to give him the permissions to install Git. It took 3 weeks to fix.

The same client makes us develop on Windows machines but deploy on Linux pods. We can't directly test on the pods, nor connect to them, only deploy to them. In fact, we don't even have the specs of the pods; I had to create a whole API endpoint in the project just to be able to fetch them.

Other things I got to enjoy:

- A CTO storing the passwords of all the servers in a LibreOffice file

- A lead testing in prod, as root, by copying files over FTP. No version control.

- A sysadmin with an interesting way of managing his servers: he remote-controlled one particular Windows machine using TeamViewer, which was the only one that could connect to them through SSH.

The list is quite long.

This makes you see the entire world with a whole new perspective.

I always thought that all devs should spend a year doing tech support for a variety of companies so that they get a reality check on what most humans actually have to deal with when working on a computer.

If you are on HN, you are the 1%.


Aged like already spoiled milk.


All of those matter, making this whole situation even more unjustified.


If you only realized how ridiculous your statement is, you never would have stated it.


It's also literally factually incorrect. Pretty much the entire field of mechanistic interpretability would obviously point out that models have an internal definition of what a bug is.

Here's the most approachable paper that shows a real model (Claude 3 Sonnet) clearly having an internal representation of bugs in code: https://transformer-circuits.pub/2024/scaling-monosemanticit...

Read the entire section around this quote:

> Thus, we concluded that 1M/1013764 represents a broad variety of errors in code.

(Also the section after "We find three different safety-relevant code features: an unsafe code feature 1M/570621 which activates on security vulnerabilities, a code error feature 1M/1013764 which activates on bugs and exceptions")

This feature fires on actual bugs; it's not just a model pattern matching saying "what a bug hunter may say next".


Was this "paper" eventually peer reviewed?

PS: I know it is interesting and I don't doubt Anthropic, but for me it is so fascinating that they get such a pass in science.


This is more of an article describing their methodology than a full paper. But yes, there are plenty of peer-reviewed papers on this topic: scaling sparse autoencoders to produce interpretable features for large models.

There has been a ton of peer-reviewed work on SAEs in the past 2 years; some of it has been presented at conferences.

For example: "Sparse Autoencoders Find Highly Interpretable Features in Language Models" https://proceedings.iclr.cc/paper_files/paper/2024/file/1fa1...

"Scaling and evaluating sparse autoencoders" https://iclr.cc/virtual/2025/poster/28040

"Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning" https://proceedings.neurips.cc/paper_files/paper/2024/hash/c...

"Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2" https://aclanthology.org/2024.blackboxnlp-1.19.pdf


Modern ML is old school mad science.

The lifeblood of the field is proof-of-concept pre-prints built on top of other proof-of-concept pre-prints.


Sounds like you agree this “evidence” lacks any semblance of scientific rigor?


(Not GP) There was a well-recognized reproducibility problem in the ML field before LLM-mania, and that's considering published papers with proper peer review. The current state of affairs is in some ways even less rigorous than that, and then some people in the field feel free to overextend their conclusions into other fields like neuroscience.


Frankly, I don't see a reason to give a shit.

We're in the "mad science" regime because the current speed of progress means adding rigor would sacrifice velocity. Preprints are the lifeblood of the field because preprints can be put out there earlier and start contributing earlier.

Anthropic, much as you hate them, has some of the best mechanistic interpretability researchers and AI wranglers across the entire industry. When they find things, they find things. Your "not scientifically rigorous" is just a flimsy excuse to dismiss the findings that make you deeply uncomfortable.


> This feature fires on actual bugs; it's not just a model pattern matching saying "what a bug hunter may say next".

You don't think a pattern matcher would fire on actual bugs?


Mechanistic interpretability is a joke, supported entirely by non-peer reviewed papers released as marketing material by AI firms.


Some people are still stuck in the “stochastic parrot” phase and see everything regarding LLMs through that lens.


Current LLMs do not think. Just because we anthropomorphize the repetitive actions a model is looping through does not mean it is truly thinking or reasoning.

On the flip side the idea of this being true has been a very successful indirect marketing campaign.


What does “truly thinking or reasoning” even mean for you?

I don’t think we even have a coherent definition of human intelligence, let alone of non-human ones.


Everyone knows to really think you need to use your fleshy meat brain, everything else is cheating.


Oh, yes. The trope of "but what does it even mean to think".

If you can't speak, can you think? Yes. Large Language model. Thinking is not predicated on language.

A few good starts for you. Please refute all of these arguments in your spare time to convince me otherwise:

* https://machinelearning.apple.com/research/illusion-of-think...

* https://archive.is/FM4y8

* https://www.raspberrypi.org/blog/secondary-school-maths-show...


My point was not that I’m 100% convinced that LLMs can think or are intelligent.

My point was that we don’t have a great definition for (human) intelligence either. The articles you posted also don’t seem to be too confident in what human intelligence actually entails.

See https://en.wikipedia.org/wiki/Intelligence

> There is controversy over how to define intelligence. Scholars describe its constituent abilities in various ways, and differ in the degree to which they conceive of intelligence as quantifiable.

Given that an LLM isn’t even human but essentially an alien entity, who can confidently say they are intelligent or not?

I’m very skeptical of those who are very convinced one way or the other.

Are LLMs intelligent in the way that humans are? I’m quite sure they aren’t.

Are LLMs just stochastic parrots? I don’t find that framing convincing anymore either.

Either way it’s not clear; just check how this topic is discussed daily in most frontpage threads for the last couple of years.


Amazing idea and execution, the sort of stuff I wish there was more of on HN.

