During pre-training the model is learning next-token prediction, which is naturally additive. Even if you added DEL as a token, it would still be quite hard to transform the data so that it can be used in a next-token prediction task.
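To make the "additive" point concrete, here's a toy sketch (the <DEL> token and the edit trace are hypothetical, invented for illustration): ordinary scraped text gives you (prefix -> next token) training pairs for free, whereas deletions only show up if someone synthesizes edit histories, which the web doesn't record.

  # Next-token prediction: every position in ordinary text is a free training pair.
  tokens = ["The", "cat", "sat"]
  pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
  # -> [(["The"], "cat"), (["The", "cat"], "sat")]

  # A hypothetical <DEL> token only helps if the data contains edit traces like
  # this one -- but scraped text is final drafts, so these traces would have to
  # be synthesized somehow.
  edit_trace = ["The", "dog", "<DEL>", "cat", "sat"]  # "dog" typed, then deleted
  del_pairs = [(edit_trace[:i], edit_trace[i]) for i in range(1, len(edit_trace))]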
Hope that helps
Sure, but the extent to which you bend the truth to get those impressive numbers is absolutely gotcha-able.
Showing a new screen by default to everyone using your main product flow, and then claiming that everyone who sees it is a priori a "user", is absurd. And that is the only way they can get to 2 billion a month, by my estimation.
They could put a new yellow rectangle at the top of all Google search results and claim that the product launch has reached 2 billion monthly users and is one of the fastest-growing products of all time. Clearly absurd, and it's the same math as what they are saying here. I'm claiming my hot-take gotcha :)
Anthropic has amazing scientists and engineers, but when it comes to results that align with the narrative of LLMs being conscious, intelligent, or having similar properties, they tend to blow the results out of proportion.
Edit: In my opinion at least. Maybe they would say that if models are exhibiting that stuff 20% of the time nowadays, then we're a few years away from that reaching > 50%, or some other argument that I would probably disagree with.
Not necessarily meaningless, but maybe relative, i.e., a person who generally replaces non-Apple laptops every X years would replace MacBooks every Y years, with Y > X.
Mixture of Experts isn't using multiple models with different specialties; it's more of a sparsity technique, where you massively increase the number of parameters but use only a subset of the weights in each forward pass.
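For intuition, a toy PyTorch sketch (the class name, sizes, and plain-Linear "experts" are all made up; real MoE layers route to expert MLPs inside a transformer block). Total parameters scale with num_experts, but each token only pays the compute for top_k experts:

  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class TinyMoE(nn.Module):
      def __init__(self, dim=64, num_experts=8, top_k=2):
          super().__init__()
          # Many experts -> lots of parameters...
          self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
          self.router = nn.Linear(dim, num_experts)  # learned gating
          self.top_k = top_k                         # ...but only top_k run per token

      def forward(self, x):  # x: (num_tokens, dim)
          scores = self.router(x)                         # (num_tokens, num_experts)
          top_w, top_i = scores.topk(self.top_k, dim=-1)  # pick k experts per token
          top_w = F.softmax(top_w, dim=-1)                # normalize the k gate weights
          out = torch.zeros_like(x)
          for slot in range(self.top_k):
              for e, expert in enumerate(self.experts):
                  mask = top_i[:, slot] == e              # tokens whose slot-th pick is e
                  if mask.any():
                      out[mask] += top_w[mask, slot].unsqueeze(1) * expert(x[mask])
          return out

  y = TinyMoE()(torch.randn(16, 64))  # 8 experts' worth of params, 2 used per token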