AJRF's comments | Hacker News

This is adorable. Nice work!

How are you evaluating it against your expectations?

Lick your finger before you stick it in the air. Amplifies the signal.

I'd like to know this too. Whisper is hard to beat.

You need to make it so the map doesn't refresh when I click another pin; that's so annoying. I wanted to see how hectic my plan for a London day trip would be, but I lose the locality when clicking between different options on the map.

I've just spent the day reading and reviewing the absolute slop that comes out of these things :'(

Another good use case for a microservice: if you are going to have to change the compute size of your monolith just to accommodate the new functionality.

I had an architect bemoan the suggestion that we use a microservice, until he had to begrudgingly back down when he was told that the function we were talking about (running a CLIP model) would mean attaching a GPU to every task instance.


That chart at the start is egregious.


Feels like a tongue-in-cheek jab at the GPT-5 announcement chart.


The attempts at controlling the narrative feel a lot less subtle since Musk took over.

I bet dollars to donuts that they are tipping the scales to stoke tensions among UK users with things like migration and class division.

I only follow tech people on Twitter, but if you looked at my FYP you'd think I was deeply interested in UK politics, which I am not!


https://twitter.com/settings/your_twitter_data/twitter_inter...

I think you'll be surprised at what has been signed up for on your behalf.


Yup. Not just X, though. On Insta, even a slight misstep and you're up to your eyeballs in anti-migrant content from the algo.


Very uncool of Eric! Thank you for the work you've put in over the years.


Do you think sampling is deterministic?


Top-k sampling with temp = 0 should be pretty much deterministic (ignoring floating-point errors).
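
A rough sketch of what I mean, in Python (a toy function of my own, not any particular library's API): at temperature 0 the sampler collapses to a plain argmax over the logits, so the output is deterministic as long as the logits themselves come out bit-identical on every run.

    import numpy as np

    def sample_next_token(logits, temperature=1.0, k=50, rng=None):
        # Toy top-k sampler; at temperature 0 it degenerates to greedy argmax.
        if temperature == 0:
            return int(np.argmax(logits))            # deterministic given identical logits
        rng = rng or np.random.default_rng()
        top = np.argpartition(logits, -k)[-k:]       # ids of the k highest logits
        scaled = logits[top] / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return int(rng.choice(top, p=probs))         # stochastic only for temperature > 0

The only source of randomness here is rng.choice, and it never runs at temperature 0; the parenthetical about floating-point errors is the remaining caveat.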


> Ignoring floating point errors.

I think you mean non-associativity.

And you can’t ignore that.


Ignoring floating point errors, assuming a perfectly spherical cow, and taking air resistance as zero.


Imagine you are predicting the next token and you have two tokens very close in probability in the distribution. Kernel execution is not deterministic because of floating-point non-associativity, and the token that gets predicted affects all of the tokens later in the prediction stream, so it's very consequential which one gets picked.

This isn't some hypothetical: it happens all the time with LLMs. It isn't some improbable freak accident.
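
A contrived illustration of the mechanism (the numbers are made up to force the effect, not taken from a real model): float32 addition is not associative, so two valid reduction orders can produce slightly different logits, and when two tokens are nearly tied that tiny difference decides which one gets emitted. Every later token is then conditioned on that choice.

    import numpy as np

    # Three partial sums feeding one token's logit, accumulated in float32 as on a GPU.
    parts = np.array([1e8, -1e8, 1.0], dtype=np.float32)

    logit_run1 = (parts[0] + parts[1]) + parts[2]   # one reduction order  -> 1.0
    logit_run2 = (parts[0] + parts[2]) + parts[1]   # another order        -> 0.0 (the 1.0 is lost)

    rival = np.float32(0.5)                         # a competing token with a nearby logit
    print(np.argmax([logit_run1, rival]))           # 0 -> token A wins
    print(np.argmax([logit_run2, rival]))           # 1 -> token B wins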


Okay, yes, but would you really say that the main source of non-determinism in LLM usage stems from this? No, it's obviously the top-k sampling.

I don't think my tech lead was trying to suggest that floating-point non-associativity was the real source.


> Would you really say that the main source of non-determinism in LLM usage stems from this

Yes, I would, because it causes exponential divergence (P(correct) = (1 - e)^n, where e is the per-token flip probability) and there is no widely adopted solution. The major labs have very expensive researchers focused on this specific problem.
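
To put rough numbers on that compounding (illustrative values, not measurements): even a tiny per-token chance that a near-tie resolves differently adds up quickly over a long generation.

    # P(whole generation matches) = (1 - e)^n for per-token flip probability e
    e = 1e-3                              # assumed flip probability per token, purely illustrative
    for n in (100, 1_000, 10_000):
        print(n, round((1 - e) ** n, 5))  # ~0.90, ~0.37, ~0.00005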

There is a paper from Thinking Machines from September on batch-invariant kernels that you should read; it's a good primer on this issue of non-determinism in LLMs. You might learn something from it!

Unfortunately the method has quite a lot of overhead, but it's promising research all the same.


Alright, fair enough.

I don't think this is relevant to the main point, but it's definitely something I wasn't aware of. I would've thought it might have an impact on, like, the O(100)th token in some negligible way, but glad to learn.


This happened to me last night! I was going to bed and I clicked Update and Shut Down, then I went into the other room.

After a few minutes I could see the blue glow of my Windows background shining on the wall.

Glad it is fixed!

