Hacker News | z3c0's comments

The deposing of the Shah only took three days. Three days to create half a century of turmoil.

https://en.wikipedia.org/wiki/1953_Iranian_coup_d%27%C3%A9ta...


The Shah and the current state of Iran come to mind.

No. The most similar precedent was the Panama invasion and arrest of Noriega.

..."first with Bitcoin"? Is that the narrative we're buying now? That this problem starts at Bitcoin? Not the coal-fired electrical grid fueling it?


Any proof-of-work cryptocurrency is literally a system that incentivizes runaway, self-reinforcing energy consumption like no other product in human history. Bringing it into the world is the dumbest idea that any human has ever had. I hope Satoshi is ashamed of his mental lapse.


Just purchased it. I never would have read it otherwise.


Seems like the marketing strategy for the book worked.


Don't cut yourself on that edge.

It's not terribly insightful to recognize that the publishers are trying to make the best of a bad situation.


I don't even frame my requests conversationally. They usually read like brief demands, sometimes just comma-delimited technologies followed by a goal. Works fine for me, but I also never prompt anything that I don't already understand how to do myself. Keeps the cart behind the horse.


I've started putting in my system prompt "keep answers brief and don't talk in the first/second person". Gets rid of all the annoying sycophancy and stops it from going on for ten paragraphs. I can ask for more detail when I need it.


Wherein the assertion was made (by exclusion) that wind and solar are not energy sources. It seems the real intention was to cull renewables after all, though I doubt anyone is surprised.


Prompt engineers who realized that nobody is buying their bullshit.

Stripped of the hype, it's just a JavaScript developer who spends their time arguing with APIs in a more literal fashion than those before them.


I think this is a natural extension of the commodification of SWEs over the last 10-20 years, as the newest easy way to make six figures.


Indeed. I can't fault people for wanting to give their careers a boost in these increasingly trying times. As someone who stepped into analytics just in time to catch the wave (10 years ago), I can understand why someone would want to hop aboard.

That said, I at least took the time to learn the maths.


The statistical certainty is indeed present in the model. Each token comes with a probability; if your softmax results approach a uniform distribution (i.e. all candidate tokens at the given temperature have near-equal probabilities), then the next most likely token is very uncertain. Reporting the probabilities of the returned tokens can help the user understand how likely hallucinations are. However, that information is deliberately obfuscated now, to prevent distillation techniques.
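To make the "near-uniform softmax means high uncertainty" point concrete, here's a minimal sketch (the logit values are invented for illustration): the entropy of the softmax distribution peaks when all tokens are equally likely, so it can serve as a per-step uncertainty signal.

```python
import numpy as np

# Hypothetical next-token logits; nearly equal on purpose.
logits = np.array([2.0, 1.9, 2.1, 1.95])

# Softmax (shifted by the max for numerical stability).
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Entropy is maximal (log of vocab size) for a uniform distribution,
# i.e. when the model is maximally uncertain about the next token.
entropy = -np.sum(probs * np.log(probs))
max_entropy = np.log(len(probs))
print(f"entropy = {entropy:.3f} (max possible {max_entropy:.3f})")
```

With these near-equal logits the entropy lands just under the maximum, which is exactly the regime the comment describes as "very uncertain".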


That is not the same thing! You are talking about the point distribution of the next token. We are talking about the uncertainty associated with each of those candidate tokens; a distribution of distributions.

It's the difference between a categorical distribution and a Dirichlet. https://en.wikipedia.org/wiki/Dirichlet_distribution
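The categorical-vs-Dirichlet distinction can be shown in a few lines (numbers chosen for illustration): a categorical is one fixed probability vector, while a Dirichlet is a distribution over such vectors, so its spread quantifies how uncertain the vector itself is.

```python
import numpy as np

rng = np.random.default_rng(0)

# A categorical distribution: one fixed probability vector over 4 tokens.
categorical = np.array([0.7, 0.1, 0.1, 0.1])

# A Dirichlet is a distribution *over* such vectors. Its mean here is
# alpha / alpha.sum() == the categorical above, but each draw is a
# different plausible probability vector.
alpha = np.array([7.0, 1.0, 1.0, 1.0])     # concentration parameters
samples = rng.dirichlet(alpha, size=1000)  # 1000 candidate vectors

# The spread across draws is the "uncertainty about the distribution"
# that a single categorical cannot express.
print("mean:", samples.mean(axis=0))
print("std: ", samples.std(axis=0))
```

Scaling all of `alpha` up shrinks the spread toward zero while leaving the mean unchanged, which is the "distribution of distributions" idea in one knob.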


I think we're talking about the same thing. I should be clear that I don't think the selected token probabilities being reported are enough, but if you're reporting each returned token's probability (both selected and discarded) and aggregating the cumulative probabilities of the given context, it should be possible to see when you're trending toward uncertainty.


No, it isn't the same thing. The softmax probabilities are estimates; they're part of the prediction. The other poster is talking about the uncertainty in these estimates, so the uncertainty in the softmax probabilities.

The softmax probabilities are usually not a very good indication of uncertainty, as the model is often overconfident due to neural collapse. The uncertainty in the softmax probabilities is a good indication though, and can be used to detect out-of-distribution entries or poor predictions.
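One common way to get at the uncertainty *in* the softmax estimates (as opposed to the softmax itself) is disagreement across an ensemble or multiple stochastic forward passes. A toy sketch with invented logits: each member is individually confident, yet the variance across members flags the prediction as unreliable, which a single softmax would miss.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical logits from three ensemble members (or MC-dropout passes)
# for the same input; each member is confidently wrong in its own way.
member_logits = [
    np.array([2.0, 0.1, 0.0]),
    np.array([0.0, 2.1, 0.1]),
    np.array([0.1, 0.0, 2.0]),
]
member_probs = np.stack([softmax(z) for z in member_logits])

# Every row is a peaked (low-entropy) distribution, but the members
# disagree: the std across members exposes the estimation uncertainty.
mean_probs = member_probs.mean(axis=0)
disagreement = member_probs.std(axis=0)
print("mean prediction:", mean_probs)
print("disagreement:   ", disagreement)
```

This is the overconfidence failure mode in miniature: each member's softmax says "certain", while the spread across members says "out of distribution, don't trust it".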


Agreed. All these attempts to benchmark LLM performance based on the interpreted validity of the outputs are completely misguided. It may be the semantics of "context" causing people to anthropomorphize the models (besides the lifelike outputs). Establishing context for humans is the process of holding external stimuli against an internal model of reality. Context for an LLM is literally just "the last n tokens". In that case, the performance would be how valid the most probable token was given the prior n tokens, which really has nothing to do with the perceived correctness of the output.


1700 directories at the project root...

