sigbottle's comments | Hacker News

the only actual humans in the loop here are the startup founders and engineers. Pretty cut-and-dried case here.

unless you want to blame the AI itself, from a legal perspective?


The defining idea of the "Markov property" is that the next state depends only on the current state, not on the history before it.

And in classes, the very first trick you learn to skirt around history is to add Boolean variables to your "memory state". Your system now models "did it rain on each of the previous N days?" The issue, obviously, is that this is exponential if you're not careful. Maybe you can get clever by making your state a sliding-window history, so the representation is linear in the number of days you remember. Maybe mix both. Maybe add even more information. Tradeoffs, tradeoffs.
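
Something like this toy sketch is what I mean (the rain model and the probabilities are completely made up, it's just to show the state-augmentation trick):

    import random

    # Toy "did it rain?" chain. The Markov state is the last N days of rain,
    # encoded as a tuple of booleans -- the "sliding window" memory trick.
    N = 3

    def rain_prob(window):
        # Hypothetical rule: more recent rainy days => rain is more likely.
        return 0.2 + 0.2 * sum(window)

    def step(window):
        rained_today = random.random() < rain_prob(window)
        # Slide the window: drop the oldest day, append today.
        return window[1:] + (rained_today,)

    state = (False,) * N  # no rain in the last N days
    for _ in range(10):
        state = step(state)
    print(state)

    # The representation is N booleans (linear in N), but the underlying state
    # space is still 2**N distinct windows -- hence "exponential if you're not
    # careful".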

I don't think LLMs embody the Markov property at all, even if you can make everything eventually follow the Markov property by just "considering every single possible state", of which there are (size of token set)^(length) states at minimum because of the KV cache.


The KV cache doesn't affect it because it's just an optimization. LLMs are stateless and don't take any input other than a fixed block of text. They don't have memory, which is exactly the requirement for a Markov chain.


Have you ever actually worked with a basic Markov problem?

The Markov property states that your next state is determined by transition probabilities that depend entirely on the previous state.

These states inhabit a state space. The way you encode "memory" if you need it, e.g. say you need to remember whether it rained in each of the last 3 days, is by expanding said state space. In that case, you'd go from tracking 1 variable to 3, i.e. 2^3 states if you need the precise binary information for each day. Being "clever", maybe you assume only the number of days it rained in the past 3 days matters, and you can get away with a 'linear' amount of memory.

Sure, an LLM is a "Markov chain" with state space size (# tokens)^(context length), at minimum. That's not a helpful abstraction, and it defeats the original purpose of the Markov observation. The entire point of the Markov observation is that you can represent a seemingly huge predictive model with just a couple of variables in a discrete state space, and ideally you're the clever programmer/researcher who can significantly collapse said space by being, well, clever.
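
To put rough numbers on it (the vocabulary size and context length below are made-up round figures, not any particular model's):

    import math

    # Back-of-the-envelope: treat an LLM as a Markov chain whose "state" is the
    # entire context window. Both numbers are illustrative placeholders.
    vocab_size = 50_000
    context_length = 8_192

    # Number of distinct states (possible contexts) in that framing:
    log10_states = context_length * math.log10(vocab_size)
    print(f"state space ~ 10^{log10_states:.0f}")  # roughly 10^38495

    # Compare with the toy weather chain: remembering 3 days of rain is
    # 2**3 = 8 states. The Markov framing only earns its keep when the state
    # space is small enough to actually reason about; "every possible context
    # is its own state" buys you nothing.
    print(2 ** 3)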

Are you deliberately missing the point or what?


> Sure, an LLM is a "Markov chain" with state space size (# tokens)^(context length), at minimum.

Okay, so we're agreed.


For me at least, I wasn't even under the impression that this was a possible research angle to begin with. Crazy stuff that people are trying, and very cool too!


If nothing else, that's a cool ass hypothesis.


Sorry, I'm not following the gun analogies at all

But regardless, I thought the point was that...

> The problem with using them is that humans have to review the content for accuracy.

There are (at least) two humans in this equation. The publisher, and the reader. The publisher at least should do their due diligence, regardless of how "hard" it is (in this case, we literally just ask that you review your OWN CITATIONS that you insert into your paper). This is why we have accountability as a concept.


Oh wow, nice catch in the article, jesus.


I don't know why, it's just an irrational form of first-principles admiration for me.

This is especially true in the age of LLMs (but the same can be applied to social media forums and the like). Sure, we should "just judge arguments on their merits", but there's something... suspicious. Like, a thought experiment: what if something came to a very reasonable-seeming argument in 10 minutes, versus 10 hours? I can't help but feel suspicious that I'm being tricked by some ad-hoc framing that is completely bogus in reality. "Obvious" conclusions can be obviously shaped by extremely hidden premises; things can be "locally logically correct" but horrible from a global view.

Maybe I'm way too cynical from seeing the same arguments over and over: people just stripping out their view of the elephant that they intuited in 5 minutes, then treating it as an authoritative slice, and stubbornly refusing to admit that that constraint is, well, a constraint, and not an "objective" slice. Like, yes, within your axioms and model, sure, but pretending you found a grand unification in 5 minutes is absurd, and in practice people behave this way online.

(Point being that, okay, even if you don't buy that argument when it comes to LLMs, when it comes to a distributed internet setting I feel my intuition holds much more strongly, for me at least. Even if everybody were truly an expert, argument JITing would still be a problem.)

Of course, in practice, when I do decide something is "valuable" enough for me to look at, I take apart the argument logically to the best of my ability, etc., but I've been filtering what to look at a lot more aggressively based on this criterion. And yes, it's a bit circular, but I think I've realized that with a lot of really complicated, wishy-washy things, well, they're hard for a reason :)

All that to say: yeah, the human element is important for me here :D. I find that, when it comes to consumption, if the person is a singular human, it's much harder to run into that issue. They at least have some semblance of consistency, and it's "real/emergent" in a sense. The more you learn about someone, the more they're truly unique. You can't just JIT a reductionist argument in 10 minutes.

IDK. Go small blogs!


You took a very specific argument, abstracted it, then posited your worldview.

What do you have to say about the circular trillions of dollars going around 7 companies building huge data centers, while expecting all the smaller players to just subsidize them?

Sure, you can elide the argument by saying, "actually that doesn't matter because I am really smart and understood what the author really was talking about, let me reframe it properly".

I don't really have a response to that. You're free to do what you please. To me, something feels very wrong with that and this behavior in general plagues the modern Internet.


I don't think Knuth does modern TCS stuff; the "old guard" (80s-ish) was focused on either classical algorithms / combinatorics, or the start of systems programming (db, network, os). Yes, Knuth did quite a bit of math in TAOCP, but they're very much "old" techniques.

Modern TCS is about unifying a lot of the ad-hoc approaches of old, as well as analyzing different models of computation that better model reality (EMM, streaming, distributed, etc).

I like both.


The issue is that real life is not adaptable. Resources and capital are slow.

That's the whole issue with monopolies for example, innit? We envision "ideal free market dynamics" yet in practice everybody just centralizes for efficiency gains.


> The issue is that real life is not adaptable. Resources and capital are slow.

> That's the whole issue with monopolies for example, innit?

The much bigger issue with monopolies is that there is no pressure on the monopolist to compete on price or quality of the offering.


Right, and my point is that "ideal free market dynamics" conveniently ignore this failure state, which always seems to emerge as a logical consequence of their own tenets.

I don't have a better solution, but it's a clear problem. Also, for some reason, more and more people (not you) will praise state A (the ideal equilibrium) and attack anyone who doesn't defend it, leaving no room to point out that state B is a logical consequence of A and requires intervention.


The definition of a monopoly basically resolves to "those companies that don't get pressured to meaningfully compete on price or quality"; it's a tautology. If a firm has to compete, it doesn't remain a monopoly. What's the point you're making here?

