E.g., imagine an input of "red", followed by 32 bits of randomness, followed by "blue" forever. A Markov chain could learn that red leads to blue 32 bits later; it would just need to learn 2^32 states to do it.
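To make the state blow-up concrete, here's a minimal Python sketch (the sequence shape and names like `make_sequence` are just illustrative, not from any particular library):

```python
import random

def make_sequence(n_blue=4):
    """Build one example: "red", then 32 random bits, then "blue" onward."""
    seq = ["red"]
    seq += [str(random.getrandbits(1)) for _ in range(32)]
    seq += ["blue"] * n_blue  # "blue forever", truncated for display
    return seq

print(" ".join(make_sequence()))

# To predict the first "blue", a fixed-order Markov model has to condition
# on the whole window back to "red", i.e. on all 32 random bits. That means
# one context (state) per possible bit pattern:
n_states = 2 ** 32
print(f"contexts a Markov model of that order must track: {n_states:,}")
# -> 4,294,967,296
```

The dependency itself is trivial (red always implies blue), but because a Markov chain can only condition on a fixed window of raw symbols, the intervening randomness forces it to enumerate every possible window.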
A few more leaps and we should eventually get models small enough to get close to the information-theoretic lower bound on compression.