Continuous Autoregressive Language Models (arxiv.org)
3 points by guybedo 74 days ago | 1 comment


Abstract:

We introduce Continuous Autoregressive Language Models (CALM), a paradigm shift from discrete next-token prediction to continuous next-vector prediction.

CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector, from which the original tokens can be reconstructed with over 99.9% accuracy.

This allows us to model language as a sequence of continuous vectors instead of discrete tokens, which reduces the number of generative steps by a factor of K.
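
To make the "factor of K" arithmetic concrete, here is a rough sketch of how a token sequence might be grouped into chunks of K before each chunk is encoded as one vector. It only illustrates the step count; it is not the paper's code. The helper names `chunk_tokens` and `generative_steps`, the `<pad>` placeholder, and the choice to pad the final partial chunk are all assumptions made for this example, and the autoencoder itself is not modeled.

```python
import math


def chunk_tokens(tokens, k, pad="<pad>"):
    """Split tokens into consecutive chunks of exactly k items.

    Padding the last chunk is an assumption for this sketch; the abstract
    does not say how a trailing partial chunk is handled.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    chunks = [tokens[i:i + k] for i in range(0, len(tokens), k)]
    if chunks and len(chunks[-1]) < k:
        chunks[-1] = chunks[-1] + [pad] * (k - len(chunks[-1]))
    return chunks


def generative_steps(num_tokens, k):
    """Number of autoregressive steps if each step emits one chunk of k tokens."""
    return math.ceil(num_tokens / k)


tokens = "the quick brown fox jumps over the lazy dog".split()
chunks = chunk_tokens(tokens, k=4)

# Token-level generation needs one step per token; chunk-level generation
# needs one step per chunk (one continuous vector each).
print(len(tokens), "token-level steps")
print(generative_steps(len(tokens), 4), "chunk-level steps")
for chunk in chunks:
    print(chunk)
```

With 9 tokens and K = 4, this yields 3 chunk-level steps instead of 9 token-level steps; the reduction is exactly K only when the sequence length is a multiple of K.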



