
Loosely related thought: A year ago, there was a lot of talk about the Mamba SSM architecture replacing transformers. Apparently that hasn't happened so far.


Just as neural networks may have evolved to make Adam their best optimizer [1], LLMs are evolving to make transformers their best building block.

[1] https://parameterfree.com/2020/12/06/neural-network-maybe-ev...



