cubefox | 11 months ago | on: Theoretical limitations of multi-layer Transformer
Loosely related thought: a year ago there was a lot of talk about the Mamba SSM architecture replacing transformers. Apparently that hasn't happened so far.
thesz | 11 months ago
Just as neural networks (maybe) evolved to make Adam their best optimizer [1], LLMs evolve to make transformers their best building block.
[1] https://parameterfree.com/2020/12/06/neural-network-maybe-ev...