
In addition to what the other commenter said about Moore's law, innovations like FlashAttention, which cut attention's memory usage by over 10x, and FlashAttention-2, which made big gains in compute efficiency, show there is still a lot of room to improve the models and inference algorithms themselves. Even without more compute, we likely haven't scratched the surface of efficient transformers.
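For anyone curious, the core memory trick in FlashAttention is to never materialize the full n×n score matrix: keys/values are streamed in blocks while keeping a running row-max and softmax normalizer (the "online softmax"), so memory for attention drops from O(n²) to O(n·d). Here's a minimal numpy sketch of that idea, not the actual fused CUDA kernel; all function and variable names are my own:

```python
import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full (n x n) score matrix: O(n^2) memory.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def tiled_attention(Q, K, V, block=4):
    # FlashAttention-style streaming: process K/V one block at a time,
    # keeping a running row max (m) and normalizer (l) per query so the
    # full score matrix is never held in memory at once.
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((n, d))
    m = np.full(n, -np.inf)   # running row max
    l = np.zeros(n)           # running softmax denominator
    for j in range(0, K.shape[0], block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T * scale                 # only an (n x block) tile
        m_new = np.maximum(m, S.max(axis=-1))
        P = np.exp(S - m_new[:, None])
        correction = np.exp(m - m_new)       # rescale earlier partials
        l = l * correction + P.sum(axis=-1)
        O = O * correction[:, None] + P @ Vj
        m = m_new
    return O / l[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 5)) for _ in range(3))
print(np.allclose(naive_attention(Q, K, V), tiled_attention(Q, K, V)))
```

The online-softmax rescaling makes the tiled version exact (up to float rounding), not an approximation, which is why the check above passes; the speedups in the real kernel come from doing this tiling inside fast on-chip SRAM.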

