Hacker News

BUILD AI has a post about this, in particular about sharding the KV cache across GPUs and how the network is the new memory hierarchy:

https://buildai.substack.com/p/kv-cache-sharding-and-distrib...


