
They are working with tiny models. Not sure how well it'd scale to bigger models (if at all).


They're all LLMs, so no, not tiny, but not exactly huge either:

> Our current deployment runs in a cross-region cluster comprising 213 H20 GPUs, serving twenty-eight 1.8–7B models (TP=1) and nineteen 32–72B models (TP=4).
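For context, a minimal sketch of what TP=1 vs TP=4 means in practice, assuming a vLLM-style serving stack (the thread doesn't name the actual one) and illustrative Qwen checkpoints in those size ranges:

    # Hypothetical sketch: the small models run unsharded on one GPU (TP=1),
    # while the 32-72B models have their weights sharded across 4 GPUs (TP=4).
    from vllm import LLM, SamplingParams

    # ~7B model fits on a single H20, so no tensor parallelism needed.
    small = LLM(model="Qwen/Qwen2.5-7B-Instruct", tensor_parallel_size=1)

    # ~72B model is split across 4 GPUs with tensor parallelism.
    large = LLM(model="Qwen/Qwen2.5-72B-Instruct", tensor_parallel_size=4)

    params = SamplingParams(temperature=0.7, max_tokens=128)
    print(small.generate(["What does TP=4 mean?"], params)[0].outputs[0].text)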



