Llama 4 Scout is a 16-expert MoE with 17B active parameters. The active-parameter count makes it faster to run, but the total memory requirements are still large. They claim it fits on a single H100, so under 80GB. A Mac Studio with 96GB could run it; by "run" I mean inference, and Ollama is easy to use for this. 4x 3090 NVIDIA cards would also work, but that's not the easiest PC build. The tinybox (https://tinygrad.org/#tinybox) is $15k and can do LoRA fine-tuning. You could also use a regular PC with 128GB of RAM, but it would be quite slow.
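A quick sketch of where the "fits on an H100" claim comes from. This assumes a total parameter count of roughly 109B for Scout (17B active x 16 experts, with shared layers) and a loose ~15% overhead factor for KV cache and activations; both numbers are back-of-envelope assumptions, not official figures.

```python
# Rough VRAM estimate at common quantization levels.
# TOTAL_PARAMS (~109B) and the 1.15 overhead factor are assumptions.
TOTAL_PARAMS = 109e9

def vram_gb(bits_per_param, overhead=1.15):
    # bytes = params * (bits / 8), scaled by overhead, converted to GB
    return TOTAL_PARAMS * (bits_per_param / 8) * overhead / 1e9

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{vram_gb(bits):.0f} GB")
```

At 4-bit quantization this lands around 60-65GB, which is why a single 80GB H100 (or a 96GB Mac Studio) is plausible for inference, while fp16 would need several hundred GB.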

