Hacker News

You can try llama.cpp with a small model; I'd suggest a 4-bit 7B model. It runs slowly on my M1 MacBook with 16 GB of RAM, so even if it does work, it will be quite painful.

I run the 30B 4-bit model on my M2 Mac mini with 32 GB and it works okay; the 7B model is blazingly fast on that machine.


