
How large a model can you use with your 128GB M3? Anything you can share would be great to hear: number of parameters, quantization, which model, etc.


I'm running 123B parameter Mistral Large with no issues. Larger models will run, too, but slowly. I wish Ollama had support for speculative decoding.
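A rough back-of-envelope (my own sketch, not from the thread) for why a 123B-parameter model can fit in 128GB once quantized — weight memory is roughly parameters times bits per weight, and the quantization labels below are common llama.cpp-style formats used for illustration:

```python
# Rough weight-memory estimate for a quantized LLM.
# Ignores KV cache and runtime overhead, which add several GB more.

def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Mistral Large: ~123B parameters at a few common precisions.
for bits, label in [(16, "fp16"), (8, "q8_0"), (4, "q4_K_M")]:
    print(f"{label}: ~{model_size_gb(123e9, bits):.0f} GB")
```

At fp16 the weights alone (~246 GB) would not fit, while a 4-bit quantization (~62 GB) leaves comfortable headroom on a 128GB machine.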


Thanks for the reply. Is that quantized? And what's the bit size of the floating point values in that model (apologies if I'm not asking the question correctly).



