
What do you mean there is no such thing as R1-1.5b? DeepSeek released a distilled version based on a 1.5B Qwen model with the full name DeepSeek-R1-Distill-Qwen-1.5B, see chapter 3.2 on page 14 of their research article [0].

[0] https://arxiv.org/abs/2501.12948



Which is not the same model: it's not R1, it's R1-Distill-Qwen-1.5B.


A distinction they make clear and write extensively about on the model page, yes?


Where's that made clear in "ollama run deepseek-r1", the command to download/run the model?


Which you have to go to the model page to find.


Ollama labels the Qwen models R1, while the "R1" moniker on its own, in the DeepSeek world, means the full model, which has nothing to do with Qwen.

https://ollama.com/library/deepseek-r1

That might have been OK if it were just the same model at different sizes, but they're completely different things here, and it's created confusion out of thin air for no reason other than Ollama being careless.


And their documentation makes that distinction clear, having dedicated a section specifically to the distilled models.



