blackeyeblitzar | 11 months ago | on: The Illustrated DeepSeek-R1
How does such a distillation work in theory? They don’t have weights from OpenAI’s models, and can only call their APIs, right? So how can they actually build off of it?
moralestapia | 11 months ago
Like RLHF, but the HF part is GPT-4 instead.
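To make that concrete, here is a minimal sketch of how black-box distillation could work when you only have API access to the teacher: sample the teacher's completions for a set of prompts, then fine-tune the student on those (prompt, completion) pairs with ordinary supervised fine-tuning. The model names, prompts, and libraries (openai, datasets, trl) below are placeholder assumptions, not DeepSeek's actual pipeline:

    # Sketch of black-box distillation (assumed workflow, not DeepSeek's
    # actual pipeline): the teacher is reachable only through its API, so its
    # text outputs, not its weights or logits, become the training targets.
    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set
    prompts = [
        "Explain step by step why the sky is blue.",
        "What is 17 * 24? Show your reasoning.",
    ]

    # Step 1: sample completions from the teacher and save (prompt, answer) pairs.
    records = []
    for p in prompts:
        resp = client.chat.completions.create(
            model="gpt-4",  # placeholder teacher model name
            messages=[{"role": "user", "content": p}],
        )
        records.append({"text": p + "\n\n" + resp.choices[0].message.content})

    with open("distill_data.jsonl", "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")

    # Step 2: ordinary supervised fine-tuning of the student on that data.
    # (Exact SFTTrainer arguments vary a bit across trl versions.)
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("json", data_files="distill_data.jsonl", split="train")
    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-0.5B",  # placeholder student checkpoint
        train_dataset=dataset,
        args=SFTConfig(output_dir="student-distilled", max_steps=10),
    )
    trainer.train()

Only the teacher's text outputs are needed, never its weights or logits, which is why API access alone is enough to build on top of it.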
KarraAI | 11 months ago
How do you ensure the student model learns robust generalizations rather than just surface-level mimicry?
moralestapia | 11 months ago
No idea, as I don't work on that, but my guess would be that the higher the 'n' (the number of distillation examples), the closer model A gets to model B.