
unfortunately, disabling temperature / switching to greedy sampling doesn't necessarily make most LLM inference engines _fully_ deterministic: parallelism and batching can cause floating point error to accumulate differently from run to run. it's possible to make them deterministic, but that comes with a perf hit (a sketch of the underlying effect is below)
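a minimal sketch of the underlying issue, not tied to any particular inference engine: floating point addition isn't associative, so the order in which partial sums are combined (which can change with batch size or the parallel reduction strategy) can change the result in the low-order bits

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000).astype(np.float32)

# sequential left-to-right reduction
seq = np.float32(0.0)
for v in x:
    seq += v

# a different reduction order (numpy uses pairwise summation internally)
pair = x.reshape(-1, 2).sum(axis=1).sum()

print(seq, pair, seq == pair)  # the two sums usually differ slightly
```

differences like these matter because they can flip an argmax when two logits are nearly tied, so even greedy decoding can emit different tokens across runs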

some providers _do_ let you set the temperature, including to "zero", but most won't take the perf hit needed to offer true determinism
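for context, a minimal sketch of temperature sampling (not any particular provider's API): as temperature goes to zero the softmax collapses onto the highest logit, which is why "temperature zero" is usually implemented as plain greedy argmax

```python
import numpy as np

def sample(logits: np.ndarray, temperature: float, rng: np.random.Generator) -> int:
    if temperature == 0.0:
        return int(np.argmax(logits))      # greedy: no randomness left
    z = logits / temperature
    p = np.exp(z - z.max())                # numerically stable softmax
    p /= p.sum()
    return int(rng.choice(len(logits), p=p))
```

even with this greedy path, the nondeterminism described above can still leak in upstream, since the logits themselves may differ slightly between runs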



