Among other reasons, if you turn temperature down to 0, LLMs stop sampling and fall back to greedy decoding: they always pick the single most likely next token, which tends to produce repetitive, degenerate output. Temperature gives the model wiggle room to emit something plausible-sounding rather than locking onto one brittle path when presented with an input that wasn't verbatim in the training data (such as the system prompt).
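For context, temperature works by scaling the logits before the softmax; as it approaches 0 the distribution collapses onto the single highest-logit token, which is why temperature 0 is equivalent to greedy decoding. A minimal sketch (the logit values are made up for illustration):

```python
import math

def sample_probs(logits, temperature):
    # Softmax with temperature: lower temperature sharpens the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical next-token logits
print(sample_probs(logits, 1.0))  # spread across all tokens
print(sample_probs(logits, 0.1))  # nearly one-hot: greedy in the limit
```

At temperature 1.0 the lower-scoring tokens keep meaningful probability mass; at 0.1 the top token takes nearly all of it, so sampling becomes effectively deterministic.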
Yes, but that doesn't explain why we aren't given a choice. Program code is boringly deterministic, but in many cases that's exactly what you need, while non-determinism becomes a dangerous enemy (as with some Airbus jets being susceptible to bit flips from cosmic rays).
The current way to address this is through RAG, or Retrieval-Augmented Generation. This means using the LLM for the natural-language, non-deterministic portion, and using traditional code, databases, and files for the deterministic part.
A good example is bank software where you can ask what your balance is and get back the real number. A RAG app won't "make up" your balance or even consult its training data to find it. Instead, the deterministic operations in traditional code run separately from the LLM calls.