This is a distinction without a difference in many instances. I can easily ask an LLM to write a Python tool that produces random numbers from a given distribution and then use that tool as needed. The LLM writes the code and uses the executable result. The end black-box result is the LLM doing the work.
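For a sense of how small such a tool can be, here is a rough sketch using only Python's standard library. The function name `sample_from_distribution` and the example outcomes are made up for illustration, not taken from any particular setup.

```python
import random


def sample_from_distribution(probabilities, rng=random):
    """Draw one outcome from a dict mapping outcome -> probability.

    Hypothetical helper: the weights need not sum to exactly 1,
    since random.choices normalizes them internally.
    """
    outcomes = list(probabilities)
    weights = [probabilities[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


# Example: a seeded generator so repeated runs are reproducible.
rng = random.Random(0)
mix = {"a": 0.7, "b": 0.3}  # illustrative distribution
draws = [sample_from_distribution(mix, rng) for _ in range(10)]
print(draws)
```

From the outside, calling a helper like this is indistinguishable from the LLM producing the draw itself.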
But why limit it to generating random numbers? Isn't the logical conclusion that the LLM writes a poker bot instead of playing the game? How would that demonstrate the poker skills of an LLM?