For reference, here are the details of how the LLMs are queried:
"How the players work
All players use the same system prompt
Each time it's their turn, or after a hand ends (to write a note), we query the LLM
At each decision point, the LLM sees:
General hand info — player positions, stacks, hero's cards
Player stats across the tournament (VPIP, PFR, 3bet, etc.)
Notes hero has written about other players in past hands
From the LLM, we expect:
Reasoning about the decision
The action to take (executed in the poker engine)
A reasoning summary for the live viewer interface
Models have a maximum token limit for reasoning
If there's a problem with the response (timeout, invalid output), the fallback action is fold"
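For the curious, here's a minimal sketch of what that per-decision loop might look like. None of this is the project's actual code: the `llm.complete` interface, the field names, and the JSON shape are all my own assumptions based on the description above.

```python
import json
from dataclasses import dataclass

# Shared by all players, per the description above (contents assumed).
SYSTEM_PROMPT = "You are a poker player in a tournament..."

@dataclass
class DecisionContext:
    positions: dict[str, str]           # seat -> position (BTN, SB, ...)
    stacks: dict[str, int]              # seat -> chip count
    hero_cards: list[str]               # e.g. ["Ah", "Kd"]
    stats: dict[str, dict[str, float]]  # per-player VPIP/PFR/3bet etc.
    notes: dict[str, str]               # hero's notes on opponents

def decide(llm, ctx: DecisionContext, max_reasoning_tokens: int = 2048) -> dict:
    """Query the model for one action; fall back to fold on any failure."""
    prompt = json.dumps({
        "hand_info": {
            "positions": ctx.positions,
            "stacks": ctx.stacks,
            "hero_cards": ctx.hero_cards,
        },
        "player_stats": ctx.stats,
        "notes": ctx.notes,
    })
    try:
        # `llm.complete` is a stand-in for whatever client the project uses.
        raw = llm.complete(
            system=SYSTEM_PROMPT,
            user=prompt,
            max_tokens=max_reasoning_tokens,  # cap on reasoning length
            timeout_s=30,
        )
        out = json.loads(raw)
        # Expect reasoning, an engine-executable action, and a viewer summary.
        if out["action"] not in {"fold", "check", "call", "bet", "raise"}:
            raise ValueError(f"invalid action: {out['action']}")
        return out
    except Exception:
        # Timeout or invalid output -> fold, per the stated fallback rule.
        return {"action": "fold", "reasoning": "", "summary": "fallback: fold"}
```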
The fact that the models are given stats about the other models is rather disappointing to me; it makes it less interesting. I'd be curious how this would go if the models had only their own notes/context to rely on. Maybe the stats are a way to save on costs, since this could get expensive...
"How the players work
The fact the models are given stats about the other models is rather disappointing to me, makes it less interesting. Would be curious how this would go if the models had to only use notes/context would be more interesting. Maybe it's a way to save on costs, this could get expensive...