
  > Based on these points, it’s not technically feasible for current LLMs to play poker strongly.
To add to this a bit: it's important to note the limitations of this project. It's interesting, but I think the results are easy to misinterpret.

A few things to note:

  - It is LLMs playing against one another.
    - Not against humans, let alone professional players.
    - Not an LLM being trained in poker against other LLMs (there are token limits too, so not even learning in context)
  - Poker is a zero-sum game.
    - Early wins can shift the course of these types of games, especially when they are more luck-based[0][1]
      (note: this isn't an explanation, but it is a flag. You need context to interpret it when looking at hands)
    - Lucky wins can have similar effects
  - Only one tournament.
    That makes it hard to rule out luck.
So it's important to note that this is not necessarily a good measure of an LLM's ability to play poker well, but it can to some extent tell us whether the models understand the rules (I would hope so!)

But there are also some technical issues that make me suspicious... (was the site LLM generated?)

  - There's $20 extra in the grand total (assuming initial bankroll was $100k and not $100,002.22222222...)
    (This feels like a red flag...)
  - Hands 1-57 are missing?
    - Though I'm seeing "Hand #67" on the left table and "Hand #13" in the title above the associated image. But a similar thing happens for left column "Hand #58" and "Hand #63"...
  - There are pots with $0, despite there being a $30 ante...
    (Maybe I'm confused how the data is formatted? Is hand 67 a reset? There were bets pre-flop and only Grok has a flop response?)
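
A quick check of the bankroll arithmetic above, as a rough sketch. The player count of 9 is an assumption inferred from the repeating .2222 (since $20 / 9 = $2.22...); it is not something I've confirmed from the site, and `implied_bankroll` is just a hypothetical helper name.

```python
from fractions import Fraction

PLAYERS = 9                   # assumption: inferred from the repeating .2222
SURPLUS = 20                  # extra dollars seen in the grand total
EXPECTED_BANKROLL = 100_000   # starting bankroll each player supposedly had

def implied_bankroll(players=PLAYERS, surplus=SURPLUS, expected=EXPECTED_BANKROLL):
    """Starting bankroll each player would need for the grand total to balance."""
    return Fraction(expected * players + surplus, players)

# Exact fraction first, then a float for readability.
print(implied_bankroll(), float(implied_bankroll()))
```

If the player count is different, the implied per-player figure changes, but a surplus that doesn't divide evenly still suggests a bookkeeping error rather than an odd starting bankroll.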

[0] Think of it this way: we play a game of "who can flip the most heads". But we determine the number of coins we can flip by rolling some dice. If you do better on the dice roll you're more likely to do better on the coin flip.
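
A rough simulation of the toy game in [0], to show how much the first lucky step carries over. Everything here is my own assumption: two players, one six-sided die each, fair coins, and the helper names `play_round` and `higher_roller_win_rate`. It is not a model of the actual tournament.

```python
import random

def play_round(rng):
    """One round: each player rolls a die, then flips that many fair coins."""
    roll_a, roll_b = rng.randint(1, 6), rng.randint(1, 6)
    heads_a = sum(rng.random() < 0.5 for _ in range(roll_a))
    heads_b = sum(rng.random() < 0.5 for _ in range(roll_b))
    return roll_a, roll_b, heads_a, heads_b

def higher_roller_win_rate(trials=100_000, seed=0):
    """Fraction of decided rounds won by whoever rolled the higher die."""
    rng = random.Random(seed)
    wins = decided = 0
    for _ in range(trials):
        roll_a, roll_b, heads_a, heads_b = play_round(rng)
        if roll_a == roll_b or heads_a == heads_b:
            continue  # skip tied rolls and tied head counts
        decided += 1
        if roll_a > roll_b:
            wins += heads_a > heads_b
        else:
            wins += heads_b > heads_a
    return wins / decided

print(f"higher roller wins {higher_roller_win_rate():.1%} of decided rounds")
```

The coin flips are fair for both players, yet whoever got lucky on the dice wins well over half the decided rounds. That is the sense in which an early lucky win can shape the rest of the game.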

[1] LLAMA's early loss makes it hard to come back, though this wouldn't explain the dive at hand ~570. The same can be said in reverse about a few of the positive models. But we'd need to look deeper since this isn't a game of pure chance.



I'm wondering how they relay the passage of time to the LLM? If the player just before you took 1 second or 10 seconds to make a decision, that probably means something, unless they always take that amount of time.



