Or even just a notepad. It's well established that long context histories with s...

daxfohl · 2025-07-13T02:40:53 1752374453

I went ahead and did this, using a maze where the only actions are "turn left", "turn right", and "go forward". The response is either OK or BLOCKED, plus which directions are open and which are walled in the current cell (relative directions: front, back, left, right; not north, east, south, west).

When I had it try to track where it was via chat history alone, it failed: it went back into fully explored paths and explored them again. Giving it a scratchpad to store knowledge didn't help. One common failure pattern I noticed is that it got confused and updated its position when issuing a "turn" command. It never created a map, no matter how strongly I recommended doing so in the prompt; it generally just stored a list of the cells it had visited and what it saw in each one.
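For contrast, here is a minimal sketch of the bookkeeping the model kept getting wrong. It is not the code from my experiment: the class name `MazeTracker`, the coordinate convention, and the assumption that the open directions are reported relative to the heading after the action are all made up for illustration. The point is only that a turn changes the heading and never the position.

```python
# Hypothetical sketch: names and conventions are assumptions, not the original harness.
# Absolute headings as (dx, dy), listed clockwise: north, east, south, west.
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]
# Clockwise offset of each relative direction from the current heading.
RELATIVE = {"front": 0, "right": 1, "back": 2, "left": 3}


class MazeTracker:
    def __init__(self):
        self.pos = (0, 0)    # start cell is the origin of our own map
        self.heading = 0     # index into HEADINGS; we call the start direction "north"
        self.open_dirs = {}  # cell -> set of absolute heading indexes that are open

    def apply(self, action, result, open_relative):
        """Update state from one action and the maze's response to it."""
        if action == "turn left":
            self.heading = (self.heading - 1) % 4  # heading changes, position does not
        elif action == "turn right":
            self.heading = (self.heading + 1) % 4
        elif action == "go forward":
            if result == "OK":                     # only a successful step moves us
                dx, dy = HEADINGS[self.heading]
                self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        else:
            raise ValueError(f"unknown action: {action!r}")
        # Convert the relative openings to absolute ones and record them for this cell.
        self.open_dirs[self.pos] = {
            (self.heading + RELATIVE[r]) % 4 for r in open_relative
        }
```

With the openings stored in absolute terms per cell, `open_dirs` is effectively the map, and "have I been here before?" is a dictionary lookup rather than something to reconstruct from history.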

I'd have played with it more, and I think I could eventually have gotten it working 99% better, but I'd already spent $12 and I'm unemployed, so.

Anyway, even if I got it working really, really well, it'd still just be another example of "don't use an LLM when a simple state machine would be easier and work better". I think an LLM could be useful if the response from the maze game were natural language ("You went forward a step, to find yourself at a T-intersection with passages off to each side."), with the LLM translating that into a structured response that could be fed into the state machine. But it's still not reliable enough to serve as the state machine itself.
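A rough sketch of what that boundary might look like, assuming the LLM is asked to emit JSON with a `result` field and an `open` list. The field names, the `Observation` type, and `parse_observation` are all hypothetical, and the LLM call itself is left out. The state machine only ever sees output that passed validation:

```python
# Hypothetical sketch: the JSON shape and all names here are assumptions for illustration.
import json
from dataclasses import dataclass

VALID_DIRS = {"front", "back", "left", "right"}
VALID_RESULTS = {"OK", "BLOCKED"}


@dataclass(frozen=True)
class Observation:
    result: str           # "OK" or "BLOCKED"
    open_dirs: frozenset  # relative directions that are open in the current cell


def parse_observation(llm_json: str) -> Observation:
    """Validate the LLM's translation before it reaches the state machine."""
    data = json.loads(llm_json)
    result = data["result"]
    if result not in VALID_RESULTS:
        raise ValueError(f"unexpected result: {result!r}")
    open_dirs = frozenset(data["open"])
    unknown = open_dirs - VALID_DIRS
    if unknown:
        raise ValueError(f"unexpected directions: {sorted(unknown)}")
    return Observation(result, open_dirs)
```

For the T-intersection sentence above, a correct translation would be something like `{"result": "OK", "open": ["left", "right", "back"]}`. Anything that fails validation can be retried or rejected, so the LLM's unreliability stays outside the part that tracks state.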