OP never said Claude made a whole game from scratch, though, nor that Claude is doing everything without any human contribution, nor that they haven't spent a lot of time and effort on it. Just that it's made the project fun and more accessible, and it's gotten them excited about something they'd abandoned.
Here's a bullet point list of the things Claude's done according to OP:
* it picked up the general path immediately
* he explicitly pushed into "lets have V0 game play loop finished, then we can compound and have fun = not giving up".
* [I gave him game design ideas,] he comes with working code.
* [I gave him papers about procedural algos,] and he comes with the implementation
* brainstorm[ed] items
* create[d] graphic assets
* he created a set of procedural 2d generators as external tools
I have a simple script system in my editor that is designed to let the chatbot (Claude) work on the content. The script interface lets it import assets into the project, open them for editing, take a screenshot, export content (and a few other things). All data is in JSON, so it typically figures out the data format quite fast and easily.
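For a sense of what a command-style JSON interface like that could look like, here's a purely hypothetical sketch; the command name and fields are made up, not OP's actual format:

```json
{
  "command": "import_asset",
  "source": "generated/ui_panel.png",
  "target": "assets/ui/panel.png",
  "open_for_editing": true
}
```

The appeal of this shape is that a model only has to emit one small, self-describing object per action, rather than drive a GUI.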
Here are screenshots of some UI styles that it generated.
Bevy is a great engine for LLM-based games because it's 100% code. I'm toying with a few things in it; one of them is an entire-planet economic simulation, and it scales well up to a million dead tiles and 10k-50k live tiles on Apple Silicon, which is pretty impressive.
> And when you inevitably get bored with it, well, you've not done much anyway.
I'm very interested in Local LLMs but the cheapest Mac Studio right now is more expensive than 8 years of a Claude Code Pro subscription, and incomparably slower/less capable. If I get bored with it, I will have a piece of unused hardware and a couple grand less in my bank account.
I had a ton of fun setting up and trying it out locally (also opencode and one of the qwens.) I still don't have hardware powerful enough to feel like it's meaningfully productive, but all the learning I had to do (and all the bonus things I got curious about as the curtain peeled back) got my nerd brain all worked up, and finally seeing it work was exciting in that cool-new-experience way you don't often get to enjoy :)
Yeah this is exactly how I felt!
Never really felt excited about LLMs or agentic workflows before. Getting everything set up 100% locally, tweaking it to exactly what I want, and having it actually work quite well has been a really cool experience.
I did tinker a lil with mine! RTX 3080 with 10GB VRAM, 5600X with 64GB DDR4. Not very good, but it was very fun and exciting to tinker with :)
My partner, on the other hand, has an M3 Max with 64GB, which I've had way more success with. Setting up opencode, doing a tiny spec-driven Rust project, and watching it kiiinda work was extraordinarily exciting!
I admittedly haven't done a ton of research lately on AI capable PC hardware because of how nuts prices are right now, so I might be missing something...
...but all the AMD 395+ machines I can find are even more expensive than the aforementioned cheapest Mac Studio. Mac Studio starts at $2,000 (only 32GB), AMD 395+ 128GB machines seem to start at $3,000 from what I can see.
the QWEN-3.5-CODER-NEXT fits in half the 128GB, leaving the rest for context. with the right plugins, particularly context pruning, i've got it running overnight by writing plans and then implementing them.
i don't know if there's a smaller model with the same capability, but that model size with context fitting in 128GB seems like a sweet spot.
token speed really isn't a bother, because i'm either multitasking or filling in the missing details myself.
regardless, for cost efficiency i think you compare VRAM size against your target model first, then speed. plus, keep a healthy skepticism of mac hardware costs.
> there will be a reckoning when open weight models are good enough
Will you have the hardware to run them? Perhaps. Will enough of Anthropic's/OpenAI's large enterprise customers have the hardware to run them and the money/desire to have their own internal teams set up and maintain them?
> In terms of limits, I usually find myself hitting the rate limit after two or three requests.
I'd absolutely love to see exactly what you're doing (...well, maybe in a world where I had unlimited time or could clone myself...) because as tight as the usage limits are I absolutely cannot fathom hitting them THAT early.
What are the requests like, and have you noticed what Claude is doing during them? Is it reading an entire massive codebase, or files that are thousands of lines long? Or are you loaded up with many MCPs, or have an ever-growing CLAUDE.md?
I'm writing a compiler. When I have Claude write a new feature, I have it validate that feature against a test suite of ~200 tiny programs.
I have a shell script that automates this. If all tests pass, the shell script prints "200/200 passing" with very little token spend. If only 190/200 pass, the shell script reports the names of every test that failed, and now Claude does a process of
1) run the compiler binary -> 2) get assembly output and inspect for obvious errors -> 3) assemble -> 4) verify that the assembler did not report errors -> 5) run test binary, connect with gdb, and find the issue -> 6) edit the compiler source -> 7) recompile the compiler -> 8) back to 1
multiplied by 10 for the 10 failing tests. This eats up tokens very quickly. I realize that not every use case is going to look like this. But if I didn't have Claude verify against the test suite, then I'd be getting regressions left and right, and then what's the point?
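The success-is-cheap, failure-is-verbose pattern described above can be sketched as a small POSIX shell function. This is illustrative only: the compiler invocation, the `tests/` layout, and the `.src`/`.expected` naming are assumptions, not OP's actual setup.

```shell
#!/bin/sh
# run_suite COMPILER TESTS_DIR
# Each TESTS_DIR/NAME.src is compiled and run; its stdout is diffed
# against TESTS_DIR/NAME.expected.
run_suite() {
  compiler=$1
  tests_dir=$2
  pass=0
  total=0
  failed=""
  for src in "$tests_dir"/*.src; do
    total=$((total + 1))
    name=$(basename "$src" .src)
    bin="/tmp/$name.bin"
    # compile -> run -> compare; any failing step marks the test failed
    if "$compiler" "$src" -o "$bin" 2>/dev/null \
        && "$bin" >"/tmp/$name.out" 2>&1 \
        && diff -q "/tmp/$name.out" "$tests_dir/$name.expected" >/dev/null; then
      pass=$((pass + 1))
    else
      failed="$failed $name"
    fi
  done
  # On full success, this one line is all the model has to read.
  echo "$pass/$total passing"
  # On failure, name only the failing tests, not their whole output.
  if [ -n "$failed" ]; then
    echo "failed:$failed"
  fi
}

# Usage: run_suite ./mycc tests
```

The point of the design is token economy: the happy path emits one line, and the unhappy path emits only the failing test names so the model can go dig into each one itself.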
The whole codebase (tests included) is less than 15k lines, so I don't think that's the issue. No MCPs. My CLAUDE.md is about 1.5k lines.
The main critique, the handling of the heat shield, was also raised at NASA in 2022-2024, and the project continued on. Artemis is largely a product of Congress.
I was so mad when they removed the fourth option. I can't remember which one was which, but one meant "open in a webview inside this app" and the other was "open in a new tab in your default browser". It was still terrible UX but I liked at least having that choice.
Perhaps part of their portfolio is the code they've hand-written, and part of it is demonstrating that they're able to use this new tool to make something that works (despite how imperfect the tool is, as so many people point out).
Here's a bullet point list of the things Claude's done according to OP:
* it picked up the general path immediately
* he explicitly pushed into "lets have V0 game play loop finished, then we can compound and have fun = not giving up".
* [I gave him game design ideas,] he comes with working code.
* [I gave him papers about procedural algos,] and he comes with the implementation
* brainstorm[ed] items
* create[d] graphic assets
* he created a set of procedural 2d generators as external tools
* he even helped me build the lore.
Every one of these is plausible in isolation.