I think it's worth noting that if you're paying for electricity, a local LLM is NOT free. In most cases you'll find that Haiku is cheaper, faster, and better than anything that will run on your local machine.
This 35B-A3B model is 4-5x cheaper than Haiku, though, suggesting it would still be cheaper to outsource inference to the cloud than to run it locally in your example.
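If you want to run the numbers for your own setup, the back-of-envelope is straightforward. Every constant below is a placeholder assumption, not a measurement; swap in your own GPU wattage, electricity rate, local throughput, and API pricing:

    # Rough electricity-only cost of local inference vs. a cloud API.
    # All constants are illustrative assumptions, not measurements.
    WATTS = 350                # assumed GPU draw under load
    USD_PER_KWH = 0.15         # assumed electricity rate
    LOCAL_TOK_PER_SEC = 25     # assumed local generation speed
    CLOUD_USD_PER_MTOK = 1.00  # assumed blended API price per 1M tokens

    seconds_per_mtok = 1_000_000 / LOCAL_TOK_PER_SEC
    kwh_per_mtok = (WATTS / 1000) * (seconds_per_mtok / 3600)
    local_usd_per_mtok = kwh_per_mtok * USD_PER_KWH

    print(f"local: ${local_usd_per_mtok:.2f} / 1M tokens (electricity only)")
    print(f"cloud: ${CLOUD_USD_PER_MTOK:.2f} / 1M tokens")

And that's electricity only; amortized hardware cost usually dominates for a local rig.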
When they bumped the context size up to 1M tokens, they made it much easier to blow through session limits quickly unless you manually compact or keep sessions short.
Yeah, compared to the case in LA today where one person was awarded $3M for getting addicted to Instagram, the verdict here seems about 4 orders of magnitude too small.
> Tom pulled up the tool’s specification on his diagnostic display. This was always the first step: read the spec, not the code.
Clearly this writer has never felt the frustration of CC telling them a feature was never part of the plan, because it overwrote the plan and then compacted.
Back in 1960, US early-warning systems mistook the Moon for a massive nuclear first strike with 99.9% certainty.
With a fully autonomous system, the world would have burned.
You can play with the model for free in chat... but if $20 for a coding agent isn't effectively free for your use case, it might not be the right tool for you.
ETA: I've probably gotten $10k worth of junior dev time out of it this month.