It's definitely a mix. We have been co-developing better models alongside better frameworks/systems to improve the outputs. Now we have llms.txt, MCP servers, structured outputs, better context management systems, and augmented retrieval through file indexing, search, and documentation indexing.

But the raw models themselves (which I test through direct API calls) are also much better. The biggest change with regard to price came from mixture of experts (MoE), which only runs a small subset of the model's parameters for each token. That kept quality very similar while cutting compute roughly 10x. (This is what allowed DeepSeek V3 to have similar quality to GPT-4o at a much lower price.)
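To make the compute argument concrete, here is a toy sketch of top-k expert routing in plain Python. It is not how DeepSeek or any other lab actually implements it: the expert functions, the router weights, and the sizes (8 experts, 2 active per token) are all made up for illustration. The only point is that the router scores every expert but only the chosen few actually run.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical 'expert': a stand-in for a feed-forward block."""
    return lambda x: [scale * xi for xi in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one token through only the top_k highest-scoring experts."""
    # Router score per expert: dot product of the token with that expert's row.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Pick the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the chosen scores into mixing weights that sum to 1.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


# Illustrative sizes only (assumptions, not taken from any real model).
NUM_EXPERTS, DIM, TOP_K = 8, 4, 2

random.seed(0)
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
router_weights = [
    [random.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(NUM_EXPERTS)
]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, router_weights, experts, TOP_K)

# Fraction of expert compute actually spent on this token.
active_fraction = TOP_K / NUM_EXPERTS
print(f"experts used: {chosen}, active fraction: {active_fraction:.2f}")
```

With 2 of 8 experts active, the expert layers do about a quarter of the work of running all eight, while the total parameter count stays the same. For a real-world data point, DeepSeek's V3 report lists 671B total parameters with about 37B activated per token.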

This same tech has most likely been applied to the newer models, so we may now have 1T-100T(?) parameter models that cost about the same to run as 4o, thanks to mixture of experts. (That's my guess, at least.)