The developer also recommended (tongue-in-cheek) using Microsoft's built-in encryption services (easily defeated) in his outgoing blog post, perhaps because he was barred from explaining the real reason for the project's cancelation.
What was the main focus when training this model? Besides the Elo score, it looks like the models (31B / 26B-A4) are underperforming on some of the typical benchmarks by a wide margin. Do you believe there's an issue with the tests, or that the results are misleading (for example, because comparable models are benchmaxxing)?
You can use this model for about 5 seconds and realize its reasoning is in a league well above any Qwen model, but instead people assume that benchmarks openly being used for training are still relevant.
You definitely have to try each model on your own use case. Many models can be trained to perform better on these tests, but that might not transfer to what you actually need.
The main pain points for us were thread-safety issues (httpx claims to be thread-safe, but we hit race conditions in production), no HTTP/3 support, and redirect behavior that requires explicit opt-in everywhere. The multiplexing story in httpx is also quite limited compared to what niquests offers out of the box. On top of that, httpx maintenance has been slow to acknowledge valid bug reports: the thread-safety issue took over a year to even be acknowledged...
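For anyone hitting similar races: a common defensive pattern, when you can't trust a client object to be shared across threads, is to give each thread its own instance. Below is a minimal stdlib-only sketch of that idea. It is not what we ran in production and not an httpx or niquests API; `FakeClient`, `get_client`, `worker`, and `run` are hypothetical names, and `FakeClient` just stands in for whatever real client you would construct.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for an HTTP client whose thread safety is in doubt."""

    def __init__(self):
        # Record which thread created this client (for illustration only).
        self.owner = threading.get_ident()


# Attributes set on a threading.local() are visible only to the thread that set them.
_local = threading.local()


def get_client(factory=FakeClient):
    """Return the calling thread's own client, creating it on first use."""
    client = getattr(_local, "client", None)
    if client is None:
        client = factory()
        _local.client = client
    return client


def worker(results, index):
    # Ask twice: the same thread should get the same object back both times.
    first = get_client()
    second = get_client()
    results[index] = (first, second)


def run(n_threads=4):
    """Start n_threads workers and collect the clients each one received."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=worker, args=(results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The trade-off is that you give up sharing a connection pool across threads, so this only makes sense if the race conditions cost you more than the extra connections do.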
Wild that it's taken people this long to realize this. Also: lean tickets/tasks that include all the context needed to complete the task, such as references/docs, places to look in the source, acceptance criteria, and so on.