I recall seeing some complaints recently w.r.t. one of the models trained heavily on synthetic data (Phi?) - apparently they tend to overfit on STEM "book knowledge" while struggling with fuzzier stuff and instruction following.
I'm not much of an LLM user, though, so take my warmed-over recollections with a grain of salt.