You are underselling, or not understanding, the breakthrough. They trained a ~600B-parameter model on ~15T tokens for under $6M. Regardless of the provenance of the tokens, that in itself is impressive.
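For scale, here's a rough sanity check of the headline number, assuming the figures from the V3 technical report (~2.788M H800 GPU-hours at an assumed rental rate of $2/GPU-hour, and excluding prior research and ablation runs):

```python
# Back-of-envelope: stated GPU-hours times an assumed $2/GPU-hour
# rental rate. Both inputs are from the V3 technical report, not
# independently verified.
gpu_hours = 2.788e6
usd_per_gpu_hour = 2.0
print(f"${gpu_hours * usd_per_gpu_hour / 1e6:.2f}M")  # -> $5.58M
```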
Not to mention post-training: their GRPO technique (Group Relative Policy Optimization), used for preference optimization / alignment, is also much more efficient than PPO, largely because it drops the separate critic/value model.
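For anyone unfamiliar, a minimal sketch of the core idea, assuming the formulation from the DeepSeekMath paper that introduced GRPO (the reward values are made up):

```python
import numpy as np

# GRPO's group-relative advantage estimate: sample G completions per
# prompt, score them with the reward model, and normalize the rewards
# within the group. No learned critic/value network is needed, which
# is the main efficiency win over PPO.

def grpo_advantages(group_rewards, eps=1e-8):
    """Advantage of each completion relative to its sampled group."""
    r = np.asarray(group_rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Made-up rewards for one group of G=4 sampled answers; PPO would
# instead train a separate value model (roughly policy-sized) to
# produce these baselines.
print(grpo_advantages([0.1, 0.7, 0.4, 0.9]))  # ~[-1.40  0.58 -0.41  1.24]
```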
Let's call it underselling. :-) Mostly because I'm not sure anyone has independently done the math; we just have a single statement from the CEO. I do appreciate the algorithmic improvements, and the excellent performance engineering in their implementation (careful treatment of precision, making the H800s useful, etc.). I agree there's a lot there.