Hacker News | hermesheet's comments

Lots of great details in the blog: https://ai.meta.com/blog/meta-llama-3/

Looks like there's a 400B version coming up that will be much better than GPT-4 and Claude Opus too. Decentralization and OSS for the win!


Compared to the numbers here https://www.anthropic.com/news/claude-3-family the Llama 400B ones seem slightly lower, but of course they only benchmarked a checkpoint and are still training it further.


Indeed. But if GPT-4 is actually 1.76T as rumored, an open-weight 400B is quite the achievement even if it's only just competitive.


The rumor is that it's a mixture-of-experts model, which can't be compared directly on parameter count like this because most weights are unused by most inference passes. (So, it's possible that a 400B non-MoE model has approximately the same "strength" as a 1.8T MoE in general.)
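To make the parameter-count point concrete, here is a rough back-of-the-envelope sketch. The expert count, experts routed per token, and shared-weight fraction below are made-up illustrative values, not a confirmed spec for GPT-4 or any other model, and `moe_active_params` is a hypothetical helper.

```python
def moe_active_params(total_params, num_experts, experts_per_token, shared_fraction):
    """Estimate how many parameters a single token touches in an MoE model.

    shared_fraction is the share of all weights (attention, embeddings, etc.)
    that every token uses regardless of routing. Assumed value, for illustration.
    """
    shared = total_params * shared_fraction
    expert_pool = total_params - shared          # weights split across experts
    per_expert = expert_pool / num_experts       # assume equally sized experts
    return shared + per_expert * experts_per_token


# Hypothetical configuration: only the 1.76T total comes from the rumor above.
total = 1.76e12
active = moe_active_params(total, num_experts=16, experts_per_token=2,
                           shared_fraction=0.05)

# A dense (non-MoE) model uses all of its weights for every token.
dense = 400e9

print(f"MoE active per token: ~{active / 1e9:.0f}B of {total / 1e9:.0f}B total")
print(f"Dense active per token: {dense / 1e9:.0f}B")
```

Under these assumed numbers, the MoE touches fewer weights per token than the dense 400B model despite having more than four times the total. Active parameter count is still only a crude proxy for "strength", so this illustrates why the raw totals aren't comparable rather than predicting which model is better.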


It absolutely does not say that. In fact, it provides benchmarks that show it underperforming them.

Not great to blindly trust benchmarks, but there are no claims it will outperform GPT-4 or Opus.

It was a checkpoint, so it's POSSIBLE it COULD outperform.


Where does it say the 400B model will be much better than GPT-4?


It doesn't.


Is it decentralized? You can run it multiple places I guess, but it’s only available from one place.

And it’s not open source.


It's not open source or decentralized.


That's very exciting. Are you quoting the same benchmark comparisons?


The blog did not state what you said. Sorry, I’ll have to downvote your comment.

