Hacker News | smsx's comments

Are the numbers in the H100 PCIe vs SXM table swapped for rows 3 onwards? It looks to me like the PCIe is showing higher GiB/s numbers, which is counter to expectations. Or am I misunderstanding those benchmarks?


You're not misunderstanding: the PCIe does indeed outperform on the memory bandwidth tests. But it gets dominated on FLOP/s and real-world application benchmarks.


I don't think first class Linear or Notion support will be high on their list given who acquired them.


Chris Olah has never published a paper? ... https://scholar.google.com/citations?user=6dskOSUAAAAJ&hl=en...


it's being actively worked on!


my comment was based on the last PR update I read some months ago where the developer had given up on merging it, so this is great to hear!


The numbers are pretty incredible. Will the competition be able to match them?


Groq is claiming 284 tokens/second on Llama 3.1 70b, so they’re in the same ballpark.

https://groq.com/12-hours-later-groq-is-running-llama-3-inst...


If Groq 2 is 2x faster it will match Cerebras WSE-3.


Wow, incredibly fast. Can easily replace my Groq usage if they can keep up with capacity demands.


Yup, check out Rho-1 by microsoft research.


Sounded like he was talking about a public blog post, right?



The point of this post isn’t the linear transformer algorithm. They’re surveying a variety of linear transformers and showing a general form in order to talk at large about their performance characteristics.


They didn't say they joined 11 years ago, but that they were 11 years into their career.

