Hacker News
mips_avatar
70 days ago
on:
Cloudflare's AI Platform: an inference layer desig...
Maxes out around 4K tok/s output. Each pair of 3090s runs its own instance of the model, with parallelism across the NVLink bridge, though NVLink is only about 2x faster than PCIe 5.
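A rough sketch of that layout, for anyone trying to picture it: GPUs grouped into pairs, one model replica per pair, requests spread round-robin across replicas. The GPU count (8), the helper names `pair_gpus` and `per_replica_rate`, the round-robin dispatch, and the even split of throughput are all assumptions for illustration; only the pairing and the ~4K tok/s figure come from the comment.

```python
from itertools import cycle


def pair_gpus(gpu_ids):
    """Group GPU ids into consecutive pairs; each pair hosts one model replica."""
    if len(gpu_ids) % 2 != 0:
        raise ValueError("need an even number of GPUs to form pairs")
    return [tuple(gpu_ids[i:i + 2]) for i in range(0, len(gpu_ids), 2)]


def per_replica_rate(total_tok_s, num_replicas):
    """Idealized even split of aggregate output throughput across replicas."""
    return total_tok_s / num_replicas


# Hypothetical box with 8 cards; the comment does not give the real count.
replicas = pair_gpus(list(range(8)))

# Assumed round-robin dispatch: each incoming request goes to the next replica.
dispatch = cycle(replicas)
assignments = [next(dispatch) for _ in range(5)]

# Aggregate figure from the comment, split evenly across the assumed replicas.
rate = per_replica_rate(4000, len(replicas))
```

Within each pair the model would be split across the two cards (the part that uses the bridge), while the pairs themselves are independent, so aggregate throughput scales with the number of pairs rather than with link speed between pairs.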