Hacker News

The real question here is whether anybody has gotten cheap, easily available AMD GPUs to run their AI workloads, and whether we can expect more people to do so.


I ported Karpathy's llm.c repo to AMD devices [1], and have trained GPT-2 from scratch on 10B tokens of fineweb-edu on a 4x 7900 XTX machine in just a few hours (about $2 worth of electricity) [2].
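The "$2 of electricity" figure is easy to sanity-check with back-of-envelope arithmetic. All the inputs below are assumptions, not numbers from the post: a ~355 W board power per 7900 XTX, a 4-hour run, and a $0.35/kWh electricity rate.

```python
# Back-of-envelope check of the "$2 of electricity" claim.
# All inputs are assumed, not taken from the post.
GPU_COUNT = 4
WATTS_PER_GPU = 355     # assumed board power of a 7900 XTX
HOURS = 4               # "a few hours"
PRICE_PER_KWH = 0.35    # assumed electricity rate, $/kWh

kwh = GPU_COUNT * WATTS_PER_GPU * HOURS / 1000  # total energy in kWh
cost = kwh * PRICE_PER_KWH

print(f"{kwh:.2f} kWh -> ${cost:.2f}")  # prints "5.68 kWh -> $1.99"
```

Under those assumptions the run lands right around $2, so the claim is plausible even at fairly high residential rates.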

I've also trained the larger GPT-2 XL model from scratch on bigger CDNA machines.

Works fine.

[1] https://github.com/anthonix/llm.c [2] https://x.com/zealandic1


Microsoft has its production models running on AMD GPUs. I doubt it was easy, but it's pretty compelling as an existence proof.



