Hacker News | roanakb's comments

Unfortunately, SM efficiency is not accessible via nvidia-smi. The best methods to track it would be to:

1. Profile your model with PyTorch Profiler
2. Export metrics with NVIDIA DCGM
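For the DCGM route, a rough sketch of how the sampling could be scripted is below. It assumes the `dcgmi` CLI is installed and the DCGM host engine is running; the helper names `dcgmi_dmon_command` and `run_if_available` are made up for illustration. Field 1002 (SM active) is, as far as I know, the closest DCGM counterpart to SM efficiency, so treat that mapping as an assumption.

```python
import shutil
import subprocess

# DCGM profiling field IDs, taken from NVIDIA's DCGM field list.
DCGM_FI_PROF_SM_ACTIVE = 1002     # ratio of cycles an SM has at least one warp assigned
DCGM_FI_PROF_SM_OCCUPANCY = 1003  # ratio of resident warps to the theoretical maximum


def dcgmi_dmon_command(field_ids, interval_ms=1000, samples=10):
    """Build a `dcgmi dmon` command line (hypothetical helper).

    -e: comma-separated field IDs, -d: sampling delay in ms, -c: sample count.
    """
    fields = ",".join(str(f) for f in field_ids)
    return ["dcgmi", "dmon", "-e", fields, "-d", str(interval_ms), "-c", str(samples)]


def run_if_available(cmd):
    """Run the command only if dcgmi is on PATH; otherwise just print it."""
    if shutil.which(cmd[0]) is None:
        print("dcgmi not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=False)


cmd = dcgmi_dmon_command([DCGM_FI_PROF_SM_ACTIVE, DCGM_FI_PROF_SM_OCCUPANCY])
print(" ".join(cmd))
```

Calling `run_if_available(cmd)` on a machine with DCGM set up should stream one row per GPU per sample; on a machine without it, it only prints the command.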


Oh, this looks great, thank you for bringing this up! I'll have to give it a try, but it seems like the FSDP limitation on torch.compile might carry over?


Yup, you'll see 100% utilization over a time period as long as a kernel is considered active during it, which includes just having a single thread executing [1]. SM occupancy is great but can be a little difficult to interpret, since unlike SM efficiency you're not simply trying to maximize it.

[1]: https://pytorch.org/blog/pytorch-profiler-1.9-released/#gpu-...
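To make the gap concrete, here is a toy calculation. It is only a sketch: it assumes the estimate described for PyTorch Profiler's "Est. SM Efficiency" (each kernel contributes min(blocks / SM count, 1), weighted by its duration), it treats utilization as "fraction of the window with any kernel running", and it assumes kernels don't overlap. The `Kernel` class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    start_us: float
    end_us: float
    blocks: int  # number of thread blocks launched


def gpu_utilization(kernels, window_us):
    """Fraction of the window during which at least one kernel is running."""
    busy = 0.0
    cur_start = cur_end = None
    for k in sorted(kernels, key=lambda k: k.start_us):
        if cur_end is None or k.start_us > cur_end:
            # Close the previous busy interval and start a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = k.start_us, k.end_us
        else:
            cur_end = max(cur_end, k.end_us)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy / window_us


def est_sm_efficiency(kernels, window_us, num_sms):
    """Duration-weighted min(blocks / num_sms, 1), divided by the window."""
    weighted = sum(
        (k.end_us - k.start_us) * min(k.blocks / num_sms, 1.0) for k in kernels
    )
    return weighted / window_us


# One kernel that runs for the whole window but launches a single block.
kernels = [Kernel(start_us=0.0, end_us=1000.0, blocks=1)]
util = gpu_utilization(kernels, window_us=1000.0)
eff = est_sm_efficiency(kernels, window_us=1000.0, num_sms=108)  # 108 SMs, as on an A100
print(f"utilization={util:.0%}, est. SM efficiency={eff:.2%}")
```

The single-block kernel keeps the GPU "100% utilized" while touching roughly 1 of 108 SMs, which is the failure mode described above.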


Nice, seems like ML Productivity Goodput is a pretty well thought-out metric to understand the overall efficiency of your cluster. I'll consider adding this into our cluster management platform. The only potential drawbacks I'd guess are that it's somewhat difficult to compute, since it relies on metrics like MFU, and that it isn't something we can observe layer by layer to find inefficient kernels, but I'll take a deeper look. Thanks!
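For a sense of what the MFU piece involves, a back-of-the-envelope version is sketched below. It assumes the common approximation of about 6 FLOPs per parameter per token for dense transformer training (forward plus backward), and the throughput and peak numbers are placeholders rather than measurements. It is not the Goodput definition itself, just the MFU input to it.

```python
def model_flops_utilization(n_params, tokens_per_sec_per_gpu, peak_flops_per_gpu):
    """MFU = achieved model FLOP/s divided by the hardware's peak FLOP/s.

    Assumes ~6 * n_params FLOPs per token for dense transformer training.
    """
    achieved = 6.0 * n_params * tokens_per_sec_per_gpu
    return achieved / peak_flops_per_gpu


# Placeholder inputs: a 7B-parameter model, 3000 tokens/s per GPU,
# and an assumed 312 TFLOP/s peak per GPU.
mfu = model_flops_utilization(7e9, 3000.0, 312e12)
print(f"MFU ~ {mfu:.1%}")
```

The hard part in practice is getting a trustworthy tokens-per-second figure and FLOP count per job, which is why it's harder to collect than a hardware counter.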


Agreed, roofline plots would be quite powerful in this context. From a quick search, seems like the only way to create a roofline plot for your model would be to use Nsight [1]? Would be interested to know if there are any simpler tools, since one of the big benefits of SM efficiency is how easily the metric is accessed.

[1]: https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s...


Depending on the size of your application, you can calculate FLOPs by hand.

https://docs.nersc.gov/tools/performance/roofline/
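As a sketch of the by-hand approach for a single matmul: count 2*M*N*K FLOPs, assume each matrix moves through memory exactly once (an idealization), and compare the resulting arithmetic intensity against the roofline min(peak, intensity * bandwidth). The peak and bandwidth values below are assumed placeholders, so substitute your own GPU's numbers.

```python
def matmul_arithmetic_intensity(m, n, k, bytes_per_elem=2):
    """FLOPs per byte for an (m x k) @ (k x n) matmul.

    Assumes A, B and the output are each read or written exactly once.
    """
    flops = 2 * m * n * k
    bytes_moved = (m * k + k * n + m * n) * bytes_per_elem
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline model: performance is capped by compute or by memory traffic."""
    return min(peak_flops, intensity * mem_bandwidth)


PEAK_FLOPS = 312e12      # assumed peak, FLOP/s
MEM_BANDWIDTH = 1.555e12  # assumed memory bandwidth, bytes/s

big = matmul_arithmetic_intensity(4096, 4096, 4096)  # square matmul
matvec = matmul_arithmetic_intensity(1, 4096, 4096)  # matrix-vector product

for name, ai in [("4096^3 matmul", big), ("matvec", matvec)]:
    perf = attainable_flops(ai, PEAK_FLOPS, MEM_BANDWIDTH)
    bound = "compute" if perf >= PEAK_FLOPS else "memory"
    print(f"{name}: {ai:.1f} FLOP/byte, {bound}-bound")
```

Under these assumptions the large matmul sits well past the ridge point while the matrix-vector product is firmly memory-bound, which is the kind of placement a roofline plot shows at a glance.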


Yup, similar to SM efficiency in that sense too. If you aren't seeing >80%, there is certainly time left on the table. But getting a high SM efficiency value doesn't guarantee you're making good use of the hardware either. (Still a better proxy than GPU util, though.)



Looks great, you guys made it really easy to integrate!


thanks roanak!


Thanks! Let me know if there are any features you'd like to see added.


Looks really cool! Nice work.

