Dear HN,
I recently tried to follow along with some Hugging Face tutorials and - WOW! - was that painful. Google Colab quickly started complaining about the 300 GB my datasets were using. I was also constantly out of memory, and reviews said that even if I upgraded to the $50/month tier, my chance of randomly being assigned a GPU with enough memory would be slim.
I also looked at AWS EC2 and GCE and got sticker shock. I'm certainly not going to pay $20+ per hour the entire time I'm just figuring out how stuff works. Then I found OVH, and their $2 per hour V100 instances seemed much more reasonable. So I built a Docker image that emulates what's pre-installed and pre-configured on Google Colab, based on their example. [1] That worked OK, but having only 1 GBit/s between the AI training node and their object storage turned out to be the bottleneck.
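For anyone wanting to try the image from [1], a launch command would look roughly like this. The exposed port is my assumption (8888 is Jupyter's default), and `--gpus all` requires the NVIDIA Container Toolkit on the host:

```shell
# Sketch only: exact entrypoint/port of the image are assumptions.
docker run --rm --gpus all -p 8888:8888 \
    fxtentacle/ovh-colab-sagemaker-compatibility-mode
```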
So I took the next logical step: I built a small proxy with authentication and Let's Encrypt SSL [2] to secure things, and then self-hosted the Docker image on the Ubuntu PC in my basement. That machine only has a consumer-grade Samsung SSD and a 1080 Ti, but it has zero I/O delays. In the end, training a Facebook Wav2Vec2 model on that 1080 Ti is about as fast as on a V100. And since the machine is in my basement, there are no $$$-per-hour rental fees. It's effectively free, except for electricity.
So now I have my own Google Colab clone which is accessible from anywhere and running in my basement. Compared to public clouds it's cheaper, more responsive, I have more control, and it's even similarly fast.
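To put a rough number on "cheaper": a back-of-the-envelope comparison of the $2/hour V100 rental against home electricity. The GPU wattage and electricity price below are my assumptions, not measured values:

```python
# Break-even sketch: cloud rental vs. running the basement box.
CLOUD_RATE = 2.00   # $/hour for an OVH V100 instance (from the post)
GPU_WATTS = 250     # assumed 1080 Ti board power under load
KWH_PRICE = 0.30    # assumed household electricity price, $/kWh

# Electricity cost of one hour of training at home.
home_cost_per_hour = GPU_WATTS / 1000 * KWH_PRICE
savings_per_hour = CLOUD_RATE - home_cost_per_hour

print(f"home: ${home_cost_per_hour:.3f}/h, saving ${savings_per_hour:.2f}/h")
```

Under those assumptions, the home box costs a few cents per training hour, so even a few hundred hours of experimentation would have cost more on rented V100s than a whole used 1080 Ti.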
I feel like I found a cheat code for "the cloud" :)
To set up your own copy, see https://github.com/fxtentacle/letsencrypt-auth-proxy#example-private-google-colab-clone
[1] https://hub.docker.com/r/fxtentacle/ovh-colab-sagemaker-compatibility-mode
[2] https://github.com/fxtentacle/letsencrypt-auth-proxy
- My box "sm3llslik3s0ld3r" is a love-hate passion project, but surprisingly similar in hardware to yours!
I originally built "sm3lly" (my box) after screwing up and leaving a high-priced Azure GPU cloud instance running on an unrestricted pay-as-you-go billing scope that was filtered from my view (separate Azure AD). It ran for almost a week before I caught it, resulting in a $7k USD, type-3-fun "surprise" hit to my personal R&D budget.
I suspect it's a small niche group, though: those of us doing AI on our own desktop hardware. I assume most AI folks either don't have the systems expertise to set up and manage this, or they work for a company/school that grants them the resources and they don't bring "work" home with them.
FWIW, I bought all of "sm3lly"'s hardware on Taobao, direct from Shenzhen, so it's not identical to yours, but I was picking up pre-loved Bitcoin-mining GPUs on the very cheap after China's crypto purge.