Tell HN: I run a Google Colab clone in my basement, and it's great for AI :)
29 points by fxtentacle on April 4, 2022 | 7 comments
Dear HN,

I recently tried to follow along with some Huggingface tutorials and - WOW! - was that painful. Google Colab quickly started complaining about the 300 GB my datasets were using. I was also constantly out of memory, and reviews said that even if I upgraded to the $50-per-month plan, my chance of randomly being assigned a GPU with enough memory would be slim.

I also looked at AWS EC2 and GCE and got sticker shock. I'm certainly not going to pay $20+ per hour the entire time I try to figure out how stuff works. Then I found OVH, and their $2-per-hour V100 instances seemed much more reasonable. So I built a Docker image, based on their example, that emulates what's pre-installed and pre-configured on Google Colab. [1] That worked OK, but having only 1 GBit/s between the AI training node and their object storage turned out to be the bottleneck.

So I took the next logical step, built a small proxy with authentication and Let's Encrypt SSL [2] to secure things, and then self-hosted the Docker image on the Ubuntu PC in my basement. That one only has a consumer-grade Samsung SSD and a 1080 Ti, but it has zero IO delays. In the end, training a Facebook Wav2Vec2 model on that 1080 Ti is comparable in speed to a V100. And since the machine is in my basement, there are no $$$-per-hour rental fees. It's effectively free, except for electricity.
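For reference, a quick way to sanity-check such a setup from outside is something like this sketch (the hostname, credentials, and the proxy's Basic-auth scheme are assumptions about your particular config):

    import requests

    # Hypothetical endpoint and credentials for the basement box,
    # reached through the auth proxy over HTTPS (Let's Encrypt cert).
    BASE = "https://colab.example.com"
    session = requests.Session()
    session.auth = ("me", "my-proxy-password")  # assuming HTTP Basic auth at the proxy

    # The proxy forwards to the Jupyter server inside the Docker container;
    # /api/status is Jupyter's standard status endpoint.
    r = session.get(f"{BASE}/api/status")
    r.raise_for_status()
    print(r.json())  # uptime, kernel and connection counts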

So now I have my own Google Colab clone, accessible from anywhere and running in my basement. Compared to public clouds it's cheaper and more responsive, it gives me more control, and it's even similarly fast.

I feel like I found a cheat code for "the cloud" :)

To set up your own copy, see https://github.com/fxtentacle/letsencrypt-auth-proxy#example-private-google-colab-clone

[1] https://hub.docker.com/r/fxtentacle/ovh-colab-sagemaker-compatibility-mode
[2] https://github.com/fxtentacle/letsencrypt-auth-proxy



As an autodidact, I use a similar approach. Curious who else does the same?

- My box "sm3llslik3s0ld3r" is a love-hate passion project, but surprisingly similar in hardware to yours!

I originally built "sm3lly" (my box) after screwing up and leaving a high-priced Azure GPU cloud instance running on an unrestricted pay-as-you-go billing scope that was filtered from my view (separate Azure AD). It ran for almost a week before I caught it, resulting in a $7k USD, type-3-fun "surprise" hit to my personal R&D budget.

I suspect it's a small niche group, though: those of us doing AI on our own desktop hardware. I assume most AI people either don't have the systems expertise to set this up and manage it, or they work for a company/school that grants them the resources and don't bring "work" home with them.

FWIW, I bought all of "sm3lly"'s hardware directly from Shenzhen via Taobao, so it's not identical to yours; I was picking up pre-loved Bitcoin-mining GPUs on the very cheap after China's crypto purge.


I'm currently waiting for Ethereum's switch to proof of stake to flood the market with used GPUs; then hopefully new ones will become more affordable, too.

And yes, I've also been burned multiple times by runaway AWS costs. Especially since you keep paying for the storage of instances that are shut down but not fully deleted. That always struck me as odd, and most other cloud providers handle it differently.

And yes, I would also assume that most AI people don't have the systems expertise, which is kinda sad. But then again, most AI tutorials are buggy as hell, presumably for the same reason. For example, https://huggingface.co/blog/wav2vec2-with-ngram (yes, from the people building the Huggingface AI toolkit) says:

"The 5-gram correctly includes a "Unknown" or <unk>, as well as a begin-of-sentence, <s> token, but no end-of-sentence, </s> token. This sadly has to be corrected currently after the build."

And then a cringe-worthy script follows to "fix" the missing </s> by rewriting a 100+ GB file in Python. But actually, they are just using KenLM the wrong way: once you put each sentence on its own line - like the manual says - it produces the </s> just fine, roughly as in the sketch below. And now I wonder whether I still want to trust their Python packages to do the right thing...
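Here's a minimal sketch of the correct usage (the corpus contents and the path to the lmplz binary from a local KenLM build are placeholders):

    import subprocess

    # KenLM expects one sentence per line; it then emits <s> and </s> itself.
    # Hypothetical corpus; use your real training text instead.
    with open("corpus.txt", "w") as f:
        for sentence in ["hello world", "the cat sat on the mat"]:
            f.write(sentence.strip() + "\n")  # one sentence per line

    # Build a 5-gram ARPA model; no post-hoc </s> patching needed.
    with open("corpus.txt") as stdin, open("5gram.arpa", "w") as stdout:
        subprocess.run(["kenlm/build/bin/lmplz", "-o", "5"],
                       stdin=stdin, stdout=stdout, check=True)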


Re unexpected AWS costs: This happened to me too. I hate AWS for it. The best solution I've found is to set up a "Budget" in the AWS console that alerts me by email when my account spends more than I expect. Just last week this caught a case early that would have likely resulted in an unnecessary 4-figure bill.
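If you'd rather script that than click through the console, a sketch with boto3 looks roughly like this (the account ID, limit, and e-mail address are placeholders):

    import boto3

    budgets = boto3.client("budgets")

    # Hypothetical values: replace the account ID, limit, and address.
    budgets.create_budget(
        AccountId="123456789012",
        Budget={
            "BudgetName": "monthly-cost-alarm",
            "BudgetLimit": {"Amount": "100", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        NotificationsWithSubscribers=[{
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,  # alert at 80% of the limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL",
                             "Address": "me@example.com"}],
        }],
    )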


I was going to suggest that you post the code, until I got to the bottom and saw you already had! Very cool project. I love cloud tooling, but in some cases a local machine is something like thousands of times cheaper.

Best to mix and match, and you get outrageous amounts of compute and efficiency for relatively little money.

This reminds me of a similar project that might interest you.

GPT-J is an open-source alternative to GPT-3 made by a former Googler. I've read it works surprisingly well; people claim it comes close to GPT-3.

To run it, people talk about using services like vast.ai. Don't get me wrong, it's cool that that exists, but I admit I've been tempted to build out a dedicated machine just to run this model in my basement. You need 24 GB of video memory to run it, though, so you're probably looking at around $4,000 to build it (depending on graphics card prices).

But! For that price you could have your own unmetered GPT-3 knock-off in your basement, thereafter also for just the price of electricity.
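For anyone curious, loading it with the Huggingface Transformers library is roughly this sketch (in half precision the weights alone are ~12 GB instead of ~24 GB, so a 24 GB card has comfortable headroom for generation):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # GPT-J-6B as published by EleutherAI on the Huggingface hub.
    # float16 halves the weight memory relative to fp32.
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B", torch_dtype=torch.float16
    ).to("cuda")

    prompt = tokenizer("My basement datacenter", return_tensors="pt").to("cuda")
    output = model.generate(**prompt, max_new_tokens=40)
    print(tokenizer.decode(output[0]))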


Thanks for mentioning GPT-J :)

I didn't know about them before. That looks very usable and I'm truly astonished that someone was generous enough to pay for training such a large model and is now giving it away for free.

As for the memory requirements, I'm pretty sure one could split the model into different stages which then execute one after another => you'd only need 8 GB with 3 slices. I'm currently using something similar to squeeze a ~30 GB wav2vec2 into the 11 GB of a 1080 Ti; the idea is roughly the sketch below.
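A toy version of that slicing idea for inference might look like this (the layer stack here is a hypothetical stand-in; the real model's layer list and slice boundaries are up to you):

    import torch
    import torch.nn as nn

    def run_in_slices(layers, x, n_slices=3, device="cuda"):
        # Move one slice of layers to the GPU at a time, run it,
        # then evict the weights so only activations stay resident.
        chunk = (len(layers) + n_slices - 1) // n_slices
        x = x.to(device)
        for i in range(0, len(layers), chunk):
            stage = layers[i:i + chunk]
            for layer in stage:
                layer.to(device)
            with torch.no_grad():
                for layer in stage:
                    x = layer(x)
            for layer in stage:
                layer.to("cpu")
            torch.cuda.empty_cache()
        return x

    # Hypothetical stand-in for a big network's layer list.
    layers = [nn.Linear(4096, 4096) for _ in range(12)]
    out = run_in_slices(layers, torch.randn(1, 4096))

Peak VRAM is then roughly one slice's weights plus the activations, at the cost of PCIe transfers between stages.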


Oh wow, thanks for that tip, I hadn't considered it. I actually have a lower-end 3000-series card; I'll give that a try. Thanks!


Very cool, stealing this.

Have a dual RTX A5000 in a dual-Xeon box with 512 GB of RAM and 30 TB of SSD RAID storage, backed up by another couple of hundred TB of spinning rust in my closet. I run Windows and Ubuntu in VMs that share the GPUs, and can spin up new VMs with shared GPUs on a whim. Self-hosted is the way to go if you either have the hardware on hand or can get over the initial buy-in price. I can access the machine remotely via a variety of means, and can even stream directly to my laptop.

Plus I get to play games on it! :-)



