The cloud is staggeringly more expensive than building your own workstation. Lik...

m463 · on Oct 18, 2019

You forget that using the cloud you can spin up 9 ladies to have a baby in a month, or 270 ladies to have a baby in a day.

GenerocUsername · on Oct 18, 2019

Why is that? Shouldn't competition drive cloud prices closer to ownership prices?

angry_octet · on Oct 18, 2019

The average corporate IT department is rubbish at doing computing. Easier and quicker to outsource than doing it in house. But nVidia sell GPU appliances (DGX) for these on-prem customers.

Basically, expertise is expensive and comes in $200k/year increments. You can use a lot of cloud GPU for that much money.

sseveran · on Oct 18, 2019

It's because NVIDIA requires Teslas in the cloud, which, while more powerful than 2080 Tis, cost roughly 6-8x as much.

Analemma_ · on Oct 18, 2019

How does that work? If I buy a 2080 Ti and then decide to use it in an internet-hosted box that I rent out to other people, the first-sale doctrine should tell Nvidia to piss off if they have a problem with it.

angry_octet · on Oct 18, 2019

When you are operating at any kind of scale you need the Tesla packaging (airflow, connectors, power) and testing (running many in a system). The consumer desktop cards cause lots of trouble.

Also the consumer cards have artificially small memory sizes (e.g. 11GB max) which painfully constrains DL jobs.

No manufacturer is allowed to build Tesla-like cards. Theoretically AMD could crush the nVidia profit margins by releasing cheap data center boards, but their developer support is rubbish and they want that high margin cash too.

fit2rule · on Oct 18, 2019

.. what this means, is that Apple is competing against the Titan/NVIDIA hardware designers to make better compute power available at scale, and in a way which makes sense across consumer/pro boundaries.

In that context, the Mac Pro doesn't sound too bad a proposition. I say that as someone who recently built a dual-Titan/AMD Ryzen system, and while the pain of the build is almost gone away .. I do lust after that sexy Apple box, being plug 'n play and all ..

angry_octet · on Oct 19, 2019

Intel competes with nVidia on high end GPU hardware and so far (KL and KF) are doing an abysmal job of it, despite using their CPU control to lock nVidia out of the CPU bus.

Apple can't really compete except in mobile GPUs, which is all about power constraints set to single-digit watts, vs >100W for server. That isn't all bad for Apple, since there is lots of inference to be done, but by refusing nVidia hardware they impose a developer hurdle.

I'm looking forward to people putting 2080s into the new Mac Pro and seeing how reliable it is.

zenography · on Oct 18, 2019

Nvidia wouldn't have prohibited people from using GeForce cards in data centers if people weren't doing it.

angry_octet · on Oct 19, 2019

People do try, but it isn't the basis for a successful computing platform. It works for blockchain (though AMD delivers more flops/$ for simple sha-ing grunt work) and for specific workloads where you have time and expertise but not much cash.

As for nVidia banning it, if you've the cash you can buy cards; it just becomes painful. nVidia were doing it more because the GPGPU demand was choking the supply to graphics customers, and because buyers were later dumping obsolete (for crunching) cards on the market en masse, interfering with the mid-tier card sales channel.

vluft · on Oct 18, 2019

Because nvidia's driver licensing terms require you not to be using the consumer stuff in a datacenter.

lliamander · on Oct 18, 2019

What if you're running Linux with the nouveau driver?

sseveran · on Oct 18, 2019

It's irrelevant. You need the CUDA SDK and the cuDNN SDK, both of which have license agreements.

Edit: You may also need the official Nvidia driver. I have never run anything with the OSS drivers.

lliamander · on Oct 18, 2019

Gotcha

ZeroCool2u · on Oct 18, 2019

I believe they get around this by applying the restrictions to the software licensing for the CUDA libraries you have to download separately when installing things like TensorFlow.

gbear605 · on Oct 18, 2019

There’s a high cost of entry (onboarding lots of customers, building data centers, writing the management software, etc.), so there isn’t much competition. The market just isn’t efficient in this case.

magashna · on Oct 18, 2019

I think it does for stuff like storage, but transactional and heavy data processing are still cheaper to do on-prem from what I've seen.

PaulHoule · on Oct 18, 2019

If you are doing serious training with on-demand instances you are doing it wrong.

If your training takes more than an hour, you really need to checkpoint it periodically. If you have periodic checkpoints you can use spot instances and pay 10% or less of what on-demand would cost.
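To make the checkpoint-and-resume idea concrete, here is a minimal sketch, not tied to any particular framework. The names `save_checkpoint`, `load_checkpoint`, and `train`, the JSON file format, and the `stop_after` parameter (which only simulates a spot preemption) are all assumptions for illustration; the "training step" is a stand-in for real work.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a preemption mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "weights": 0.0}


def train(path, total_steps, checkpoint_every, stop_after=None):
    """Run (or resume) training; stop_after simulates a spot preemption."""
    state = load_checkpoint(path)
    steps_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_run >= stop_after:
            return state  # instance reclaimed; unsaved progress is lost
        state["weights"] += 1.0  # stand-in for one real training step
        state["step"] += 1
        steps_this_run += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If the instance disappears, the next run picks up from the last saved step, so at most `checkpoint_every` steps of work are repeated. A real job would save model and optimizer state with whatever serializer its framework provides, and would write to storage that outlives the instance.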

CoolGuySteve · on Oct 18, 2019

1) Even with checkpoints and spot instances, the initial REPL development process needs to happen somewhere.

2) Where are you seeing 10%? More realistically spot instances are 25-30% of the cost. And if so fine, 6 weeks of machine time == dedicated workstation.
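The break-even arithmetic behind that kind of claim can be sketched as below. The function name `break_even_weeks` and every number in the usage lines are hypothetical placeholders, not quoted prices; plug in the real workstation cost, on-demand hourly rate, and spot discount you actually see.

```python
HOURS_PER_WEEK = 24 * 7


def break_even_weeks(workstation_cost, on_demand_hourly, spot_fraction):
    """Weeks of continuous spot usage that cost as much as buying the box.

    spot_fraction is the spot price as a fraction of on-demand
    (e.g. 0.3 means spot costs 30% of the on-demand rate).
    """
    if not 0 < spot_fraction <= 1:
        raise ValueError("spot_fraction must be in (0, 1]")
    spot_hourly = on_demand_hourly * spot_fraction
    break_even_hours = workstation_cost / spot_hourly
    return break_even_hours / HOURS_PER_WEEK


# Hypothetical inputs for illustration only.
weeks_at_30_percent = break_even_weeks(4000.0, 12.0, 0.30)
weeks_at_10_percent = break_even_weeks(4000.0, 12.0, 0.10)
print(f"30% spot: {weeks_at_30_percent:.1f} weeks")
print(f"10% spot: {weeks_at_10_percent:.1f} weeks")
```

This ignores electricity, resale value, and idle time on the workstation, all of which shift the break-even point.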

phendrenad2 · on Oct 18, 2019

Yeah, one workstation. If you need to build out 1,000x as many GPUs, the cloud becomes more economical. And that's where the money is, in large customers, not someone who needs 1 or 2 cloud GPUs.