
What's the security story here?


What's preventing GPU providers from sending wrong results instead of actually running the computation? For example, send the last computed result? Is this something that the renter has to handle by adding their own checks?

In addition to the problem of the renter crashing your machine or reading your password through DMA, of course.


What incentive would a GPU provider have to spend time figuring out what result to send for some custom application?


The incentive is huge: if I spend 2 milliseconds sending you your previous results instead of 2 hours running your new computation, I can (pretend to) run way more computations on the same hardware and collect hundreds of times more money.


At the risk of being exiled off the platform and earning nothing. Don't forget, there is a bit of KYC with Stripe.


ID verification before you can host and random audits from gpudeploy.


NO. That's the worst way to do almost anything on the Internet, and should be considered a last-line defense, if nothing else can be done. Here, it can be. See my comment above.


That's my whole question: do they do random audits, or is it the job of customers to double-check their results for possible attacks or compute theft and report them?


It seems wrong to call it the "job of customers". It's as if you wrote a Bitcoin client that didn't verify transaction hashes, "trusting" everything. Or served a website with a login feature over plain HTTP, not HTTPS. Verification is a very basic feature of whatever software would connect to such services.


So it is the job of the customer to write their own Bitcoin or HTTPS client, in your metaphor.


Every technology was (very) underdeveloped at some early point in its evolution.


That's what I asked: how developed it is now. Why so defensive?


I don't know how developed it is now; I'm not associated with the startup shown in any way, so it's mainly a question for them. However, in terms of the wider industry, distributed high-performance GPU(-like) computing "for everyone" is generally in its infancy. 99% of what has been done up to this point was targeted at people who would both buy and supply compute "in bulk", not "in retail". Perhaps with the small exception of a few excellent projects like Folding@home and other @home's.


Run 1/10,000 to 1/100,000 of your computations locally, and also send those same tasks to be run remotely. If the comparison yields a difference, repeat both. After, say, 10 mismatches, blacklist the provider. Of course there are many more nuances to get right, but that's the general idea. It's a no-brainer.
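[A minimal sketch of the spot-check idea above, assuming a trusted local executor and a remote one; the `SpotChecker` class and its parameters are illustrative, not any real API:]

```python
import random

class SpotChecker:
    """Re-run a small random fraction of remote tasks locally and
    blacklist providers whose results disagree too often."""

    def __init__(self, run_local, run_remote, rate=1e-4, limit=3):
        self.run_local = run_local    # trusted local executor: task -> result
        self.run_remote = run_remote  # remote executor: (provider, task) -> result
        self.rate = rate              # fraction of tasks spot-checked
        self.limit = limit            # mismatches before blacklisting
        self.strikes = {}             # provider -> mismatch count
        self.blacklist = set()

    def submit(self, provider, task):
        if provider in self.blacklist:
            raise RuntimeError(f"provider {provider!r} is blacklisted")
        result = self.run_remote(provider, task)
        if random.random() < self.rate:
            expected = self.run_local(task)
            if result != expected:
                self.strikes[provider] = self.strikes.get(provider, 0) + 1
                if self.strikes[provider] >= self.limit:
                    self.blacklist.add(provider)
                return expected  # prefer the trusted local result
        return result
```

[In practice the local re-runs should be chosen unpredictably and results compared with tolerance for floating-point nondeterminism, which is one of the "nuances" mentioned above.]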


Sounds like a lot of work that I would expect the paid service to help with.


Yeah, it's "they" who should do that, of course.


Linux supports IOMMU on most platforms.


I fail to see how this relates. If you can't trust the provider, why does it matter whether they say they have an IOMMU or not?


It relates to the "get your passwords over DMA part".
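[For context: an active IOMMU is what lets the kernel fence off device DMA. A quick, generic way to check whether one is enabled on a Linux host, assuming a typical Intel/AMD setup:]

```shell
# If the IOMMU is active, devices are partitioned into groups here;
# an empty (or missing) directory means no DMA isolation.
ls /sys/kernel/iommu_groups/

# The kernel log should mention DMAR (Intel VT-d) or AMD-Vi if enabled.
sudo dmesg | grep -iE 'dmar|amd-vi|iommu'
```

[Of course, as the sibling comment notes, this only helps if you trust the party reporting it.]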


At the moment, we manually verify operators and are currently onboarding some tier-4 operators. Down the line, we'll have a 2-tier system where you can choose whether you want a verified machine or not. From the operator's perspective, everything runs inside Docker, configured with security best-practices.


I've always understood that containers are not proper sandboxes and shouldn't be used for containing untrusted code, no matter the best practices used. Has this changed in recent years? Do you have documentation for what sorts of best practices you're using and why they are sufficient for executing untrusted code?


You are correct, to my knowledge. If the container is set not to run as root you might achieve some meaningful security, but I'd still run it in a VM if feasible.


Having done a little bit of work in the area[1], I think you should publicly document exactly what those best-practices are. Are the workloads running in a networkless container? Do you limit IO? Do you limit disk usage? Answering these in detail would help you gain customer trust on both sides.

[1]: https://containerssh.io/v0.5/reference/docker/#securing-dock...
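[For illustration of what such documentation might cover: a locked-down `docker run` invocation for untrusted workloads, using standard Docker flags; the image name is a placeholder. This reduces, but does not eliminate, the risk discussed below.]

```shell
# Illustrative hardening: no network, read-only rootfs with a small
# tmpfs scratch area, resource ceilings, all capabilities dropped,
# no privilege escalation, and a non-root user inside the container.
docker run --rm \
  --network none \
  --read-only \
  --tmpfs /tmp:size=64m \
  --memory 4g --cpus 2 \
  --pids-limit 256 \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --user 1000:1000 \
  untrusted-workload:latest
```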


So you don't have real security for operators, is what you're saying.

Containers are not, and will never be, a secure isolation boundary.


probably very basic... so don't run it on anything that has your own data on it (if you're an AI startup, definitely don't run it on your research cluster).


> definitely don't run it on your research cluster

...what's the threat, actually? GPU time sellers stealing your secret sauce?


I think they mean don't lease out your research team's GPUs and allow random people to run untrusted code on your cluster, lest they figure out a way to break out of any sandboxing the software has in place and get loose in your network. The company's current answer to that concern is "everything runs inside Docker, configured with security best-practices", which is less than inspiring.

https://news.ycombinator.com/item?id=40261591



