
Parallel processing is about network latency and bandwidth for the types of algorithms that don't divide into small computable/parallelizable bits easily. For those tasks, supercomputer buses are unmatched.


That makes some sense, so I'll just post my thanks in this one reply so I don't have to thank everyone individually.

Much appreciated.

It does lead me to one additional question - is the need for additional speed great enough to justify this? I don't know how much faster it would be, and I tried Google and it wasn't even remotely helpful. I may just be using the wrong query phrases.


> is the need for additional speed great enough to justify this?

Yep. I did molecular dynamics simulations on the Titan supercomputer, and also tried some on Azure's cloud platform (using Infiniband). The results weren't even close.


When you say the results weren't even close, and if you have time and don't mind, could you share some numbers/elaborate on that?

My experience with HPC is fairly limited compared to what I think you're discussing. In my case, it was things like blade servers, which was a cost decision. We also didn't have the kind of connectivity and speeds that you have available today.

(I modeled traffic at rather deep/precise levels.)

So, if you have some experience numbers AND you have the free time, I'd love to learn more about the differences in the results? Were the benefits strictly time? If so, how much time are we talking about? If you had to personally pay the difference, which one would you select?

Thanks for giving me some of your time. I absolutely appreciate it.


The problem with molecular modeling / protein folding algorithms is that each molecule interacts with every other molecule, so you have an n^2 problem. Sure, heuristics get this way down to n log n, but what fundamentally slows it is not the growth in computation but the growth in the data being pushed around: each node needs the data from every node's last time step. For these problems, doubling the nodes of a cluster computer might not even noticeably improve the speed of the algorithm. When I was helping people run these, they were looking at a few milliseconds of simulated time for a small number of molecules that took a few weeks of supercomputer time. So lots and lots of computing generations were/are needed before we get anywhere close to what they want to model.
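A minimal sketch of the all-pairs pattern described above (plain Python with toy numbers; real MD codes use neighbor lists and spatial decomposition, but the communication pattern - every node needing every other node's latest positions - is the same):

```python
# Toy illustration of the O(n^2) pairwise-interaction count that
# dominates naive molecular dynamics. A real code evaluates a force
# term per pair; here we just count pairs to show the scaling.

def pairwise_interactions(n_molecules):
    """Count every unique (i, j) pair among n_molecules."""
    pairs = 0
    for i in range(n_molecules):
        for j in range(i + 1, n_molecules):
            pairs += 1  # a real MD code computes a force here
    return pairs

# Doubling the molecule count roughly quadruples the work:
print(pairwise_interactions(100))  # 4950 pairs
print(pairwise_interactions(200))  # 19900 pairs
```

The same quadratic growth applies to the data each node must receive per time step, which is why adding nodes stops helping once the interconnect is the bottleneck.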


seriously.. take a 1/2 second to think.

(compute-chunk-time * latency * nchunks * ncomms) / n-nodes

obviously this oversimplifies things, but generally as an approximation, there you go.

then merge this in with your cost/time equation, and make the call.
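Read literally, that approximation could be coded up as follows (the variable names are mine, and the numbers are hypothetical; treat it as the rough heuristic the parent says it is, not a performance model):

```python
# Back-of-envelope sketch of the parent comment's formula:
#   (compute_chunk_time * latency * n_chunks * n_comms) / n_nodes
# All inputs here are made-up illustrative values.

def rough_runtime(compute_chunk_time, latency, n_chunks, n_comms, n_nodes):
    return (compute_chunk_time * latency * n_chunks * n_comms) / n_nodes

# Under this formula, halving latency helps exactly as much as
# doubling the node count - which is the point being made about
# interconnects mattering as much as raw node count:
fast_net = rough_runtime(1.0, latency=0.5, n_chunks=100, n_comms=10, n_nodes=8)
more_nodes = rough_runtime(1.0, latency=1.0, n_chunks=100, n_comms=10, n_nodes=16)
print(fast_net == more_nodes)  # True
```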


> seriously.. take a 1/2 second to think.

Don't do this on HN please.


I wonder why they thought I actually wanted the oversimplified stuff? I thought I'd made it clear that I wanted the technical details from their experience. Ah well... Your response is better than mine would have been.


And Azure's MPI/IB support is the best of all the cloud providers.



