This is super cool stuff, and it would've been really interesting to apply to the Spark work I had to do for Amazon's search system! How is this different from using spark-rapids on AWS EMR with GPU-enabled EC2 instances? Are you building on top of spark-rapids, or is this a more custom solution?
Good question -- it depends. For certain workloads, it might look exactly the same! For others, I found that the memory and VM constraints were creating large inefficiencies. Also, many teams simply don't want to manage that level of data infra: EMR itself, instance type optimization, Spark optimization (now with GPU configs!), custom images, upgrades, etc.
We take care of that and make it as easy as pie... or so we hope! On top of that, we also deploy an external shuffle service, and deal with other plugins, connectors, etc.
I suppose it's similar to using Databricks Serverless SQL!
Another thing: we ran into an incompatible (i.e. non-accelerated) operation in one of our first real workloads, so we worked with our customer to speed up that workload even more with a small query optimization.
I wonder how difficult it would be to do this with something like an old A7RII - there seem to be a good number of older Sony A-series cameras on eBay with some kind of display or motherboard issue but otherwise perfectly functional sensors.
I have been passively looking at doing something like what you are suggesting, but also turning it into more of a video camera. My main motivation is that it seems pretty much impossible to get an off-the-shelf sensor that has DPAF.
One big hurdle is getting the datasheets for the sensors. They are usually quite quirky, and even if you get the datasheet and all the documentation, you will still need support from the manufacturer to get everything working reliably.