De-duplication is already a "big deal" for lots of hosting services, such as Box.net and Dropbox. I doubt it's beyond the skill set of the engineers at Google to address the problem, especially given that Google already has the technology to do image searches based on other images.