
De-duplication is already a "big deal" in lots of hosting situations, such as box.net, Dropbox, etc. I doubt it's outside the skillset of engineers at Google to address the problem, especially given that Google already has the technology to do image searches based on other images.


De-duplication does not allow Google to know that

    http://marketing.example.com/tracking/some_user.png
in its cache is identical to

    http://marketing.example.com/tracking/some_other_user.png
without passing the second URL (a.k.a. the tracking data) to marketing.example.com.
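
To make the point concrete, here is a minimal sketch of a content-addressed cache, assuming de-duplication works by hashing the response body. Everything in it is hypothetical: `fetch_from_origin`, `DedupCache`, and the fake pixel bytes are stand-ins for illustration, not how Google's image proxy actually works. The cache only learns that two bodies match after it has fetched both, and by then the origin has already seen both URLs.

```python
import hashlib

# Hypothetical stand-in for marketing.example.com: it serves the same bytes
# for every path and records each URL it is asked for.
PIXEL = b"\x89PNG-fake-1x1-pixel"
origin_log = []


def fetch_from_origin(url):
    """Simulated origin request; the request itself is the tracking signal."""
    origin_log.append(url)
    return PIXEL


class DedupCache:
    """Content-addressed cache: one stored blob per distinct SHA-256 digest."""

    def __init__(self):
        self.blobs = {}          # digest -> response body
        self.url_to_digest = {}  # url -> digest of its body

    def get(self, url):
        if url not in self.url_to_digest:
            # The body has to be fetched before it can be hashed, so an
            # unseen URL always reaches the origin at least once.
            body = fetch_from_origin(url)
            digest = hashlib.sha256(body).hexdigest()
            self.blobs.setdefault(digest, body)
            self.url_to_digest[url] = digest
        return self.blobs[self.url_to_digest[url]]


cache = DedupCache()
cache.get("http://marketing.example.com/tracking/some_user.png")
cache.get("http://marketing.example.com/tracking/some_other_user.png")

print(len(cache.blobs))  # number of distinct blobs stored
print(origin_log)        # every URL the origin was asked for
```

In this sketch the cache ends up storing a single blob, so de-duplication saves storage, but `origin_log` still contains both per-user URLs.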



