I find it hard to see why they're using computer vision for cropping at all. Years ago there was a post on Reddit which showed how Reddit crops images for thumbnails; as far as I remember, it was to do with colour density, not facial recognition, and written with a simple image library in Python. With some images, you could predict what the thumbnail would contain. Perhaps that also had a racial bias, but at least it wouldn't be contributing to a long list of sociological criticisms of machine learning.
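To make concrete the kind of non-ML cropping I mean, here is a rough sketch. It is not Reddit's actual code, which I'm only recalling from memory. I'm assuming "colour density" can be approximated by the Shannon entropy of the pixel values in a window, and the names `entropy` and `best_square_crop` are made up for illustration. It works on a plain list of greyscale rows so it runs without any image library; a real version would load the pixels from an image first.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a flat list of pixel values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_square_crop(pixels):
    """Return (left, top, size) of the square window with the highest entropy.

    `pixels` is a list of rows of greyscale ints. The window side equals the
    shorter image dimension, so it slides along the longer one only.
    Ties are resolved in favour of the first window found.
    """
    height = len(pixels)
    width = len(pixels[0])
    size = min(width, height)
    best = None
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            window = [
                pixels[y][x]
                for y in range(top, top + size)
                for x in range(left, left + size)
            ]
            score = entropy(window)
            if best is None or score > best[0]:
                best = (score, left, top)
    _, left, top = best
    return left, top, size


# A tiny 6x2 "image": flat on the left, varied on the right.
image = [
    [0, 0, 0, 0, 10, 200],
    [0, 0, 0, 0, 90, 30],
]
print(best_square_crop(image))  # → (4, 0, 2)
```

The crop lands on whichever region has the most varied pixel values, which is why the result is predictable for some images. Whether a heuristic like this is free of bias is a separate question: it would simply favour busy, high-contrast regions.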