Do we have enough info to say that decisively?
Ideally we would see the training data, though it's probably reasonable to assume a random collection of internet content includes racist imagery. My understanding, though, is that the algorithm and the learned model are still a black box that people can't parse and understand.
How would we know for sure that racist output is due to racist input, rather than being a side effect of some part of the training or querying algorithms?