I know what porn looks like. I know what children look like. I don't need to be shown child porn in order to recognize it if I saw it. I don't think there's an ethical dilemma here; there's no need to train on that material if LLMs have the capabilities we're told to expect.
AI doesn't know what either porn or children are. It finds correlations between aspects of its inputs and the labels "porn" and "children". Even if you did build an AI advanced enough to form a good idea of what porn and children are, how would you ever verify that it can actually recognize child porn without plugging in samples for it to flag?
LLMs don't "know" anything. But as you say, they can identify correlations between content labeled "porn" and a target image, and between content labeled "children" and a target image. If a target image scores high on both, it can flag child porn, all without being trained on CSAM.
But things correlated with porn != porn, and things correlated with children != children. For example, in our training set no porn contains children, so the presence of children would mean it's not porn. Likewise, all images of children are clothed, so no clothes means it's not a child. You know that's ridiculous because you know things; the AI does not.
Never mind the importance of context, such as distinguishing a partially clothed child playing on a beach from a partially clothed child in a sexual situation.
So it can tell that an image correlates with porn, and also that it correlates with children. Seems like it should be able to apply an AND operation to those two results and identify new images that are not part of the training set.
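Mechanically, that's just a conjunction of two classifier outputs. A minimal sketch of the idea in Python, with hypothetical scores and thresholds standing in for any real model:

    # Sketch of the "score both, AND the results" idea.
    # The scores and thresholds are made up; a real system needs far more care.
    def flag_image(porn_score: float, child_score: float,
                   porn_threshold: float = 0.9,
                   child_threshold: float = 0.9) -> bool:
        """Flag only if the image scores high on BOTH classifiers."""
        return porn_score >= porn_threshold and child_score >= child_threshold

    print(flag_image(0.95, 0.97))  # True: high on both
    print(flag_image(0.95, 0.10))  # False: porn-like, but no child signal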
No, it finds elements in an image that it tends to find in images labelled "porn" in the training data, and elements it tends to find in images labelled "child" in the training data. If the training data is not representative, then the statistical inference is meaningless. Images unlike any in the training set may not trigger either category if they lack the things the AI expects to find, which may be quite irrelevant to what humans care about.
AI doesn't understand context either: it can't tell the difference between an innocent photo of a baby in a bathtub with a parent, a telehealth photo, and something malicious. Google uses AI in addition to hashing, and both systems can get it wrong. With AI you're always dealing with confidence levels, not certainty; no model in history has ever had 100% confidence on anything.
A scanning system will never be perfect. But there is a better approach: what the FTC now requires Pornhub to do. Before an image is uploaded, the platform scans it; if it’s flagged as CSAM, it simply never enters the system. Platforms can set a low confidence threshold and block the upload entirely. If that creates too many false positives, you add an appeals process.
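A rough sketch of that kind of gate, with a placeholder scan_for_csam() classifier and a made-up threshold (not any real API):

    # Pre-upload gating with a deliberately low block threshold.
    BLOCK_THRESHOLD = 0.3  # err heavily on the side of blocking

    def scan_for_csam(image_bytes: bytes) -> float:
        """Stand-in for a real classifier; returns a confidence in [0, 1]."""
        return 0.0  # stub for illustration

    def handle_upload(image_bytes: bytes) -> str:
        if scan_for_csam(image_bytes) >= BLOCK_THRESHOLD:
            return "blocked (appeal available)"  # never enters the system
        return "accepted"

    print(handle_upload(b"example image bytes"))  # "accepted" with the stub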
The key difference here is that upload-scanning stops distribution before it starts.
What Google is doing is scanning private cloud storage after upload and then destroying accounts when their AI misfires. That doesn’t prevent distribution — it just creates collateral damage.
It also floods NCMEC with automated false reports. Millions of photos get flagged, but only a tiny fraction lead to actual prosecutions. The system as it exists today isn’t working for platforms, law enforcement, or innocent users caught in the blast radius.
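To see why false positives dominate the reports, run the base-rate arithmetic; every number below is invented purely for illustration, not Google's or NCMEC's actual figures:

    # Illustrative base-rate arithmetic; all numbers are made up.
    uploads          = 1_000_000_000  # photos scanned
    actual_csam_rate = 1e-6           # assume 1 in a million is actually CSAM
    true_pos_rate    = 0.99           # classifier catches 99% of the real thing
    false_pos_rate   = 0.001          # and misfires on 0.1% of innocent photos

    true_positives  = uploads * actual_csam_rate * true_pos_rate          # ~990
    false_positives = uploads * (1 - actual_csam_rate) * false_pos_rate   # ~1,000,000

    precision = true_positives / (true_positives + false_positives)
    print(f"Share of reports that are actually CSAM: {precision:.2%}")    # ~0.10%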
"I know what porn looks like. I know what children look like."
Do you though?
Some children look like adults (17 vs 18, etc.). Some adults look younger than they actually are. How do we tell the difference between porn and art, such as nude scenes in movies, or even ancient sculptures? It doesn't seem like an agent would be able to make these determinations without a significant amount of training, and likely added context about any images it processes.
That is a good point. Is the image highly sexual? Are there children in the image?
Not a perfect CP-detection system (it might flag kids playing in a room with an R-rated movie playing on a TV in the background), but it would be a good first-attempt filter.
Of course, if you upload a lot of files to Google Drive and only then run a sanity check like this on them, it's too late to save you from Google.
Avoiding putting anything with any risk potential on Google Drive seems like an important precaution, given the growing tyranny of automated and irreversible judge-and-jury systems.
I've seen AI image-generation models described as being able to combine multiple subjects into a novel (or novel enough) output, e.g. "pineapple" and "skateboarding" become an image of a skateboarding pineapple. It doesn't seem like a reach to assume it can do what GP suggests.