If I use a cryptographically signed camera to photograph a Polaroid of an image that I deepfaked, how does the signature ensure the image can be trusted?
Those depth estimation algorithms can't be used to distinguish a photo of a photo from an ordinary photo. Given a photograph of a flat print, they will still report depth that isn't there.
Yes, that's my point. You can't rely on the depth map in the image metadata to be the differentiator because it can easily be faked with depth estimation.
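To make the underlying issue concrete, here is a minimal sketch of why a valid signature says nothing about the scene. It assumes a hypothetical camera that signs a hash of the pixels plus the depth map. HMAC from Python's standard library stands in for the signature only to keep the example self-contained; a real device would presumably use an asymmetric key held in secure hardware. The key, function names, and byte strings are all made up for illustration.

```python
import hashlib
import hmac

# Hypothetical device secret, for illustration only. A real camera would
# more likely hold an asymmetric private key in secure hardware.
DEVICE_KEY = b"hypothetical-camera-key"


def sign_capture(pixels: bytes, depth_map: bytes) -> bytes:
    """Sign whatever the capture pipeline produced; the scene is never inspected."""
    payload = hashlib.sha256(pixels).digest() + hashlib.sha256(depth_map).digest()
    return hmac.new(DEVICE_KEY, payload, hashlib.sha256).digest()


def verify_capture(pixels: bytes, depth_map: bytes, signature: bytes) -> bool:
    """Check that the bytes are unchanged since the device signed them."""
    return hmac.compare_digest(sign_capture(pixels, depth_map), signature)


# Placeholder byte strings standing in for real sensor output.
real_scene = (b"pixels: real scene", b"depth: real scene")
rephotographed_fake = (b"pixels: print of a deepfake", b"depth: whatever was recorded")

for pixels, depth in (real_scene, rephotographed_fake):
    signature = sign_capture(pixels, depth)
    # Both captures verify: the signature binds bytes to a device, not to the truth.
    print(verify_capture(pixels, depth, signature))
```

Verification only fails if the bytes change after signing. Whether the depth data reflects a real scene is a separate question that the signature cannot answer.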