
> Are you associated with deepfloyd?

No, I’m not affiliated with StabilityAI.

> It’s not a major leap not even a small one because it’s exactly like imagen.

I would agree if Imagen were a “consumer-available t2i model”. What’s available is a research paper with demo images from Google. The model itself is locked up inside Google, notionally because they haven’t solved filtering issues with it.

> Look at the afghan girl example again it. Close up portrait, same clothing, same comp, expressive eyes…

You look at it again; literally none of those things are the same. It’s not the same clothing (the material and color of the headscarf are different, and the headscarf is the only visible clothing in the DF image, which is not the case in the famous image), the condition of the headscarf is different, the hair color is different, the hair style is different, the hair texture is different, the face shape is different, the individual facial features are different, the eye color is much more brown in the DF image, the facial expression is different, the DF image has lipstick and eyeshadow while the famous image has a dirty face and no makeup, the headscarf is worn differently in the two images, the background is different, the lighting is different, and the faces are framed differently.

The similarities are (1) it’s a close-up portrait, (2) a general ethnic similarity, (3) they are both wearing a red (though very different red) headscarf, and (4) they are both looking straight into the camera. (2)-(4) are explicitly prompted; (1) is strongly implied by the prompt addressing nothing that isn’t related to the face/head. This isn’t “overfitting on a copyright image”; it’s getting what you prompt, with no other similarity to the existing image.

> You guys all want it to be something special and I get it,

I’m actually kind of annoyed, because I’ve been collecting tooling, checkpoints, and other support for, and spending quite a bit of time getting proficient in dealing with the quirks of, Stable Diffusion. But, that’s life.

> it’s neither a good architecture nor a good implementation.

I’d be interested in hearing your specific criticism of the architecture and implementation, but hopefully it’s more grounded in fact than your criticism of the one image...


