After generating 5000 images with these tools, I believe the killer app will be the one that gives the artist the most control. I want to define a view and a scene and be able to manipulate both in real time.
Like,
View: 50mm lens, wide angle
Scene: rectangular room with window -> show preview
Scene: add table -> show preview
Scene: move table left -> show preview
Scene: add mug on table -> show preview
View: center on mug
Right now, there’s little control and it’s a lot of random guessing, “Hmm what happens if I add these two terms?”
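Something like this could be prototyped today by compiling a tiny scene description down to a prompt and re-rendering a preview after every edit. A purely hypothetical sketch in Python; the Scene class and its naive prompt compilation are my own invention, not any existing tool:

    from dataclasses import dataclass, field

    @dataclass
    class Scene:
        view: str = ""
        objects: list = field(default_factory=list)

        def add(self, obj):
            # Each edit appends an object; a real tool would re-render a preview here.
            self.objects.append(obj)
            return self.to_prompt()

        def to_prompt(self):
            # Naive compilation to a flat prompt; a real tool would keep actual
            # spatial layout instead of flattening everything into words.
            return ", ".join([self.view] + self.objects)

    scene = Scene(view="50mm lens, wide angle")
    scene.add("rectangular room with a window")   # -> show preview
    scene.add("wooden table on the left")         # -> show preview
    scene.add("mug on the table")                 # -> show preview
    print(scene.to_prompt())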
Have you seen the img2img results? You draw a crappy Microsoft Paint-style image, give it some text describing how you want it to actually look, and it does the transformation.
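For reference, the basic img2img loop is only a few lines with Hugging Face diffusers. A rough sketch; the argument names assume a recent diffusers release, and the checkpoint id is just the usual SD 1.5 one, which may have moved:

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # The rough "MS Paint" drawing to transform.
    sketch = Image.open("rough_sketch.png").convert("RGB").resize((512, 512))

    result = pipe(
        prompt="cozy cabin at sunset, oil painting",
        image=sketch,
        strength=0.6,        # how far the model may wander from the sketch
        guidance_scale=7.5,
    ).images[0]
    result.save("out.png")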
Natural language alone is one of the worst ways to control image generation. The model knows how to generate anything, but its own "language" is nothing like yours. It's like writing in Finnish, contorting it so that it comes out as coherent Chinese poetry after a pass through Google Translate. You end up stuffing your input with assorted garbage and still not getting the result you want. img2img gives much better results because you can express your intent with higher-order tools than text alone.
What would be best is to properly integrate models like that into painting software like Krita. Imagine a brush that only affects freckles, blue teapots, fingers, or sharp corners (or anything else you can name in a prompt). Or a brush that learns your personal style and transfers it onto a rough sketch you make, speeding up the process. Many possibilities.
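The closest existing approximation of that "brush" is masked inpainting: paint a mask over the region you care about and let the prompt only apply there. A rough sketch with diffusers' inpainting pipeline; again, names assume a recent diffusers release and the usual SD inpainting checkpoint:

    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    image = Image.open("portrait.png").convert("RGB").resize((512, 512))
    # White pixels mark the brushed region; black pixels are left untouched.
    mask = Image.open("brush_mask.png").convert("RGB").resize((512, 512))

    out = pipe(prompt="freckles", image=image, mask_image=mask).images[0]
    out.save("portrait_freckles.png")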
I think they are already making an img2img plugin for Photoshop. Watch the demo, it's kind of impressive. [0] It's just a rudimentary prototype of what's possible with a properly trained model, but it already looks like a drop-in replacement for photobashing (as an example).
It's all about generation time. If generation were faster, the UI could preemptively show you a lot of variations based on suggested keywords, and you could also click things and get immediate results.
Currently it takes my mid-range PC (2070 Super) 10 seconds per image, which is too slow. You would need to get generation time below one second to be really productive. I guess you can already achieve that with something like triple 3090s?
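Rough arithmetic on that (the step count is an assumption, and extra GPUs mostly add throughput rather than cutting the latency of a single image):

    # Back-of-the-envelope only; assumes a typical 50-step sampler.
    steps = 50
    per_step = 10.0 / steps        # ~0.2 s per denoising step on the 2070 Super above
    target = 1.0                   # desired end-to-end seconds per image

    needed_steps = target / per_step
    print(per_step, needed_steps)  # 0.2, 5.0
    # So sub-second generation needs ~10x fewer steps or ~10x faster per-step
    # hardware; three GPUs running in parallel mainly raise throughput
    # (three images at a time), not the latency of any single image.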
I think the ideal UX will be the ability to mark up images with little comments and have the model adapt accordingly. The prompt interface is bad, one of the biggest reasons being that you have virtually no control over the spatial placement of your additions. Being able to say "add an elephant here and remove this lamp" will be big. Being able to do so with a doodle of an elephant to suggest posing will be even better.
Reminds me of the holodeck scene where Picard (edit: Geordi) reconstructs a table with what I, at the time, thought was a pretty vague set of specifications.
Turns out Star Trek predicted 2020s-style AI behaviour rather well. Considering nuclear war is then due in 2026, that's disconcerting.
An odd one, that. After all the lore (geddit?) about Data and his brother being unique and special for their unrivalled artificial intelligence, it turned out all you have to do to exceed that is just vaguely ask a standard-issue ship computer to do so.
I think the size of the Enterprise and its fusion reactor is quite an unfair advantage. Was Data really supposed to be smarter than the Enterprise, especially when it can read Data's mind state in order to fulfill the prompt?
I suppose the EMH is (or at least was, pre-mobile emitter from the future) a thin client for the Voyager computer.
Still, it seems odd that apparently only Data, Moriarty, and the Doctor have demonstrated that the Federation can actually make pretty general AI with the tools it already has on starships (and conveniently always on the ship with all those film crews on it making the Historical Records).
Surely under the crust of some Demon-class planet there's a bank of millions of times that power being used for... something.
There's probably a rule against making AI that you're allowed to break in the delta quadrant though.
There’s no direct canon confirmation, but it seems quite plausible that it was, in fact, the Bynars who provided the technological leaps necessary for the Enterprise computer to generate Moriarty and other proto-sentient characters. Riker and Picard both comment on the realism and perception of Minuet, created by the Bynars on the holodeck after their upgrades.
And there is a direct canon line from Moriarty through to the EMH and later sentient holograms via Lt. Barclay.
I make tools for artists and am afraid to incorporate AI generation, because I am pretty sure everyone will then just discount work created with my tool, assuming all of it was AI generated, and then no artists will want to use it.
What I am actually leaning towards is a tool for users to "enhance" art with AI, but only if the artist allows it.