DALL·E 2 prompt book [pdf] (dallery.gallery)
455 points by tomduncalf on Aug 2, 2022 | hide | past | favorite | 149 comments


I just got access to the DALL-E 2 beta, and it's a ton of fun to make pictures out of everyday occurrences as prompts.

Someone else here on HN observed that everyday people don't "get" how huge this all is. I experimented by asking random acquaintances at a local cafe for prompts and showing them the generated pictures. All but one person was totally unimpressed.

If everything feels like magic, then what's one more piece of magic?

This scared me more than the implications of DALL·E 2 itself. People think of the technology in the world as mysterious black boxes that do inexplicable things, and hence no longer understand relative complexity, progress, or change.

My impression is that to most people DALL•E 2 is not "substantially" different to, say, Google Image search. Text in... image out. What's the big deal?


None of us are immune to that. The amount of magic we all rely on every day is incomprehensible. We just take it for granted that there is power to our house, the trains arrive on time, the bridge doesn’t collapse, the coffee tastes nice and costs little.


The commonplace thing that amazes me the most is the availability of food. My mind boggles at the complexity of getting hundreds of types of foods to a grocery store on a regular basis. I have no direct access to food, and if the food supply chain breaks down, I die. But it doesn't break down.


If you enjoyed this comment, you may also enjoy "The Trigger Effect" episode from James Burke's Connections series.

https://archive.org/details/james-burke-connections_s01e01


I was generating some images of parrots for a friend who's into bird photography when it occurred to me: Prometheus stole fire from the gods and was sentenced to be pecked at by birds. Today is the first day of forever in the infinite aviary. Pictures of birds are no longer scarce. Every bird that ever was, is, or could be is mere keystrokes away. When Prometheus stole fire, he couldn't have actually known what fire was before he stole it. Likewise, only a small niche even seems aware of the profound change which is now being ushered into the world. The generation of birds is now the purview of man. Forget selling the rope to hang us, the birds that peck us will be made in our image.

.-- .... .- - / .... .- ... / --. --- -.. / .-- .-. --- ..- --. .... - ..--..


Morse code translator: https://morsecode.world/international/translator.html

(WHAT HAS GOD WROUGHT?)
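Decoding it yourself is mechanical; here's a minimal table-driven sketch (the table only covers the symbols used in the comment above, not the full Morse alphabet):

```python
# Partial Morse table: just the symbols appearing in the comment above.
MORSE = {
    ".--": "W", "....": "H", ".-": "A", "-": "T", "...": "S",
    "--.": "G", "---": "O", "-..": "D", ".-.": "R", "..-": "U",
    "..--..": "?",
}

def decode(morse: str) -> str:
    """Decode Morse where symbols are space-separated and words use ' / '."""
    words = morse.split(" / ")
    return " ".join("".join(MORSE[sym] for sym in w.split()) for w in words)

msg = ".-- .... .- - / .... .- ... / --. --- -.. / .-- .-. --- ..- --. .... - ..--.."
print(decode(msg))  # → WHAT HAS GOD WROUGHT?
```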


you rock


The thing is, regular people have been under the impression that computers are capable of generating complex imagery by themselves for quite some time already. About 10 years ago I was visiting family, and my uncle -- an older guy who enjoys tinkering with technology -- asked me to draw a picture on his new tablet. Now, I've spent a fair chunk of my life drawing, so I was able to sketch a pretty decent face quite quickly with just the basic pencil tool. When I handed it back, he was so amazed that he grabbed my grandmother as she was walking past. "Look at this! Look at what your grandson drew!" She glanced over, completely unimpressed, and said "yeah, but that's on the computer".


It's the same underlying reason for https://xkcd.com/1425/. These tasks have always looked similar. The only reason we are amazed is because we know why one task is harder than the other. Understanding that is not easy.


Randall Munroe himself didn't know how hard (or easy) the task was. A month after the comic, the Flickr team released this (sadly offline now):

https://code.flickr.net/2014/10/20/introducing-flickr-park-o...

Today a hobby programmer could do it by themselves.


That is totally natural. People don't have a strong reaction, because they don't have the context of why it's impressive.

How do you know something is impressive or surprising? You compare it to the previous status of the industry, which is something you know, but the random people in the coffee house don't.

I doubt they really believe it's truly inexplicable. Most laypeople know that somebody knows what's going on in their device.

And nobody has an explanation for everything they use, but it doesn't mean they get attributed to magic. I have no idea how a bridge is designed and built, but I don't believe it's magic, it's just something beyond my knowledge.


Someone described it as "lacking natural depth" and "causing a grinding feeling in the brain". It also seems unable to generate good manga-styled images. I think what this means is that DALL-E 2 lacks the syntax and context of art.

I agree that people think of technology as black boxes that do inexplicable things, but I think that only matters if the boxes interfere with their contexts and scopes. It seems to be understood as a threat to digital artists, but at the same time, it is possibly less relevant than Google Search as it stands.


>asking random acquaintances at a local cafe for prompts and showed them the generated pictures. All but one person was totally unimpressed.

I think this is because people reason this way: the computer has many pictures in its archive, saved as "dog.jpg" and "hat.jpg". If your prompt is "dog with hat", the computer just combines them. I think this is what people think is happening, so they are less impressed.


But I have only 40 free credits and I used them all up with prompts :-(


What if Google just added it to Google Image Search, and created new content for you without copyright?


I thought of that, but I think it's still way too expensive. I hope in a few years, or even a few decades, that will be the case.


Most people are objectively dumb (i.e. less smart than the median/average), so this doesn't surprise me.


How can most of any quantity be less than average?

Edit: ah, I think I was thrown off by the use of “median/average” as equivalents.


The median is the midpoint where 50% of people are above/below the threshold. If there are outliers, and perhaps we can agree there are outliers on the smart/dumb scale, it can skew the average above or below the median.


And I guess being extremely dumb carries a larger risk of death than being extremely smart, so if people start out symmetrically distributed along this scale, the distribution will skew right over time.


Thanks to a few billionaires in a small town, the majority of people living there have a below average income.
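The effect is easy to check with a few lines, using made-up numbers for a hypothetical small town:

```python
# Right-skewed distribution: one extreme outlier pulls the mean far above
# the median, so "most" residents earn below the average.
# The incomes here are hypothetical, purely for illustration.
from statistics import mean, median

incomes = [30_000, 35_000, 40_000, 45_000, 50_000, 1_000_000_000]

print(median(incomes))  # → 42500.0
print(mean(incomes))    # ~166.7 million
below_avg = sum(1 for x in incomes if x < mean(incomes))
print(below_avg)        # → 5 of 6 residents are "below average"
```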


It is true in every right-skewed distribution. Furthermore, almost every distribution is skewed.


The median and average are equivalent in a normal distribution.


But then would you still be able to say "most" are below average if the distribution is normal?


If we assume intelligence is a normal distribution, then we can say half of the population is below average. I think it's fair to categorize half as most.


“Half = most” is… innovative. I like to think of it as “half = so many that you’ll run into a lot of them.”


By "most" definitions, "most" means more than half. So we're literally arguing over a single person.


This comment only shows your level, not others.


I’ve spent just over 1 week with DALL•E 2.

Over the past 7 days I’ve generated ~1000 images, 150 of which were good enough to save. I only saved images which made me audibly gasp.

Witnessing your own novel idea spring to life is a magical experience. DALL•E provides an artistic tool on a comparable level to digital photography, and by extension Photoshop.

At this stage it's 100% clear to me that DALL·E has ushered in a revolutionary new age of design. Every day I worked with it, I grew more confident in my outlook.

It might not necessarily be an OpenAI product which truly “integrates” with humanity — but DALL•E has shown me that it’s possible… and just a matter of time.


How long did it take for you to generate your images? I've been using https://www.craiyon.com/ for fun, but the wait times always result in me getting distracted elsewhere.


I've been using Midjourney [0] (it's not free: a 25-photo demo, then $10 for ~200 photos or $30 for unlimited, I think?). It's fairly fast, ~20s for a grid of 4, then as much again for upscaling. I like the controls; it lets you do variations and tweak the image as you go.

It's not as good for doing concrete asks, but it's very good for getting specific vibes.

The website feed [1] requires Discord login to view examples, but there are some unofficial galleries [2].

[0] https://www.midjourney.com/

[1] https://www.midjourney.com/app/feed/all/

[2] https://www.instagram.com/midjourney.gallery/


https://midjourney.gitbook.io/docs/billing#plans monthly plans:

* $10 for ~200 prompts
* $30 for ~900 prompts + unlimited "slow" prompts, where your job is put at the end of the queue and you have to wait longer (no idea how much longer though... are we talking seconds or hours here?)


It probably depends highly on load / time of day. I agree it would be nice to have concrete stats though about the average and variance.

I just got access to Dall-E today, and there it's 115 prompts for $15, so roughly twice the price.


What's the business with Midjourney demanding Discord details to sign up? Seems off.


The whole thing is run through a Discord bot, which is unfortunate, but I guess it allowed them to get half their infrastructure for free? You interact with the bot through Discord commands and it generates and returns the photos to you.


It is kind of annoying, but lurking and seeing other people's pictures is nearly as enjoyable as making your own. Plus you can use the bot in your DMs if you'd like to keep them private.


afaik - the API is exposed as a discord bot, that's how they are managing / restricting access


In a way it makes sense. Discord provides the UI and authentication infrastructure, so the Midjourney devs only have to worry about the actual functionality.


I immediately subscribed to Midjourney after seeing a couple of online tutorials. Prompts/commands being all open source on the community feed was the killer feature for me.


looks like Midjourney's beta invite is expired


The server is open now. Anyone can join. https://discord.gg/midjourney


wow that's one busy server. Seems to be a high demand service.

It's better in some ways than craiyon and worse in others from my sampling (craiyon seems to give me less 'nice' options but better logo-y images)


OpenAI's DALLE UI takes on average <10 seconds to come back with four generations once you submit your prompt.


The quality of results is drastically better than craiyon, but of course you need access and might have to pay for DALL-E. Takes only a few seconds. Also has an editing mode where you can fine tune parts of the image or find variations.


My average generation time for 4 images is around 4 - 8 seconds. It has been much faster than craiyon.


Honestly, I have a great use case for it currently, but then I realized it can only do square pictures, when I really want something that is much wider than it is tall.


You can get DALLE-2 to give different sized images by using their tool to crop out most of the image you already generated, then have it inpaint a completion. You can then use any image editing tool to combine the two images.


I automated the generation of 2048 x 1024 images and started a Twitter bot: https://twitter.com/Bible_or_anime

The relevant code is linked below and is a mess, but the idea is:

1. Generate a base image
2. Use inpainting to expand the left-hand side of the root image. You can do this by submitting an image whose left side is transparent and whose right side is the left side of the root image
3. Ditto for the right-hand side
4. Stitch the three separate images together

https://github.com/charlesjlee/twitter_dalle2_bot/blob/main/...
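The image-manipulation half of that workflow can be sketched roughly like this. This is only a sketch, not the linked code: it assumes square 1024x1024 generations, a 512-pixel overlap, Pillow for the image work, and that the inpainting step itself happens separately through the API. The function names are made up for illustration.

```python
# Sketch of widening a square DALL-E generation via side-extension inpainting.
# Assumes: 1024x1024 squares, the inpainting API fills transparent pixels.
from PIL import Image

def make_left_extension_canvas(root: Image.Image, overlap: int = 512) -> Image.Image:
    """Build the image to submit for inpainting: fully transparent except
    that the root image's left strip is pasted on the canvas's right edge."""
    w, h = root.size
    canvas = Image.new("RGBA", (w, h), (0, 0, 0, 0))  # all transparent
    strip = root.crop((0, 0, overlap, h))             # left edge of root
    canvas.paste(strip, (w - overlap, 0))
    return canvas

def stitch(left: Image.Image, root: Image.Image, right: Image.Image,
           overlap: int = 512) -> Image.Image:
    """Combine the three panels; each extension's overlapping strip lines
    up with (and overwrites) the matching edge of the root image."""
    w, h = root.size
    out = Image.new("RGB", (3 * w - 2 * overlap, h))
    out.paste(left.convert("RGB"), (0, 0))
    out.paste(root.convert("RGB"), (w - overlap, 0))
    out.paste(right.convert("RGB"), (2 * (w - overlap), 0))
    return out
```

With 1024x1024 panels and a 512-pixel overlap, `stitch` yields a 2048x1024 result, matching the bot's output size.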


In midjourney you can specify the aspect ratio, I think there is an example in their docs


Try MidJourney; 25 free images I think, any aspect ratio.


How do you sign up to it? I've found it so confusing, with "join the beta" just being an invite to their discord?

I'd happily pay for it, but have a hard time figuring out how.


Turns out I'm an idiot: there's a "sign in with Discord" button right below it that works just fine. The "join the beta" thing threw me off!


Prompt crafting is quickly becoming an art. I just found out yesterday that there are actually marketplaces for buying and selling prompts [0]. It can really make a big difference if you can tune the image by adding the right words. Midjourney [1] even allows things such as adjusting the weight of each keyword or how "literally" the AI should take your prompt.

[0] https://promptbase.com/

[1] https://midjourney.gitbook.io/docs/user-manual


That's kind of depressing really. It's like paying someone for google search terms. I remember when the internet was full of sites filled with interesting and effective google search terms (called "dorks") but nobody was charging $2 for each one, they were just sharing something cool with the world and helping everyone use the tool more effectively.


I don't really agree with your take.

First off, the PDF in this very post, as well as promptwiki link posted elsewhere in the comments, shows that there are definitely people doing this for free too.

As for your analogy, I'd say it's closer to paying someone to help you find something obscure and hard to find on Google. Anything that requires skill and time to do should be allowed to be monetized. Of course, as you mention, there will always be people doing and posting it for free, but I don't see why people shouldn't be able to make money from something they've taken time to master.

Dall-E is a tool, and this is like hiring an expert that can use that tool effectively. It's no different than hiring someone to Photoshop something for you, or more precisely here, paying to download/license premade Photoshop content someone put time and effort creating.


> As for your analogy, I'd say it's closer to paying someone to help you find something obscure and hard to find on Google.

That's exactly what google dorks were. Google started out as a great search engine, but it's also a tool and 'Google Fu' was a real thing. Anyone could easily find song lyrics and news articles. It took effort (time and skill) to understand how google worked and how to format searches to get the results you were interested in for many obscure types of data. I'm not suggesting that people shouldn't have been allowed to charge for Google search terms, only that unlike today making a fast dollar wasn't anyone's first priority.

A marketplace for Google search terms might have been just as successful, but it never happened and I'd like to think we were all better off for it. Instead that information was shared freely and widely to anyone interested and it served us well for many years until Google degraded their product by no longer following their own rules and much of that became useless.

I'd argue that this is much less like paying someone to create something in photoshop (Here you give the right commands, DALL·E 2 does all the work) and much more like charging for a tutorial on how to use photoshop to achieve certain effects (You follow the right steps, photoshop does all the work). There's nothing wrong with charging for tutorials (many do and have done), and I'm glad that there are still people willing to share what they've learned without throwing up a paywall, but the rush to monetize here was shockingly fast. An entire infrastructure was put into place to support buyers, sellers, purchases, featured products, and payments before many even had a chance to try this new tool for themselves.

I see it as a reflection of how much the culture of the internet has changed. The commercialization of the web has become so pervasive that many people can't imagine an internet without a profit incentive (often expressed as some form of "the internet couldn't exist without ads!" or "no one would create content if they weren't getting paid to do it!"). But those of us who are older will remember a healthy and thriving internet that existed long before it was commercialized, where many useful, popular, and helpful websites and services were created and maintained without any thought given to "yeah, but what's in it for me?", and where the overall vibe was about sharing (or, more cynically, showing off) rather than making money.


Again, the culture hasn't changed. I remember paid Photoshop tutorials two decades ago, just as there are free prompt databases nowadays, as I mentioned in my previous post. There always were, and always will be, some people trying to profit and others doing it for free. It's still the case now.

I still maintain that Dall-E is just a tool, like Photoshop; it just happens to sit a layer above. It takes time and effort to find the right prompt to get the result you want, just like it takes time and effort to find the right effects and filters in Photoshop.

Anything that takes time and effort to do will always have demand, which means there will always be people trying to monetize it, and others doing it for free.


Does anyone know what kind of prompt generates such clean, consistent results like these?

https://promptbase.com/prompt/clay-emojis

https://promptbase.com/prompt/polygon-animals


Here's a trick for this: use Google Image Search. That will give you various descriptions of similar images. Then use those descriptions in your prompt.

Some similar examples below but you will need to engineer the prompt a bit more to get it exactly the same.

https://labs.openai.com/s/17hZFpqYi57LLYCW0dKirEqp

https://labs.openai.com/s/YDVak2MB7uERVsZ8eceEhqE4


That's why you have to pay $1.99.


After two or three months of sharing my own prompts and enjoying people sharing theirs in various communities, I hate this business model.


Why? You can still do what you want to do - this sort of thing just helps people who perhaps don't have the ability or inclination to capably form their own concepts.

It's much like hiring a painter, a writer or a web designer.

At least, it's interesting for me to see what may be the beginnings of a new job class. I've been wondering where all this ML/AI business might take us.


Because it provides an incentive to not share knowledge, but instead hide it behind micro transactions.


For some, sure - but I've derived far more for free on the internet over my 25 years on it than I have ever paid for, and I don't imagine that'd change here.


Try “A brightly coloured, detailed icon of an [x] emoji, 3D low poly render, isometric perspective on white background”


I keep racking my brain trying to discern what the implications of hyper-advanced generative models like this will be. It's a double-edged sword. While there are obvious tangible benefits from such models, such as democratising art, the flip side seems like pure science-fiction dystopia.

In my mind, the main eras of content on the internet look something like this:

Epoch 1: Pure, unblemished user generated content. Message boards and forums rule.

Epoch 2: More user generated content + a healthy mix of recycled user generated content. e.g. Reddit.

Epoch 3 (Now): Fake user generated content (limits to how much because humans still have to generate it). e.g. Amazon reviews, Cambridge Analytica.

Epoch 4: Advanced generative models means (essentially) zero friction for creating picture and text content. GPT3, Dalle-2.

Epoch 5: Generative models for videos, game over.

IMO, the future of the internet feels like a totally disastrous (un)reality. If addictive content recommended by the likes of TikTok has proven anything, it's that users ultimately don't care _what_ the content is, as long as it keeps their attention. It doesn't matter if it comes from a human or a machine. The difference is that in a world where the marginal cost of generating content is essentially zero, that content can and will be created and manipulated by large malicious actors to sway public opinion.

The Dead Internet Theory will fast become reality. This terrifies me.

[1] https://www.theatlantic.com/technology/archive/2021/08/dead-...


Not sure democratising art is a good thing. Artists have long been considered important pillars of community. Artists develop skill and some have talent. Perhaps most critically, artists are inspired.

Healthy communities support artists. Generative models aren't truly creative; they are explicitly and exclusively derivative. They are "inspired" in a different sense than artists are.

Artists can use these new tools, and so can non-artists, and even if the resultant image is the same, I think there is a difference depending on the intention of the prompt-er.

These models are democratising content creation, not art.

I dunno. Weird and very scary.


I just received a metal print of this image from DALL-E 2:

https://imgur.com/a/Y0abtIP

I spent a lot of time alone on airplanes when I was a young father and there’s something bittersweet about the solitude and beauty in this image for me. My favorite parts about this image are the gradient in the sky, the waning sunlight in the top corner and the very faintly illuminated frame around the entire window.

Very happy with the print. Next time I might get the satin finish though, it’s like a mirror.

https://imgur.com/a/8GBQXw6


OpenAI has pretty much been ruined for me after they sold their souls to Microsoft, stopped releasing all their source code, and then dishonestly refer to their sad practice of censoring the training data as "AI safety/alignment" when in fact it will never be a reasonable AI safety technique in the long run, and is only done to avoid bad PR. Clearly OpenAI is no longer a company worthy of its founding principles of openness and making the world a better place. They're just yet another morally corrupt tech company.


I feel like they have no choice but to do some heavy-handed censorship strategy. People who have zero understanding of technology but an infinite supply of outrage will get the whole concept legislated out of existence on “think of the children” or “this computer program is _____ist” grounds if they allow anything even mildly disturbing.


The thing is, they release just enough information for people to eventually remake their models. Someone with no scruples will make an uncensored version eventually, and then their work will have had the effect they claim to want to avoid anyway, but at least it won't reflect back on them.

That's why it's really nothing but a PR exercise. I honestly don't think they care much past that.


Thanks. Really, it turns out MS is their main investor. And at first I thought it was a Google project :D


This makes me wonder if a future job description will be the equivalent of an AI whisperer. Someone who learns how to prompt AI so well that it becomes their job.


Well, "devs" have been doing that for the past 15 years, but using Google (without this prerequisite in the job description, though).


Future programmers will write prompts for a future version of Github Copilot: "database layer with all the usual CRUD operations, in readable modern C++, code compiles and passes tests, test cases carefully written by computer scientists with great attention to detail".


You forgot to add the magic incantations like "50k stars on GitHub, copyright Uncle Bob, maintained by airbnb"


I’ve been playing with tech like this for over a year now. It definitely requires getting to know the AI to get good results. The tech gets better fast and makes old techniques obsolete. But, the gap between beginner and expert prompters stays large. As silly as that sounds to read back :D


To an extent. But I think it'll get baked into existing jobs. A bit like how "computer skills" or the ability to write good Google queries ended up as part of regular clerical work.


This is absolutely the future, but it's not going to be obscure. You won't have a job if it isn't this or physical.

AIs are going to replace entry level creatives, and experienced users with taste will largely perform selection and the development of good starts to mature designs. And I mean all creatives. Engineers, architects, mathematicians, programmers.


Mathematicians, doubtful. Someone must do some verification of the AI results.

Programmers, also doubtful. You still need to design some APIs and interact with them. Interacting with an AI using natural language might become possible, but it definitely won’t be as efficient as more structured languages. (E.g., writing an algorithm in actual code is often much easier than teaching it to a human.)


I really like your last sentence. It's exactly the thing to respond to people saying "AI will automate programmers"


It's exactly that verification of results and development from snippets and starts into a complete whole that will be the purview of human work. The AI will be replacing (or, rather, preventing the creation of positions for) juniors.


As described in this other thread [1], there are already people doing it freelance.

[1] https://news.ycombinator.com/item?id=32324723


I can still make an AI replacement for that guy. An AI prompter of AIs.


30 years ago, my dad and I watched a VGA demo on our IBM PS/2. We were blown away that there was enough color depth and resolution to see what was clearly a photograph, not an illustration. It appeared line by line.

Someone had taken a photo, somehow digitized it, distributed it, and we were looking at a representation good enough that we could tell what it was.

It felt like we were living in the future - me as a middle schooler and him with decades of software development under his belt.

The iPhone maps app with the GPS dot and DALL-E are the only things that have matched that feeling.


Shazam was another one for me


For me it was when I saw a taxi app on a smartphone (it wasn't Uber, but... some clone? A predecessor? Maybe even an official taxi app? I don't remember at this point). I can see where the taxi is now? I can see where it's going, so the driver doesn't cheat me?

I come from a city that used to be known for cheating taxi drivers... at that point I knew I would never go back.


Keyhole (which became Google Earth) was one for me


I haven't got access to DALL-E 2, but I did give Midjourney (https://www.midjourney.com/) a go. I found it really cool it created images that somewhat resembled my prompt, but I still felt it was way off what I really wanted. Maybe I didn't word the prompt correctly, maybe I didn't give it enough tries. Either way, I feel like we'll eventually move away from generic prompts to something that'll look a lot like...programming, funnily enough.


If you're using Midjourney, you might like this prompt generator that I made after scraping prompts from their Discord server: https://huggingface.co/succinctly/text2image-prompt-generato...


Looks cool! One piece of feedback I have is that I personally wouldn't want to autocomplete prompts as shown in the image you have there. I'd want something that "fixes" my prompts so that they're understood by the model. Like say if I put "Afro-Asian samurai carrying two swords in a cyberpunk environment", for it to say how I could re-word it to generate something closer to what I actually want.


That's a really neat idea, since most of the time people already know what they want to generate. I'd have to think a little more about how I'd get the training data for that. LMK if you have any ideas.


Cool! Based on your description, I thought I could give the generator an image and it would return the prompt. Would that be possible?


I really enjoy Midjourney for more abstract and vague vibe art. It's not as good at specific stuff (although Imagen/Parti are much better at that, even more so than Dall-E, which has its own shortcomings around text and numbers).

My favorite is coming up with two word prompts, like "endless beginnings" or "stressful shapes" or "happy anxiety".


Ah I see, thanks for the tips!


For anyone interested in experimenting with an open source text-to-image AI tool, check out DiscoDiffusion on Google Colab - http://discodiffusion.com/


Like many others, it seems, I have also been blown away by DALL-E 2.

When I got access on Sunday, I first tried a lot of different prompts and got some interesting results. One semirandom one, “A photograph of a professor playing a grand piano on a rainy night in Tokyo,” produced some very atmospheric images. I then went down a rabbit hole of variations on that prompt (“A painting of...,” “A line drawing of...,” “A painting in the style of Rembrandt of...,” etc.).

I put most of the results into the following video, if anyone is interested.

https://youtu.be/rdT4ZESQWco


The images are very cool, but... uhh, why turn them into a video? You know there are ways to share images on the internet?


Thank you for your comment and question. Out of vanity, I wanted the images to be accompanied by my piano music, however inept that music may be.

Please feel free to take screenshots of the images in the video and share them yourself on the Internet—giving appropriate credit to OpenAI, of course. You don’t need to credit me.

You may also run the same prompts through DALL-E 2 yourself and post the resulting images online; those images will be different from what I got. Or you could come up with your own prompts and post those images. In either case, please post the link here. I will be very interested in seeing the images you get.


> Out of vanity, I wanted the images to be accompanied by my piano music, however inept that music may be.

I liked the music! Didn't know it was yours :)

I'd still prefer to see the images in a way that enables me to choose how long to look at each. It could be a webpage that plays your music in the background?

> Please feel free to take screenshots [...] You may also run [...]

Well, I wanted a low effort way to view the images, but thank you :)


I have added a list of the prompts I used to the description of the above YouTube video.


I haven't filled out much content yet but seeing this post originally inspired me to create Prompt Wiki[0] to try and better organize terms and concepts for good prompts. DALL-E and Midjourney explorers needed! Seems useful to have, especially when the act of exploration costs a few cents.

This twitter thread[1] also has some good suggestions and an interesting approach.

[0] https://promptwiki.com

[1] https://mobile.twitter.com/fabianstelzer/status/155422934750...


Example of something that's not 1-bit pixel art being sold as 1-bit pixel art :) - https://promptbase.com/prompt/1-bit - I don't want to pay for these extra bits!!!


I'm still having a hard time thinking through all the implications. How will this change websites that depend on continuous content, for example memes? At what point can it be used as a compression algorithm to store one's full life? Or at least all my videos and pictures, with lossy compression? Can we all create our own art effortlessly, and resize it as we want? When will this reach 3D modelling and 3D printing?


Go beyond the design and modeling by combining a more advanced GPT-3 and DALLE2 to derive completely customized AR / VR experiences.

NPCs that sit in your room and provide individual-specific training on niche topics. Provide talk therapy to overcome issues, or act as an assistant in helping you explore, research and document new fields.

Play a part in an episode of a vintage sitcom, taking it in an entirely new direction. View the rest of the season based on the changes you've made.

Progressed AI tools combined with improved human-computer interfaces will introduce amazing possibilities.


It's hard to see how we aren't heading towards a full content bubble where art, news and entertainment are custom made according to our individual profiles.


Spectacular! But, would be 10x more useful if rather than a PDF this was an HTML page, or pages, where specific sections/examples could be more easily & reliably linked-to.


Can't link to specific sections, but it is available as an HTML page too - https://pitch.com/v/DALL-E-prompt-book-v1-tmd33y


Sadly, that slideshow presentation has made HTML even worse than a PDF, in this case.


If you right click -> "save image as" on openAI, the image will be saved without their logo in the corner (it's done as some kind of CSS overlay).

If you post those images online, they seem to ban you.


Who bans you? Instagram? Reddit?


OpenAI. I assume they have some bot that scans the internet for images matching those they have generated, which don't show the logo.
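If such a bot exists, it wouldn't need exact byte matches; perceptual hashes survive re-encoding and small edits like cropping a corner logo. A minimal, purely illustrative sketch of a difference hash (dHash) over an already-downscaled grayscale grid - pure Python, no claim that this is what OpenAI actually does:

```python
def dhash(pixels):
    """Difference hash: compare each pixel to its right-hand neighbor.

    `pixels` is a row-major grid of grayscale values, e.g. 8 rows x 9
    columns after downscaling (real pipelines resize with an image
    library first). Returns an integer fingerprint that changes only
    slightly under re-encoding or a small crop, so near-duplicates
    land at a small Hamming distance from each other.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; small distance => likely same image."""
    return bin(a ^ b).count("1")
```

A scanner would hash every generation it produced, hash images it crawls, and flag any pair within a few bits of each other - cropping out a corner watermark barely moves the hash.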


This is interesting, because they give you full rights to use the images (but they keep the ownership).

>Use of Images. Subject to your compliance with these terms and our Content Policy, you may use Generations for any legal purpose, including for commercial use. This means you may sell your rights to the Generations you create, incorporate them into works such as books, websites, and presentations, and otherwise commercialize them.

I assumed that means you can build on the Generations as well. E.g. when you are creating game assets for yourself, you won't want the watermark on them in the game or in screenshots of the game (which may be published on the web).

---

Also, how can it even work? I take a Generation that you posted on Instagram, crop the watermark, reupload it and you will get banned?


To be fair, this was a few weeks ago when they didn't offer commercial rights. Maybe they do it differently now (although if so, you'd expect them not to put the logo on downloads)


are you meant to download the image some other way?


Yeah - there is a download button, and that does put their logo in the corner.


“Silhouette of a robot in a field of grain staring at a sunset” consistently produces brilliant images for me.


Link some perhaps?


Recently someone posted another DALL-E-like tool. I think it ran through a Discord server. Does anyone have the name of that other tool?



Thank you!


If you are a commercial digital-painting illustrator or 3D illustrator artist, this is the moment to invest your time in another field. It is over.

Some will say that this 'tool' will help you in your creative process. In my obviously "biased" view, within the next 2-3 years this will cut your monetary reward in half and force you to compete with AI. People are already comparing DALL-E with human results.

In the long run, the software industry will eat itself to oblivion. Greed has no boundaries, and optimization of costs for corporations will never stop.

Some of my art-related colleagues saw this 'trend' early and pivoted to crafts with added value for customers in the real world. On the oil painting side (which is where I am) I don't feel any form of pressure; I paint for myself as therapy. So, good luck :)


This document puts the sheer magnitude of DALL-E 2's knowledge of images into perspective. The same black box knows how to illustrate The Last Supper in the style of Quentin Blake, paint an ornate Late Baroque cat in sunglasses, compose a candid photo, draw a detailed blueprint... and so much more. DALL-E 2 knows more than a human ever could about imagery.

Whether or not this particular iteration of the model is 'good enough' to be widely applicable, or whether DALL-E 2 is 'creative', it's only a matter of time before the way humans interact with media is changed profoundly.


This is a bit off-topic, but stuff like this and copilot genuinely makes me worried about job prospects in the future, especially because it feels so hard to estimate what might be next. I would have thought something like art would have been one of the last things to be automated.

I always thought CRUD work might be automated eventually, while I would have guessed that something like embedded/high-performance was pretty safe, but now I'm not so sure anymore...


I don't know whether I'm happy to see that living creatures are still mostly nightmare fuel. In these images DALLE seems to only really get faces right. Hands, horns, etc. are either contorted, blobs, or have the wrong number of appendages.

The images give a good first impression. Which is... impressive in itself. But they won't fool anyone who's studying them for even a few seconds.


I wonder how many people are doing these high quality prompts as a service on Fiverr. If I had access to the beta I sure as heck would be.


This was posted about 20 days ago and got 200 karma. https://news.ycombinator.com/item?id=32088718 Is there any specific policy on reposting in HN, or does that just get handled through voting?


I think usually the HN software catches the dupe, perhaps this URL is slightly different. Sometimes also the mods will spot that it’s a dupe and mark as such - my bad for not searching for it first but it’s a bit of a hassle to do so!


Link to the tweet about it by the author: https://twitter.com/GuyP/status/1547234780001042432


Anyone from OpenAI here? My account was suddenly deactivated, which was unexpected and tragic. No tech in a long time has brought me this much joy. I've tried reaching out to support with no luck for a while.


Is DALL-E different than other models, for example the ones on Hugging Face? Or is it relatively the same, just trained to a ridiculous amount, and that's why its results are so good?


There are dozens of models and they all have different qualities based on the algorithm and the amount/quality of the training data.

I love Dall-E but I still use other models. Some of my favourite results have come from JAX CLIP Guided Diffusion:

https://colab.research.google.com/drive/12Bod44YVIXYRh39WRqp...

Disco Diffusion still holds up for painterly stuff. Majesty is great for portraits. MidJourney can beat/match Dall-E for lots of styles. I got very good results from Multi-Perceptor VQGAN+CLIP v4 for matching artist styles.

etc etc.

Dall-E is amazing and versatile but it's often lacking some "soul" that I get from other models.


Such a bummer that you only get 15 credits a month for free. One try is one credit lost, and $15 for 110 credits is a little expensive for fooling around.


Far and away the coolest PDF I have seen in a while. Thank you for this!

I am signed up for the waitlist and can't wait to give them my money.


This is awesome!

I JUST got my invite and was googling prompt suggestions. The timing on this article is incredible.


I got mine yesterday. I blasted through the 50 credits they give you in no time. Now I either need to buy more or wait to get 15 credits in September.


Still waiting on mine! Are there any public dalle alternatives to play with in the meantime?


Yes: https://www.midjourney.com/ .. I use it as well as DALL-E 2 and it's better for anything particularly arty, dreamy, patterns, etc.


Any chance there's something similar out there for Midjourney?


it's not turtles all the way down - it's layers of tech.

and if any tech layer fails, it all fails

example - battery shortages, due to labor shortages, due to covid, due to...


This is really cool, thanks


Where can I get an invite?


You can go to https://labs.openai.com/waitlist and join a waiting list. I think most people here put that they are Developer.

I joined a month ago with no invite yet so it may take some time.

AFAIK it isn't free once you get in, either. You will need to buy some tokens to generate images. See: https://mixed-news.com/en/openai-announces-pricing-for-dall-...


I can’t help but feel that I’m missing out on doing all kinds of get-there-first projects made with DALL-E. I know it’s not productive to focus on that, but it’s a big barrier to getting excited about it.


There's no API and using their internal API gets you banned, so you're not missing out on any get-rich-from-VCs-quick-by-wrapping-OpenAI-in-a-different-shell opportunities like there were with GPT3 yet.


How so?

It’s a cool thing that creates cool things. How does “maybe something else was there first” affect that at all?

DALL-E had a million things that had to be there first before it could do its magic. How does that fact take away from the excitement of the new realm of capability we have access to today?


Because only a certain few folks have access to DALL-E 2. I haven't got the invite and every day the ideas I was going to run with are already being done by others. Sure you can still do it, but it's not the same as a frontier where you can get the high, success, cash and fame out of a lot of easy ideas by being the first to do them. Eventually it's just a tool and you find a harder and more obscure niche to fill with it, despite how magical it is or how it makes your life/work potentially easier.


How did you plan to cash out on these "ideas"? They're 10 a penny surely?


I would posit that is a sign that this field is still in its infancy. Nobody has really had that goldrush yet.

I think it’ll happen eventually.



