I'm going to try to plead my case for images generated using sophisticated prompt engineering to be copyrightable. For example, once I've written a prompt with 20 tags, 10 negative prompt tags, some LoRAs, custom weights, embedding merges, and prompt editing, I'm writing what is effectively a "program", which should be copyrightable, and so should its outputs.
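To make the "prompt as a program" framing concrete, here is a minimal sketch of such a prompt as structured data. Everything in it is an assumption for illustration: the `WeightedTag` and `PromptSpec` names are hypothetical, and the `(tag:1.2)` and `<lora:name:0.8>` renderings are only loosely modeled on the syntax some Stable Diffusion front ends accept, so check your own tool's documentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class WeightedTag:
    """One prompt tag with an emphasis weight (1.0 means no emphasis)."""
    text: str
    weight: float = 1.0

    def render(self) -> str:
        # Hypothetical "(tag:weight)" emphasis syntax.
        return self.text if self.weight == 1.0 else f"({self.text}:{self.weight})"


@dataclass
class PromptSpec:
    """Hypothetical container for the choices the prompt author made by hand."""
    positive: List[WeightedTag] = field(default_factory=list)
    negative: List[WeightedTag] = field(default_factory=list)
    loras: Dict[str, float] = field(default_factory=dict)  # name -> strength

    def render_positive(self) -> str:
        parts = [tag.render() for tag in self.positive]
        # Hypothetical "<lora:name:strength>" syntax for add-on weights.
        parts += [f"<lora:{name}:{strength}>" for name, strength in self.loras.items()]
        return ", ".join(parts)

    def render_negative(self) -> str:
        return ", ".join(tag.render() for tag in self.negative)


spec = PromptSpec(
    positive=[WeightedTag("castle"), WeightedTag("fog", 1.2)],
    negative=[WeightedTag("blurry")],
    loras={"inkStyle": 0.8},
)
print(spec.render_positive())
print(spec.render_negative())
```

Whether writing this down amounts to authorship of the output is exactly what the rest of the thread argues about; the sketch only shows how many independent choices such a prompt can encode.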
It's total BS to me that a book of Midjourney-generated images is itself copyrightable because a human arranged the book, but that a highly sophisticated prompt involving custom tooling wouldn't be.
If nothing else, my comments should show the US Copyright Office how deep the rabbit hole goes with just how interpolatable everything is with everything else.
Copyright is, like all regulation, a restriction of freedom for the benefit of society. Producing movies and books takes a lot of investment, and people would not do it to the extent they do without copyright. If we think there is more than enough creative content, we should reduce copyright protections, and if we think there is not enough, we should increase them. Generative AI will move the needle very far in the direction of overabundance and should result in correspondingly reduced copyright protections.
The thing you're authoring is just your prompt. Apply for a copyright for it. The image or text generated in reply is generated by a computer with no more input from you than someone commissioning work using very precise words, and lacks human authorship.
This falls apart when you start getting into things like ControlNet and posing, or iterative erasing, reprompting, and in/outfilling. It starts feeling more like some weird combination of 3D modeling, Photoshop, and a really advanced autofill.
If you keep prompting an artist for edits over and over again for more changes, you don't suddenly own the copyright.
The lesson you should learn is that advanced AI autofill in Photoshop may lack human authorship, but it wasn't good enough to become a serious copyright issue until those tools came into existence.
It's not the same. If you make exactly the same mouse movements in Paint 100 times, you will get 100 identical images. If you enter the exact same Midjourney prompt 100 times, you'll probably get 100 different resulting images. The relationship between your authorship and the final image in the two cases is quite different.
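The difference between the two cases mostly comes down to a random seed. The sketch below is a toy stand-in, not a real diffusion model: `toy_generate` is a hypothetical function whose "image" is just a short list of pixel values derived from the prompt and a seed. It only illustrates that the same prompt with the same seed repeats exactly, while a fresh seed on each run gives different results. Whether a hosted service such as Midjourney lets you pin the seed, and how reproducible the result then is, is not something this sketch claims.

```python
import hashlib
import random
from typing import List


def toy_generate(prompt: str, seed: int, size: int = 16) -> List[int]:
    """Toy 'image': pixel values fully determined by (prompt, seed)."""
    digest = hashlib.sha256(f"{prompt}|{seed}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.randrange(256) for _ in range(size)]


prompt = "an image of a dog"

# Fixed seed: 100 runs collapse to a single distinct result.
fixed = {tuple(toy_generate(prompt, seed=42)) for _ in range(100)}

# Fresh seed per run, as when the tool picks one for you.
fresh = {tuple(toy_generate(prompt, seed=s)) for s in range(100)}

print(len(fixed), len(fresh))
```

In this toy, the only thing separating "100 identical images" from "100 different images" is who chose the seed.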
Your mouse doesn't make decisions for you. ML-based art does, which is why it lacks human authorship and you shouldn't be able to copyright it.
If you hand-painted something in Photoshop 100%, you can copyright it. It has human authorship. If it's mostly AI-based fill, those elements can't be copyrighted. If it's 100% an AI result, it's public domain.
The training set for Stable Diffusion has a lot of copyrighted images, which get mixed together into an output in a way that makes the plagiarism non-obvious, but let's reduce the set by 1 image. Shouldn't affect the output too much, right? Maybe some prompt will have a slightly different image.
Now let's reduce it by another image. Again, fewer options for what to display, fewer images to take pixels from, but still a lot of options, and the output may still seem copyrightable.
Now let's do that N-1 times. What output will we get when the model was trained on a single image, let's say an image that is labeled 'dog'? If your prompt is "an image of a dog" you will get that image, the only image in the training set. When going from latent space to image space, taking pixels from that image in the output, despite it being done in convoluted ways, is that not an obvious copyright infringement? I think it is. There's a cloud of mumbo jumbo about latent space, but after the dust settles and it needs to generate pixels in the output image, Stable Diffusion has a step that is essentially copying pixels from the source image into the output. When there's only 1 image, it will reproduce large portions of that image, necessarily infringing on copyright.
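The single-image limit of this thought experiment can be sketched with a deliberately crude model. This is an assumption-laden toy, not how Stable Diffusion works: the hypothetical `fit_mean_image` "trains" one value per pixel by minimizing squared error against the training set, and the optimum of that objective is simply the per-pixel mean. With one training image the output is that image; with more images it becomes a blend.

```python
from typing import List, Sequence


def fit_mean_image(training_set: Sequence[Sequence[float]]) -> List[float]:
    """Toy 'model': the per-pixel value minimizing squared error over the set.

    For squared error that minimizer is the per-pixel mean of the images.
    """
    n = len(training_set)
    return [sum(pixels) / n for pixels in zip(*training_set)]


# Hypothetical 3-pixel "images".
dog = [10.0, 200.0, 30.0]
cat = [50.0, 100.0, 90.0]

one_image_model = fit_mean_image([dog])        # trained on a single image
two_image_model = fit_mean_image([dog, cat])   # trained on two images

print(one_image_model)
print(two_image_model)
```

The one-image case reproduces its training image exactly, which is the degenerate case the comment describes. Whether a real diffusion model trained on millions of images behaves like a scaled-up version of this blend is precisely what the replies dispute.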
So then adding back images one by one into the training set, each one being used as source for the pixels being copied, what makes that model OK? Just because the output is 50% image A and 50% image B, or 0.1% image A and 0.1% image B and 99.8% image C, doesn't suddenly make it OK.
Once there are millions of images, you end up with just tiny blobs of pixels being copied from many different images. That still infringes on the copyright of all those images, because it's essentially a map-reduce process that maps pixels from copyrighted images and reduces them into a single image.
This viewpoint is about as coherent as "every image file is copyright infringing because every pixel in it exists somewhere in some other image somewhere".
Derivative works, when substantially changed, are not infringing. If I take an image of the Mona Lisa and rearrange all its pixels so it looks like a picture of a cat, that's not infringement.
If I sample lines and curves and colors and styles from several images and make something new, that's not infringement.
The actual problem with image models is that they can sometimes be coaxed into outputting images that are quite similar to an image they were trained on. That constitutes infringement.
You're not a computer program and your viewpoint is about as valid as "cars don't need speed limits because most humans can't run faster than 10mph and that speed is safe".
If 1 in 100 humans could run up to 100mph you bet your ass there'd be laws against doing so around other people; it's a safety concern. Hell, even now running in most indoor or crowded areas is, if not illegal, at least considered bad behavior and may get you reprimanded or thrown out.
Some people claim to have a photographic memory. Supposing this is true, is it illegal for these people to look at copyrighted material because they may reproduce it later from the copy in their head? Of course not, it's the actual act of producing that copy that isn't allowed.
Of course, we're not talking about a computer program that stores a copy of an image and reproduces it later (that's called an "image encoder"); we're talking about statistical software that identifies common patterns in images and associations between those patterns and human language descriptions of the images containing them. It doesn't store or make a copy of the images it learns from, and it should only be able to reproduce images or elements of images that are overrepresented in its training data. Like any other software tool, if someone manages to use it to make an unauthorized copy of someone else's work, whether it was present in the training data or otherwise, then the user has infringed the other person's copyright. The only real argument you could make is that distribution of a trained model constitutes distribution of a tool aimed at assisting users in unlawful copying, but IMO that would apply more easily to wget than Stable Diffusion.
Copyright laws were made to encourage and promote the creation and practice of useful arts. Applying them to stop the creation and adoption of a tool that would make humans far more efficient in the creation of art is backwards.
Let's take the same hypothetical that you brought up, using other people's art, but instead our model just takes 1 single pixel from each of 1 million images.
Taking 1 single pixel from a million images, or the first letter from every book, and putting it into a new work is transformative fair use.
Transformative fair use is legal.
> Just because the output is 50% image A and 50% image B, or 0.1% image A and 0.1% image B and 99.8% image C, doesn't suddenly make it OK.
It quite literally does! Using 0.1% of an image is legal.
The amount of the work that you take from someone else is one of the 4 factors of fair use.
Yes, the specific example you gave falls under what the courts literally use right now as one of the factors!
> Once there are millions of images, you end up with just tiny blobs of pixels being copied from many different images.
This is not how these neural nets work. They don't copy pixels from anywhere. They learn features.
The features represented internally are generally not easy to interpret to humans, but for sake of illustration, there could be an artificial neuron that fires when a subject should have blue eyes. Having a lot of blue eyes in the training data would help this neuron learn better when to fire (based on the values of other neurons, which may in turn represent other features). For example, it may learn to place more importance on an input that represents pale skin or Nordic origin.
It can learn concepts like cars have wheels, and wheels are round, etc. And then when you ask it to draw a car, it composes one from the concepts it learned. Some parts of the network will deal with the fine details that more directly influence pixels, but these aren't copying pixels from any image either. They're weighing a bunch of factors (eg is this pixel part of the iris and did the network decide to make a person with blue eyes?) and choosing pixel colors based on those factors.
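As a rough sketch of the "blue eyes" neuron described above: a single artificial neuron is a weighted sum of its inputs passed through a squashing function. The feature names, weights, and bias below are made up for illustration; in a real network they are learned, and they rarely map onto such tidy concepts.

```python
import math
from typing import Sequence


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of inputs followed by a sigmoid, giving a value in (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical upstream features: [pale_skin, nordic_origin]
weights = [2.0, 1.5]
bias = -2.0

strong = neuron([1.0, 1.0], weights, bias)  # both features present
weak = neuron([0.0, 0.0], weights, bias)    # neither feature present
print(round(strong, 3), round(weak, 3))
```

Nothing in the neuron stores a pixel from any training image. All it holds is three numbers (two weights and a bias) that training would adjust.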
Thank you for the explanation. Let me explain my position in similar terms.
I'm not replicating an image, I'm "using my brain to build a network of neurons that map electrical impulses from the optical nerve excited by wavelengths projected onto my retina in order to send other electrical signals to actuator tissues".
The complexity of the process is irrelevant imo. We can treat it as a black box and look at the inputs and outputs.
If the images in the database didn't exist, it wouldn't know what to draw, and those images are copyrighted.
Everyone's welcome to take a camera, run around the world and label every object for the neural net to learn, like a human does, but model authors didn't do that because using copyrighted images for free is much easier.
You're right that if you're replicating an existing copyrighted image, the process doesn't matter. Legally, if you lived in a cave your whole life and never saw any art and by amazing coincidence you just happened to paint and sell the exact same painting as some other artist, you'd be violating their copyright. Independent creation doesn't protect you.
On the other hand, under current copyright law, if Stable Diffusion generates an original image that doesn't look like a copy of any existing image, it's clear the new image doesn't violate any artist's copyright.
The debate is whether you can use copyrighted images/text to train an AI.
Stable Diffusion is of course trained on millions of photos of the real world, in addition to images made by artists. Of course, human artists also see and digest both the real world and images by other artists, and both influence their output. That's why you get trends like impressionism.
You are describing transformative use, which is permitted. Otherwise I could create a picture with every possible RGB pixel and then claim all other artists are infringing on 0.1% of my work.
It is impossible to 1:1 replicate the input as an output because the images are not stored. It isn't a database. It's basically aggregating summaries/abstractions/generalizations of a bunch of tags.
I personally feel that, given the training was fair use of mostly copyrighted images, it's fairly self-evident that anything produced from the model should not be copyrightable, UNLESS the art used to train the model is 100% owned by the "artist". That could be either via licensing for that purpose or because they own the copyright themselves, with that right not extendable to corporations, as a corporation can't be an artist. It doesn't matter how complex the prompt or series of prompts is; the key here is that the "artist" either owns the training material or licensed it through the proper chain of licensors.
I'm baffled at how someone in their right mind still argues as if using any other tool is free of copyright infringement of some kind.
It also boggles my mind how, in our line of work (which is often artistic in its own right), a lot of people hold preconceptions about how art is made, often reducing it to nothing but transformative generation. Such takes are deeply narcissistic, and downright wrong. At this point I'm led to believe they're AI generated.
Okay but does that include the seed? The sum of the input model datasets? I think a better scenario is no copyright for any part of this process. Give humanity what they’re going to take anyway: open access to this tech.
No, you're passing inputs to a program, and your description completely omits the vast majority of that input: the creative output of an unknown number of other people, whose rights you are attempting to launder.
That's nice and all. But it has nothing to do with whether something is copyrightable or not.
Instead, something is copyrightable based on the amount of human input into the process.
And it is very clear that there can be a lot of human input into AI image generation.
I will concede, though, that going into Midjourney and just typing in "Hot anime girl" isn't a lot of input and likely doesn't deserve copyright protection.
But there can be so much more to AI art than the boring case of low-effort Midjourney prompts.
So then yes, it is about the human input into the process. That's what I just said.
Having large amounts of human input is the thing that matters for this stuff, which is the case for many forms of AI art.
In the same way that Photoshop uses a computer, and the computer creates the art, the resulting computer-generated art can still have copyright protection (because of the large amount of human input, even though yes, it used a computer).
The product of your human work is the input, not the ai generated image. You are merely commissioning some system to do the work on which you have very little actual future authorship except the commission. For art we don't grant the copyright to someone commissioning work, they have to negotiate with the artist for that.
But your artist is not human, so cannot create copyrightable works, so you can't even bargain for the right. Your copyrightable prompt just created a public domain result.
As for your comparison with photoshop, you have it backwards. The lesson you should learn is that if you fill portions of a work using something that starts authoring parts of the image, you should lose the ability to copyright those parts of the image, because you didn't author them.
Just like other works that are a mix of public domain and copyrighted elements, you can only copyright the human authored work. It's like making a comic with AI pictures - the images themselves are public domain (assuming you haven't forgotten to license the use of those works in your computer system that generates the image for you), you assembling the work into a comic is what you own the copyright to.
The characters and images designed by a machine remain public domain, no matter if you prompted all of them.
> The lesson you should learn is that if you fill portions of a work using something that starts authoring parts of the image, you should lose the ability to copyright those parts of the image, because you didn't author them.
In your opinion then, using this same line of logic, Photoshop is not protected.
The courts disagree with you though.
Using your line of logic, you could say that the computer is authoring the work using Photoshop.
It is the computer printing out the picture using bits and bytes. That's not a human! That's a computer program named Photoshop!
Then follow the exact line of logic from there.
> characters and images designed by a machine remain public domain,
We know this to be false, though, because a character created in Photoshop is made on a computer.
Therefore, it is clear that yes a machine can be used in the process, unless you are going to claim that Photoshop is not protected because it is run on a computer.
I used words very carefully. It depends on the level of human authorship.
If you use a tool to correct some pixels thats directed by a human closely, no thats not a problem.
If you remove large parts of the image with the ai erase fill, then you've given up authorship of those parts of the image. You could then go in and author changes to the work that you could further add to your copyright. But you would never change the copyright status of the stuff authored by a machine.
You're using 'Photoshop' on a naive level without looking at the most important element - how much authorship is the human having. You can basically 'hand paint' a picture in photoshop, or you can use the ai tools and have almost no authorship.
Then I am happy to use your word if that clarifies things.
Just replace everything that I said about "human input" with "human authorship".
And my point is that there are many things that a human can do using AI art that have large amounts of "human authorship" beyond just the boring case of prompting midjourney with a dumb prompt.
> It depends on the level of human authorship.
Oh hey! Yes that is exactly my point.
That point being that just like Photoshop images are copyrightable, because there is human authorship, so can AI art, if there is human authorship.
Glad you agree.
> thats directed by a human closely
Ok! You agree with me then! That's my point!
My point is that AI art can be directed by a human closely and that there is so much more that can be done than a simple prompt into Midjourney.
You agree with my central point.
> If you remove large parts of the image with the ai erase fill, then you've given up authorship
Not if you "direct it closely"! Then it's protected.
> how much authorship is the human having
That is exactly what I am talking about, though, and what I have said multiple times.
That there are lots of things that a human can do, related to AI art that are directed closely, and that these things are authorship.
But I am glad that you agree with me that if it is directed closely then it is protected, which was my point and that you can do this with AI.
It's clear from this post that you aren't at all here in good faith, just to be glib and purposely misconstrue other people's words. Good luck with life.
You can "hand paint" in Photoshop, but you can also assemble collages composed of bits and pieces of other copyrighted works and create a result that is still copyrightable. Why is the latter currently legal in your opinion?
If the things you're collaging can't be copyrighted because they lack human authorship, you can only copyright the arrangement, not the things you're arranging.
If they have human authorship, it's the original human artist's work you're collaging, and now you're creating an unauthorized derivative work violating many copyrights.
If you do not license the "bits and pieces" then the courts will find you in violation of their copyrights. Pretending AI is different is bizarre out-of-touch "but I'm so special" solipsism.
I believe you are wrong there. Transformative art requires no licensing as long as it falls under fair use.
I can quote a book in my article without licensing the quote from the author. I can clip eyebrows off of copyrighted magazine portraits and assemble them into an eyebrow version of some famous art piece and never have to license a thing. I can take screenshots of copyrighted YouTube videos and assemble a "shirts of YouTubers" that I lasso tool'd and collaged together and create an entirely new copyrighted work without having to license a thing. I can take a photo of a street which contains an art gallery and copyrighted art can appear in my photo without having to license anything from the artist. Fair use would cover taking 1 out of 1 million pixels and assembling it into a new image if a human were to perform that action.
Sampling is an art form if properly attributed (and possibly even without; whole genres are built on the premise), especially if it elevates the original work.
That's not to say copyrighting ML derived creative works is the way forward, but that creativity has bearing wherever a 'medium' can be manipulated.