A checkpoint trained without copyrighted material?

#15
by MatthewWaller - opened

I would like to be able to respect the copyright of artists whose work was used to train the model. If possible, I would like a checkpoint that is trained only on non-copyrighted images. I think this might be the only solution 😔

I've thought of giving part of the proceeds from my use of Stable Diffusion to artists in some manner, of setting thresholds for how likely a generated work is to belong to a particular artist, and of limiting artist names in the text prompt, but nothing would work as cleanly as a checkpoint trained without copyrighted images.
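For what it's worth, the prompt-limiting idea is easy to prototype but hard to make watertight. Here is a minimal sketch, assuming a hand-curated blocklist; the names and the `filter_artist_names` helper are made up for illustration and are not part of any Stable Diffusion release:

```python
import re

# Hypothetical, hand-maintained blocklist of living artists' names.
# Keeping such a list current is itself hard, which is part of why
# prompt filtering is a leaky mitigation.
BLOCKED_ARTIST_NAMES = {
    "some living artist",
    "another living artist",
}

def filter_artist_names(prompt: str) -> str:
    """Strip blocklisted artist names from a text prompt, case-insensitively."""
    filtered = prompt
    for name in BLOCKED_ARTIST_NAMES:
        # Word boundaries so substrings of unrelated words are left alone.
        filtered = re.sub(rf"\b{re.escape(name)}\b", "", filtered,
                          flags=re.IGNORECASE)
    # Collapse any double spaces left behind by removals.
    return re.sub(r"\s{2,}", " ", filtered).strip()

print(filter_artist_names("a castle at dusk by Some Living Artist, oil on canvas"))
# -> "a castle at dusk by , oil on canvas"
```

Even a filter like this does nothing about misspellings, aliases, or style descriptions that evoke an artist without naming them, which is exactly why it doesn't work as cleanly as a clean checkpoint.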

It would also give other developers an ethical option as Stable Diffusion proliferates.

I think the answer is to treat these models, and what we create with them, as a commons. I agree with not making money off artists without their agreement, but expanding the world of non-commercial art is only a good thing. Use SD for parody, education, empowering small creators, and undermining the big IP holders, who will probably start training models on their huge silos anyway, to the detriment of everyone.

I'm under the impression that, between outright copyfraud and missing copyright notices, whether any single image you find on the net is copyrighted at all, and to whom, is difficult to determine at the best of times.

Even a presumably crowdsourced and individually vetted dataset of non-copyrighted and copyright-expired images, uncomplicated by things like fair use or freedom of panorama, will still contain a non-trivial portion that is in fact copyrighted, which would, by your definition, make any model trained on that dataset less than clean and less than ethical to use.

Much further down this slippery slope, a strawman argument might be made that human artists are also liable to lose their ability to produce clean and ethical art as soon as they see or (gasp!) trace someone else's copyrighted picture for practice, sweat-of-the-brow doctrine notwithstanding.

Your concern is understandable, but society will have to work out a solution somehow.

And it's incredibly important to understand that copyright law has no relationship to morality whatsoever. Artists regularly have their art taken by employers, record labels, publishers, etc., and that's entirely in line with what copyright law is supposed to do. Many artists are calling for AI to be trained not on copyrighted works but "only on dead artists." They don't realize that copyright applies not just for the entirety of an artist's life but typically for 70 years after death, so their supposed moral alternative is itself illegal under copyright law, just like Disney wishes fair use were illegal.

There's nothing to be gained for anyone, artists or otherwise, from strengthening copyright or applying those standards to AI. The only change I think would be a good idea is that any model trained on public, non-consented data has to be publicly shared: DALL-E would have to share their model, and StabilityAI could not restrict access to 1.6+. This also matters because companies are trying to use universities to get around laws that might restrict training, and then taking the resulting work private rather than making anything more than the paper public.

Do you think creating your own art without the help of AI is acceptable, despite the fact that you've surely seen and been inspired by loads of copyrighted material? I think it's safe to assume your answer to that is yes. In which case, how does using a digital brain rather than a biological one make it morally any different?
