Over the past ten years, Artificial Intelligence (AI) and Machine Learning (ML) have steadily crept into the Art Industry. From Deepfakes to DALL·E, the impact of these new technologies can no longer be ignored, and many communities are now on the edge of a reckoning. On one side, the potential for modern AIs to generate and edit both images and videos is opening new job opportunities for millions; on the other, it is also threatening a sudden and disruptive change across many industries.

The purpose of this long article is to serve as an introduction to the complex topic of AI Art: from the technologies that are powering this revolution, to the ethical and legal issues they have unleashed. While this is still an ongoing conversation, I hope it will serve as a primer for anyone interested in better understanding these phenomena—especially journalists who are keen to learn more about the benefits, changes and challenges that AI will inevitably bring into our own lives. And since the potential of these technologies—and the best way to use them—are still being explored, there will likely be more questions and tentative suggestions, rather than definite answers.

In this article I will try to keep a positive outlook, as I feel it is important to show and inspire people on how to better harness this technology, rather than just demonising it. And while predicting the future is beyond the scope of this article, there will be plenty of examples of how new art practices and technologies have impacted art communities in the past.

Introduction

Back in 1998, I vividly remember my Music teacher venting his utter frustration at the class, once he found out that some kids had been using Magix Music Maker and eJay to create their own songs. I remember feeling confused, as I struggled to understand why a tool that enables people to create more music should be frowned upon by a Music teacher, instead of being welcomed. His job was literally teaching us about music and composition, so why be upset about those music production tools? Because not only did he suddenly feel redundant, but more importantly, he saw people with much less experience and musical ability than him create things he was unable to. If initially I was confused, after that realisation I felt sorry for him. Almost twenty-four years have passed since that day and no, Magix Music Maker and eJay have not destroyed the music industry. Quite the opposite: lowering the barriers to entry into music production has blessed us not just with more music, but with an explosion of new genres, styles and melodies. And ultimately, it allowed many more people to express themselves and tell their stories through their music.

While this is nothing more than a personal story, it encapsulates very well a common phenomenon that occurs cyclically in pretty much any industry that is—or thinks it is—on the brink of deprecation. In the past few years, more and more artists have been expressing their concerns about the sudden rise of AI tools, and what they are capable of.

Like every new technology, AI has the power both to disrupt existing ecosystems and to create new opportunities for millions of people. Artists who blindly dismiss the latter are perpetuating a form of gatekeeping, which ultimately hurts the very communities they are trying to protect.

The first part of this article will look at some of the most popular and discussed uses of AI when it comes to image editing and generation. The second part will look at their criticalities and address some of the most frequently raised concerns. Above all, the fact that the datasets used to train modern AI models include—among many other things—copyright-protected content which has been used without the consent of the respective artists. Finally, the third and last section will try to give a positive outlook on how to best harness and use these technologies.

🛠️ A list of other AI-powered tools

📰 Ad Break

Part 1: A Brief Timeline of AI Art

It is worth mentioning that technology has always played a critical part in the development of Art, in any of its forms. From the invention of colours through Chemistry to the discovery of fractals through Mathematics: Art, Culture and Technology are three dimensions that cannot be fully separated.

Computers are not an exception, and they have been used to assist artists since their very beginning, often revealing a beautiful complexity that would have otherwise escaped our eyes.

One such example is the famous BASIC one-liner, which truly reveals the hidden depths of programming languages:

10 PRINT CHR$ (205.5 + RND (1)); : GOTO 10
10 PRINT executed on a Commodore 64 (source)
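If you have never seen it run, here is a minimal Python sketch of what the one-liner does (my own re-creation, not the original BASIC): `CHR$(205.5 + RND(1))` picks one of the two PETSCII diagonal glyphs at random, and the loop prints them forever, weaving an endless maze.

```python
import random

# A rough Python equivalent of 10 PRINT: the BASIC code prints PETSCII
# character 205 or 206 (the two diagonals) at random, forever.
# Unicode box-drawing diagonals stand in for them here, and the loop
# is bounded so the script terminates.
for _ in range(2000):
    print(random.choice("╲╱"), end="")
print()
```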

The fields of Digital and Computational Arts are complex and fascinating, as complex and fascinating are the techniques and technologies they use.

It is easy to dismiss 10 PRINT—and all similar programs—as nothing more than ingenious sketches. And this is why it is important to remember that every new edgy technology loses that status as rapidly as it becomes part of our daily lives. We now take colour palettes for granted, often ignoring that it took the work of thousands of people over several millennia to ensure we could have the selection of pigments we have today.

When it comes to AI, this is no different. There are many AI-powered tools and technologies that artists use every day, which have been seamlessly integrated into their workflows. This is why the term AI Art is—like the term AI itself—somewhat misleading. Artificial Intelligence is, and always will be, an integral part of the work of every artist who relies on modern technologies to create their pieces. What changes is often what we are willing to consider “AI”, over simple craft or engineering.

However, the term AI Art is currently associated with a specific set of technologies that rely on Deep Neural Networks and Machine Learning to process images and videos. Let’s take a quick journey together to recap the ways in which AI has been impacting the Digital Arts in recent years.

Deep Dreams (2015)

One of the first modern examples of AI Art is, without any doubt, deep dreams. They became popular in 2015 thanks to an article titled Inceptionism: Going Deeper into Neural Networks. Their original purpose was to investigate how neural networks are able to detect patterns in an image. While the architecture of a neural network is designed by humans, a lot of its inner workings can be hard to decipher, as they are the result of an optimisation process known as training.

Examples of deep dreams (source)

What makes deep dreams so interesting is that they created a novel and unique art style which reveals a lot about how neural networks, often considered inscrutable “black boxes”, actually work. If you want to learn more about how deep dreams work, I would highly suggest the article Understanding Deep Dreams.
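To give a flavour of the underlying mechanism, here is a minimal sketch of the core idea, assuming PyTorch and a pretrained network from torchvision; real implementations add octaves, jitter and smoothing, but the principle is the same: nudge the image, via gradient ascent, so that the activations of a chosen layer grow stronger.

```python
import torch
import torchvision.models as models

# A minimal deep-dream-style loop (a sketch, not Google's original code):
# strengthen whatever patterns a chosen layer already "sees" in the image.
layers = models.vgg16(weights="IMAGENET1K_V1").features[:20].eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # start from noise
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    activations = layers(image)
    loss = -activations.norm()  # gradient *ascent* on the activations
    loss.backward()
    optimizer.step()
    image.data.clamp_(0, 1)     # keep the pixels in a valid range
```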

👥 The architecture of deep dreams

Neural Style Transfer (2015)

The publication of the first deep dreams allowed many researchers to work on new techniques that—thanks to neural networks—could treat images as more than just a mere collection of pixels. One such technique was described in A Neural Algorithm of Artistic Style, a 2015 paper that used Convolutional Neural Networks to redraw images in the style of a given painting. This, and all similar techniques able to “transfer” styles using neural networks, are now commonly referred to as neural style transfer.

The technique works by finding an image with large-scale features similar to the input, but with small-scale features similar to the ones from the style we want to copy. In doing so, the original authors of the paper expressed their interest in understanding how the human creative process works:

«In light of the striking similarities between performance-optimised artificial neural networks and biological vision, our work offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.»

A Neural Algorithm of Artistic Style (source)
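In practice, the paper frames this as an optimisation over two losses: a content loss on deep feature maps (the large-scale features) and a style loss on their Gram matrices, which capture correlations between feature channels (the small-scale texture). Below is a minimal sketch of those losses, assuming a helper `features(x)` that returns named feature maps from a pretrained VGG-19 and a batch size of one; the image itself is then optimised by gradient descent on this loss.

```python
import torch

def gram_matrix(feat):
    # Correlations between feature channels; a batch of one is assumed.
    b, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

def style_transfer_loss(features, image, content_img, style_img,
                        alpha=1.0, beta=1e4):
    f_img = features(image)
    f_content = features(content_img)
    f_style = features(style_img)
    # Match large-scale content on a deep layer...
    content_loss = torch.mean((f_img["conv4_2"] - f_content["conv4_2"]) ** 2)
    # ...and small-scale style (texture) on Gram matrices of several layers.
    style_loss = sum(
        torch.mean((gram_matrix(f_img[l]) - gram_matrix(f_style[l])) ** 2)
        for l in ["conv1_1", "conv2_1", "conv3_1", "conv4_1", "conv5_1"]
    )
    return alpha * content_loss + beta * style_loss
```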

Something interesting about the original Neural Style Transfer technique is that in order to “transfer” a style, it does not need to be trained with images of the same style. This means that it can operate even with paintings it has never been trained on or seen before. This is known as one-shot learning, and it is possible when AI models reach a certain degree of complexity. It is an important point to remember, because a lot of the backlash that AI Art is currently receiving is deeply connected with the fact that many models can replicate copyright-protected content because they are trained on copyright-protected content to begin with. This is not necessarily the case, as the issue will be expanded on later in the section dedicated to Copyright.

Deepfakes (2017)

The conversation around AI-powered photo editing took a dark turn in 2017, when the so-called deepfakes gained popularity on the Internet. While the term originally referred to a specific deep learning technique, it is now used to generally refer to any face-swap algorithm powered by deep learning and neural networks.

A deepfake of John Krasinski as Captain America (source)

In a nutshell, deepfakes are able to replace someone’s face in a video, preserving the original expressions and speech. The first photorealistic examples released were used to create adult videos of celebrities. This sparked very heated—and often disingenuous—discussions about the use of this technology. As a result, the term “deepfake” now appears to be forever tainted and is rarely used in any positive context.

Despite that, deepfakes—and the adjacent technologies—have incredible potential in the entertainment industry. For instance, they could be used to replace the expensive makeup and prosthetics used by actors and body doubles, or even to automatically dub movies in other languages.

And, in a rather controversial way, they could even be used to “digitally resurrect” deceased actors for posthumous cameos. The latter has already occurred several times in the movie industry—even without deepfakes—raising ethical and legal concerns.

If you are interested in learning more about deepfakes—how they work, how to create one, and when not to do it—I highly recommend the series An Introduction to Deepfakes.

👥 The architecture of Deepfakes

StyleGAN (2018)

Another technique that has become increasingly popular in the field of Deep Learning is Generative Adversarial Networks, or GANs. In this architecture, two neural networks are trained against each other: while one learns to generate images similar to the ones it was trained on, the other learns to detect which ones are original. When trained properly, GANs learn to create new images which are virtually indistinguishable from the ones they were trained on.
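A minimal sketch of the adversarial training loop may help clarify this tug-of-war. It assumes a generator `G` that maps random noise to (flattened) images and a discriminator `D` that outputs the probability an image is real; the tiny architectures here are toy placeholders, not a production design.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images):  # real_images: a (batch, 784) tensor
    batch = real_images.size(0)
    fake_images = G(torch.randn(batch, 64))

    # 1) Train the discriminator to tell real images from generated ones.
    opt_d.zero_grad()
    loss_d = bce(D(real_images), torch.ones(batch, 1)) + \
             bce(D(fake_images.detach()), torch.zeros(batch, 1))
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    opt_g.zero_grad()
    loss_g = bce(D(fake_images), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()
```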

One of the first uses to get media attention was StyleGAN, first proposed in a 2018 paper titled A Style-Based Generator Architecture for Generative Adversarial Networks. StyleGAN made the headlines also thanks to the brilliant This Person Does Not Exist, a website that generates a picture of a new person on every refresh. As the name suggests, all of those highly photorealistic images are generated by a neural network, and none of those people really exist.

The website has been so successful (and its performance so easy to replicate) that a number of similar websites spawned within just a few weeks, generating cats, horses, chemicals, houses, fursonas, waifus and even dickpics (which I am not linking as it truly is nightmare fuel).

Small variations in these architectures allow for fine control over individual features. For instance, it is possible to transfer a hairstyle or an ethnic background, and even to “blend” two people together, as shown in a 2019 paper titled Image2StyleGAN:

Image2StyleGAN (source)
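Conceptually, these edits are simple operations in the generator's latent space. A sketch, assuming a StyleGAN-like generator `G` that takes one latent vector per style layer (the interface is hypothetical, for illustration only):

```python
def blend(G, w_a, w_b, alpha=0.5):
    # Interpolating the latents of two people yields a plausible "blend".
    return G([(1 - alpha) * wa + alpha * wb for wa, wb in zip(w_a, w_b)])

def style_mix(G, w_a, w_b, crossover=8):
    # Coarse layers (pose, face shape) come from person A,
    # fine layers (hair and skin texture) from person B.
    return G(w_a[:crossover] + w_b[crossover:])
```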

Traditional video and image editing tools can only treat images as collections of pixels. Neural networks, on the other hand, have the capacity to learn hierarchical, semantic features, making them aware of what an image contains in a way that traditional tools cannot be. In a nutshell, AI tools understand what is inside an image, and can be used to match and edit features, rather than mere pixels.

Diffusion GAN used to match faces (source)

Similar techniques can also be applied to subjects other than portraits. CycleGAN, for instance, showed how to perform what its authors called image-to-image translation in order to swap specific aspects of an image, such as changing zebras into horses, or a landscape from summer to winter.

Image-to-Image Translation using CycleGAN (source)

👤 The architecture of StyleGAN

Text-to-Image (2021)

The single piece of technology most readers are probably interested in is the so-called text-to-image (sometimes text2image): the possibility of generating images from a short description, known as a prompt. At the time of writing, a variety of different products are available to the general public, the most popular being DALL·E 2, Midjourney and Stable Diffusion, among a few others, such as Craiyon (formerly DALL-E mini).

Each one of those products works in a different way, but they all serve the same purpose: conjuring images in seconds.

It is not surprising that the arrival of this new technology has effectively divided its audience. On one side, many are ecstatic about the new possibilities these tools offer. On the other, many artists have expressed deep concerns about how this technology might negatively impact their ability to find a job.

The problem goes even deeper, as the AI models that power those tools are all trained on large bodies of images that have been scraped from the Internet. While all of the data used was publicly available on the internet, it was not all in the public domain. AI models were, in fact, also trained on material that was protected by copyright, raising several ethical and legal challenges. This is a very complex topic that will be expanded on later in the article.

Some people have also raised concerns about the possibility of art becoming little more than writing prompts into a text box. To be honest, that is a rather naïve take on this technology. Given the current direction, it is likely that AI will progressively become more relevant in the workflow of most artists. But that is nothing new, as technology always has influenced, and always will influence, how art is created.

🖼️ DALL·E inpainting and outpainting

Virtually all of the text-to-image tools currently available are based on one of two technologies: transformers or diffusion models. The former powers DALL·E, while the latter powers DALL·E 2, Midjourney and Stable Diffusion.
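While the mathematical details vary between implementations, the intuition behind diffusion models can be caricatured in a few lines: start from pure noise and repeatedly remove a little of it, guided by a network trained to predict the noise present in an image. The sketch below is deliberately simplified (the real update rule involves a carefully derived noise schedule), and `denoiser` is an assumed, already-trained model:

```python
import torch

def sample(denoiser, prompt_embedding, steps=50, shape=(1, 3, 64, 64)):
    x = torch.randn(shape)                        # start from pure noise
    for t in reversed(range(steps)):
        predicted_noise = denoiser(x, t, prompt_embedding)
        x = x - predicted_noise / steps           # remove a little noise
        if t > 0:
            x = x + 0.01 * torch.randn(shape)     # keep some stochasticity
    return x                                      # a (hopefully) clean image
```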

📝 The technologies behind DALL·E

🎯 The technologies behind Stable Diffusion

ChatGPT (2022)

One of the biggest revolutions of the past few years, however, is neither DALL·E 2 nor Midjourney: it is ChatGPT. The GPT in the name stands for Generative Pre-trained Transformer, a reference to the type of architecture and technique behind this application. ChatGPT is trained on large bodies of text across different subjects, and has proven incredibly capable of understanding human language, performing exceptionally well across a variety of different tasks. ChatGPT is effectively passing the Turing test, meaning that it is (often) virtually indistinguishable from a human.

🔎 The Turing Test

ChatGPT can understand complex sequences, regardless of their context. This means it has been effective in creating not just text, but also music and code. Although most people think of images when referring to AI Art, ChatGPT presents the same ethical challenges and issues that image-based models have. For this reason, it deserves a mention in this article.

Even more than text-to-image techniques, ChatGPT (and its competitors) will likely change the way most of us work. This is because they can offer human-friendly interfaces to processes that would otherwise have required expert knowledge.
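For programmers, that human-friendly interface is just an API call away. A hedged example using OpenAI's Python client (the model name is an assumption and may well have changed since writing):

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute any chat model
    messages=[
        {"role": "system", "content": "You are a helpful writing assistant."},
        {"role": "user", "content": "Summarise the history of AI Art in two sentences."},
    ],
)
print(response.choices[0].message.content)
```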

AI tools will soon revolutionise more than just art (source)

Sam Altman, CEO of OpenAI, thinks tools like ChatGPT will have a big impact on the way we work:

«It’s an evolving world. We’ll all adapt, and I think be better off for it. And we won’t want to go back.»

Sam Altman, CEO of OpenAI on ChatGPT (source)

If you are interested in learning more about ChatGPT, I would highly suggest What Is ChatGPT Doing … and Why Does It Work?. A more opinionated piece on its perception of correctness is ChatGPT is a blurry jpeg of the web. If, instead, you would like some general guidelines on how to make the most of ChatGPT while reducing any possible risks, you can find a talk just about that on my channel.

📰 Ad Break

Part 2: Criticisms & Criticalities

In this section, we will go through some of the most common criticisms, concerns and criticalities surrounding modern AI models.

  • 2.1: Is AI-generated content art?
  • 2.2: Are we on the edge of a content inflation?
  • 2.3: Is AI going to make my job useless?
  • 2.4: What about the copyright issues?
  • 2.5: Consent and AI-generated content
  • 2.6: Implicit Bias
  • 2.7: The Human cost of AI

2.1: Is AI-generated content art?

Let’s tackle the first—and most vapid—criticism that AI art often receives: it is not real art. First of all, we need to properly define what pieces of work are to be considered AI art. The term is probably most often associated with images generated from prompts using AI models such as DALL·E 2, Stable Diffusion and Midjourney. However, many pieces that have a strong connection with the Computational Arts (such as fractals and procedural content) could easily fall into the category of AI art.

As a game developer, I find the idea that AI art is not art incredibly weak. It took years and years of discussions to finally see games recognised as a valid artistic medium. Not all games are art pieces, of course, but making games is a valid medium for an artist to express their creativity. And, in some cases, games can indeed be pieces of art on their own (even ignoring the fact that they contain many traditional pieces of art, such as digital paintings, 3D sculptures, musical compositions, etc.).

Claiming that AI-generated art is not art ultimately requires an objective definition of what can and cannot be considered art. Such a question keeps resurfacing every generation, often raised by people who are unfamiliar with either Art history or Computational Art.

For centuries, people have been trying to impose strict constraints on what a piece of art must have to be worthy of that title. And for centuries, artists have defied and subverted those expectations. Modernism and Surrealism, for instance, would be quickly dismissed by someone who had only been exposed to Renaissance art. And yet, their value is not in how good the art pieces look, or how long it took to make them. The reality is that the very same arguments used to prove that AI Art is not art could very well be twisted back to claim that digital artists are not real artists.

Let’s look at three relatively modern examples of art pieces that were, at the time of their release, discredited for one reason or another:

  • “Fountain”: a porcelain urinal that artist Marcel Duchamp proposed as part of an exhibition in 1917;
  • “Théâtre D’opéra Spatial”: an AI-generated piece by Jason Allen, which recently won an art competition;
  • and “TRON”: a 1982 movie that was disqualified from the Best Visual Effects Academy Award because it used digital effects.

🚽 Fountain by Marcel Duchamp

👽 Théâtre D’opéra Spatial by Jason Allen

🏍️ TRON

2.2: Are we on the edge of a content inflation?

In the past few months, the Internet has been literally flooded with AI art. This became possible when tools like Midjourney and Stable Diffusion were made available to the public. As with any new trend, it is not unexpected to see such a sudden interest from a generalist audience.

Dr Kate Compton coined a term for this: the Bach faucet. It is a reference to a 2010 article about composer David Cope (source), who was able to procedurally generate thousands of chorales in the style of Bach.

«A Bach Faucet is a situation where a generative system makes an endless supply of some content at or above the quality of some culturally-valued original, but the endless supply of it makes it no longer rare, and thus less valuable.»

Dr Kate Compton (source)

The idea is simple: the value of something often correlates with its rarity. At the exact moment you can generate (fauceting?) an endless stream of Bach-like chorales, they instantly become almost worthless. As a game developer, I am very familiar with this concept, since many games feature procedurally generated content of some kind. No Man’s Sky, for instance, had a difficult launch in 2016 partly due to the severe lack of variation among its 18.4 quintillion explorable planets. That is a lot of planets, but they all felt and looked a bit the same.

No Man’s Sky

This in no way means that Procedural Content Generation is bad: just that it is not an easy way out for lazy developers to create infinite content. Being able to create good levels procedurally requires the competencies of both a level designer and a programmer. And when done properly, it greatly enhances the playability (and replayability!) of a game.
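A toy example makes the "faucet" problem tangible: a seeded generator can spit out an endless supply of planets, but if the recipe is shallow, they all start to feel the same. (A deliberately naïve sketch:)

```python
import random

COLOURS = ["red", "ochre", "teal", "grey"]
TERRAINS = ["rocky", "icy", "lush", "barren"]

def planet(seed: int) -> str:
    # Every seed yields a "new" planet, but the recipe only has
    # 4 x 4 = 16 genuinely different outcomes: an endless faucet
    # of content with very little actual variety.
    rng = random.Random(seed)
    return f"{rng.choice(COLOURS)} {rng.choice(TERRAINS)} planet"

print([planet(s) for s in range(5)])
```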

A similar trend has happened already—and continues to this day—in the Game Industry as a whole. The past decade has been blessed with an explosion of games; not just in terms of sheer number, but also in quality and diversity. This sudden change was in part due to the fact that game engines like Flash (RIP), Unity and Unreal lowered the entry barrier to making games. This is a phenomenon that Unity itself referred to as the “democratisation of game development“. I am a strong believer this has been an incredibly positive change in the industry, as it allowed more and more people who did not have a Computer Science degree to make games and get one foot in the door of Game Development.

However, the impact has not always been entirely positive. Because more people were able to make games using Unity, the number of objectively bad Unity games has skyrocketed in recent years. This is so out of control that “it’s made in Unity” has become a point of contempt for some gamers, who incorrectly attribute the inexperience of the developers to a fault of the engine. In this specific context, the problem has also been made worse by the fact that Unity requires a license to remove its splash screen; as a result, you typically see the Unity logo on zero-budget games, further strengthening gamers’ mistrust. With so many titles, the market has quickly become saturated. If back in 2010 a game with a cool idea was all it took, a successful launch now requires careful planning and a significant investment in marketing and advertisement—not to guarantee its success—but to avoid its failure.

Exactly the same thing is taking place right now, between the AI and Art communities. Tools like DALL·E, Midjourney and Stable Diffusion are giant Bach faucets for things like concept art and stock images. Right now, they are novel enough that they can threaten a significant number of jobs. But in the same way people without any art-related background are able to access these new technologies, so are the people who have studied art and have worked in the industry for years.

As the novelty fades away, I imagine we will see fewer and fewer random AI posts. On the other hand, it is easy to imagine how AI-powered tools will play a bigger role for artists across all fields.

🛠️ AI as a tool for artists

2.3: Is AI going to make my job useless?

I vividly remember the speech I was given during induction week at my high school: if you want to work in the field of computer science, you will never stop learning. At the time, that sounded rather scary, but I now see it with different eyes. As the industry keeps growing and advancing, people working with computers are forced to keep up with new practices, workflows, tools and technologies. It is a small price we pay to live in an era blessed with progress. And as a game developer myself, I find it hard to ignore the immense progress games—just to name one industry—have made in the past 30 years alone.

On the other hand, it is understandable that such a pace can indeed feel tiresome. This is especially true for people who have invested time and money to train in a specific field, only to see most of their know-how wiped out over the course of a few years or months. The same happened to me—and many of my colleagues—when Flash died, leaving so many talented people not just without their preferred tool, but without their job.

A similar change is about to happen right now, due to the disruptive innovation that AI-generated content is bringing to the table. Saying that nobody is going to lose their job would simply be false; but so is claiming that AI is “stealing” people’s jobs. One of the biggest misconceptions I have heard about AI is that it removes humans from the act of making; that could not be further from the truth. According to DataProt (source):

  • 37% of businesses and organizations employ AI as of 2019
  • The rise of AI will eliminate 85 million jobs and create 97 million new ones by 2025

It is easy to see how AI, ML and adjacent fields have created millions of new jobs across all industries, in the same way the Internet did over the past two decades. Moreover, lowering the entry barrier to certain jobs will enable more and more people who did not have the chance to invest in formal education to access higher-paid positions.

At the same time, it is true that some people will be replaced. The artists who will suffer the most will be the ones who fail to recognise this new paradigm shift. But it would be disingenuous to put all the blame onto the artists themselves. Many will, as a matter of fact, lose their jobs simply because their employers have found a cheaper way to get similar value using AI tools. And it does not really matter if those tools are not as good as their human counterparts: they do not need to get 100% of the job done to be cheaper.

«AI will not replace you. A person using AI will.»

Santiago Valdarrama (source)

Can AI replace artists? Of course it can! But that is not the right question to ask. The challenge that AI brings is not about being replaced, but about keeping up with the changes. How will you, as an artist, embrace those new tools, and how can you use them to express your creativity and vision in a way that someone without your talent, know-how and experience cannot? If all you can offer as an artist is simply crafting, then it is likely that part of your job could easily be replaced by a machine. But if what you bring has a deeper artistic value that escapes the constrained boundaries of a skilled craft, then your job is and will always be safe, as long as you are willing to reinvent yourself and push the boundaries of any new technology.

📷 How Photography shaped Art…

2.4: What about the copyright issues?

One of the most discussed controversies about AI art revolves around copyright. There are three important questions that need to be answered:

  • Who owns AI-generated images?
  • Is training AIs on copyrighted images copyright infringement?
  • Can AI art breach copyright if it recreates something protected by copyright?

Together, they also raise another question:

  • Who is responsible for the content that an AI generates?

Before these questions can be addressed, it is important to remember that copyright is a complex and multifaceted issue, and that the legislation surrounding it is continuously being updated in response to new technologies. To make things even more complicated, different countries might be subject to different legislation. Given the current climate surrounding AI art, it would not be unexpected to see some real changes happening soon. Because of this, some of the information in this article might not necessarily be up to date.

Who owns AI-generated images?

On the surface, answering this question appears to be fairly straightforward as ownership is a legal issue, and needs to be resolved within a legal framework.

For instance, if you are the creator of your very own content generator in the UK, the Copyright, Designs and Patents Act 1988 explicitly states that you also own whatever the machine creates:

In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.

Copyright, Designs and Patents Act 1988, Section 9 (source)

Developers working in the field of procedural generation will definitely be relieved to hear that most countries are aligned with this policy. This means that you typically own the rights to anything that is created by a system that you created.

The conversation becomes cloudy when we start discussing modern AI systems. Who owns what they create? Is it the institution that designed, implemented and trained those systems, or the people who used them? In other words, who is the “person by whom the arrangements necessary for the creation of the work are undertaken“? This is an important question for artists, who often create their art with the aid of software such as Premiere, Photoshop and Unity. Such a question is almost always resolved in the Terms of Use of each individual piece of software.

🐒 Who can legally hold copyright of something?

DALL·E 2's Terms of Use (as of December 2022), for instance, clearly state that users retain ownership of both the prompts and the generated content:

You may provide input to the Services (“Input”), and receive output generated and returned by the Services based on the Input (“Output”). Input and Output are collectively “Content.” As between the parties and to the extent permitted by applicable law, you own all Input, and subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. OpenAI may use Content as necessary to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms.

DALL·E 2 Terms of Use (December 2022) (source)

Such wording seems very progressive, but it comes with a catch. By handing over ownership, OpenAI also makes it the user's responsibility to ensure the content does not violate any laws. And if it does, it is up to the user to take the necessary actions.

Midjourney's Terms of Service (as of August 2022) also express a very similar concept, stating that “Subject to the above license, you own all Assets you create with the Services“. However, they also state that by using Midjourney you grant the company an irrevocable license to reproduce, change and sublicense anything you make with it.

The bottom line is that each software will have its own terms of use, which you should read carefully before using. If you are interested in learning more about the legal aspects of copyright in the context of AI, I would highly suggest reading AI generated art – who owns the rights?.

The problem lies in the fact that these AI models have necessarily been trained on images that are protected by copyright, without the direct consent of their respective authors. Whether or not this makes their usage and commercialisation legally problematic is something that is currently being debated.

Is training AIs on copyrighted images copyright infringement?

Out of the many arguments people are using against the rise of AI art, this is possibly the most solid and concerning. Can AI be trained on copyrighted art?

To answer this question, we first need to understand how modern AIs are trained. In order to learn how to create images from text, every AI model needs to be trained on millions of labelled images. DALL·E 2, for instance, was trained on over 650 million labelled images from a private dataset (source). Stable Diffusion, instead, was trained on 5.85 billion labelled images from LAION-5B, a publicly available dataset. The same applies to text generators such as ChatGPT: the model it is based on, GPT-3, was trained using 570 gigabytes of text data.

The LAION dataset (source)

The sheer size of those datasets indicates the volume of information that those models need to crunch in order to get such good results. The reality is that pretty much all datasets of this scale are created by scraping the internet. In a recent interview with Forbes, David Holz (founder of Midjourney) confirmed that even the dataset used for his AI model was scraped from the internet:

How was the [Midjourney] dataset built?
«It’s just a big scrape of the Internet. We use the open data sets that are published and train across those. And I’d say that’s something that 100% of people do. We weren’t picky. The science is really evolving quickly in terms of how much data you really need, versus the quality of the model. It’s going to take a few years to really figure things out, and by that time, you may have models that you train with almost nothing. No one really knows what they can do.»

David Holz, Founder & CEO of Midjourney (source)

This has obviously raised ethical and legal concerns, as a significant part of that data is protected by copyright. In that same interview, David Holz acknowledges the problem, but he also confirms that there are no tools in place right now to get around it:

Did you seek consent from living artists or work still under copyright?
«There isn’t really a way to get a hundred million images and know where they’re coming from. It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry. There’s no way to find a picture on the Internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.»

David Holz, Founder & CEO of Midjourney (source)

So the question remains: is it legal to scrape data that is protected by copyright? It is unfortunately impossible to give a definite answer, since different countries have different laws and guidelines. On top of that, this is a fairly new problem and legislators are not typically known for their celerity. The best we can do is to highlight the current policies regarding data scraping and data mining around the globe.

🇺🇸 Data scraping in the US

🇬🇧 Data scraping in the UK

🇪🇺 Data scraping in the EU

The US, UK and EU currently have fairly permissive regulations, which mostly allow researchers to safely scrape data without any real repercussion. As disappointing as it is, there is currently no standard way to check whether a piece of content that is publicly available on the internet is protected by copyright, nor to verify who it belongs to. Asking for permission from every single author would simply strangle not just AI Art, but a large branch of Deep Learning research in its entirety. And it is very important to note that doing so might not necessarily work in favour of all artists and creators. Digital Artists working in the fields of Procedural Content Generation and Computational Creativity are artists themselves, and there is a serious risk that enforcing protections against AI models and the companies behind them will, overall, have a negative impact on both their artistic output and wider innovation. This was one of the reasons driving the recent government consultation in the UK surrounding AI and Intellectual Property:

It is unclear whether removing [protections for Content-Generated Works] would either promote or discourage innovation and the use of AI for the public good.

Artificial Intelligence and Intellectual Property: copyright and patents: Government response to consultation (source)

The reason why governments are taking this seriously is that AI Art is just one of the many applications of those large models. They could, in fact, power an entirely new generation of tools: from self-driving cars to space exploration, from automatic medical diagnoses to drug development, and from CCTV surveillance to online fraud detection. It is not difficult to understand why different countries are deeply interested in seeing those tools being not just developed, but actually deployed. And why the first ones to properly integrate AI into their procedures will have a massive advantage over the others.

However, the problem of sourcing data ethically remains one of the biggest challenges that AI models are facing right now. On some platforms, the #noAI tag is becoming more and more popular, in the hope that future scrapes will exclude such content from their datasets. But without a uniform and agreed solution, this is unlikely to have any real effect, and likely has no real legal basis.

One important thing to remember is that any sufficiently good AI model will necessarily need to be trained on copyrighted data. There are two solid reasons why: cultural awareness and diversity.

The first is about cultural context: we have long realised that any machine sufficiently good at passing the Turing test will necessarily need a solid understanding of common sense and cultural context. Art, celebrities, TV shows, comic books, and even memes are more than mere by-products of our society. They are the cultural pillars upon which the next generation is built. They heavily influence our language, and have a deep impact on us at a developmental, psychological and sociological level.

The second reason why AI models likely benefit from being trained on scraped data is that any method that requires permission is likely to severely reduce not just the amount of data that can be accessed, but also its diversity. Realistically, only privileged artists would be able to opt in, de facto excluding other demographics. This might introduce severe biases, such as US models only being able to generate and recognise white people, or being oblivious to the art movements, styles and techniques of different regions and cultures. Since those AI tools will inevitably be integrated into decision-making tools, this will likely cause them to further reinforce biases and stereotypes.

Wherever you stand on this, it is important to acknowledge that the issue of copyright in datasets reaches far deeper than just AI Art. Any future decision or legislation will have severe repercussions not just on the Art communities who demanded it, but on wider society as well.

If this leads to an answer you are not happy with, there is little you can do—as an individual—to change that. But a collective discussion can be powerful enough to demand an overhaul of the legal system surrounding copyright and ownership, which was designed before the advent of AI and ML. I would not be surprised if those pieces of legislation were heavily reviewed in the near future, in order to find a reasonable compromise between innovation and copyright protection. But it is also important to see that there is no magic fix for this.

👨‍⚖️ The Stable Diffusion Litigation

Can AI art breach copyright if it recreates something protected by copyright?

Let’s start by making something clear: AI models can totally reproduce artworks that are protected by copyright. Especially when they are explicitly asked to do so! Typing “the mona lisa” as a prompt will likely recreate variants of the Mona Lisa so plausible that the untrained eye might not be able to distinguish them from the original. A recent study titled Extracting Training Data from Diffusion Models investigated exactly this issue, providing several concrete examples (below).

An example of how Stable Diffusion can reproduce a picture from its dataset (source)

🖼️ What does it mean to copy something?

To understand if this is a problem, we need to better define what can and what cannot be protected by copyright. It is important to note that there is no copyright protection for an idea, a concept, a style or a technique. Copyright law does not protect facts, procedures, methods of operation, ideas, concepts, systems, or discoveries. There are also several exceptions to copyright that allow for things such as satire, parody or pastiche, all of which fall under fair use.

This means that a reasonably faithful reproduction of an existing piece of artwork used for commercial purposes will most likely be considered copyright infringement, while creating work that mimics an artist’s style and compositions might not. But since a style per se cannot be copyrighted, it would be up to a human judge to decide if and when a piece is too derivative, and when it is novel enough.

The concept of ownership also differs significantly from the concepts of copyright and trademark. You can ask DALL·E 2 to create an image of Mickey Mouse, which you may own. But that does not give you the right to make a profit from it; in the same way, an original drawing of Mickey Mouse does not give you copyright over the character. In both cases, you own the content (due to fair use), but do not have the right to distribute it or profit from it.

The current regulations are fairly permissive when it comes to data mining, model training and AI-generated content. However, this is not something that was done intentionally; those regulations were devised before the advent of AI models. As a result, there is justifiable anxiety about the future of AI content, whether legislators will act on this, and most importantly how this will affect the current digital ecosystem.

Several companies and platforms have reacted in different ways to the advent of AI Art, as reported by The Verge:

Art platform ArtStation is removing images protesting AI-generated art from its homepage, claiming that the content violates its Terms of Service. […]

Getty Images has banned the upload and sale of illustrations generated using AI art tools over legal and copyright concerns, while rival stock image database Shutterstock has embraced the technology, announcing a “Contributor Fund” that will reimburse creators when the company sells work to train text-to-image AI models like DALL-E.

In cases where companies have expressed support for the sudden popularity of generative art, many allude to its potential as another tool to be utilized by artists, rather than a means to replace them.

The Verge on AI Art (source)

✍️ The issue of synthetic signatures and watermarks

Who is responsible for the content that an AI generates?

The concept of ownership is deeply connected with the concept of responsibility. Who is responsible for the content that an AI generates? Similar questions have been raised and discussed for decades: if a self-driving car kills a pedestrian, who is responsible? As you can imagine, the answer is complicated.

Most AI companies are giving their users full ownership over their prompts and the content they generate. This is most likely an attempt to clear themselves of any responsibility. With traditional software like Photoshop, the user is fully responsible for what they create. Photoshop does not create—on its own—violent, sexual or illegal content: it is the user who has to do that manually and with explicit intent. AI models work rather differently, and it is not at all easy to predict what they will create given a specific input. This means that they can sometimes output problematic content even when that was not the user’s intention.

👹 The story of Loab…

🩸 AI and Victimless crimes

A related, yet different issue is represented by deepfakes. AI models can—and indeed already have been—used to generate pornographic images depicting people without their direct consent. These are far from being victimless crimes, as they can directly affect people’s lives, work and reputation.

2.5: Consent and AI-generated content

The very meaning of consent is a social construct that has evolved over time, in response to societal needs and through legal changes. When it comes to AI, there are two main, distinct issues:

  • Some artists never gave consent for their work to be used in AI models
  • Technologies like Deepfakes can impersonate people without their consent

Let’s see them in more detail.

Some artists never gave consent for their work to be used in AI models

This issue is deeply connected to the problem of copyright, which was discussed in the previous section. Even if we assume that that topic is resolved, the problem of consent still remains. In order for someone to be able to consent, they need to be aware of the implications it entails.

Most artists who posted their work online were never aware it would be used to train AI models, with the result that their very own style can now be easily replicated. Some may argue that, had they known such a possibility existed, they would not have released their work on the Internet in the first place.

Many communities are adding hashtags—such as #noAI on ArtStation—to explicitly state the intention of opting out of AI models. This is a positive move, but it comes with the risk of treating the lack of the #noAI hashtag as an implicit opt-in, including for the pieces published before the hashtag became available. ArtStation itself recently released a statement indicating that, under its Terms of Service, it does not agree with its content being used to train generative AI models.

Such a requirement is fairly difficult to implement, as the content uploaded on ArtStation often finds its way to other websites (for instance, through thumbnails on social media websites). But it is a step in the right direction, as platforms and artists have the right to decide how to license their work, and what people are allowed to do with it.

Another major issue is that researchers rarely scrape images themselves, as they rely on large datasets collected by external companies. The problem of filtering which images are allowed is incredibly difficult, as there is no easy way to collect such a large amount of content and verify tags at the same time across different platforms and languages. Some companies, however, are already looking into more ethical ways to source their datasets. LAION, for instance, licenses one of the largest datasets of images scraped from the Internet, and recently supported the creation of Have I Been Trained. The website allows artists to search for their content in the dataset, and to opt out if they recognise their work. Right now, there is no way to verify whether opt-out requests are legitimate. Likewise, there is no easy way for an artist to opt out all of their pieces, and this can quickly become a burden for the most prolific creators. A burden that has no real guarantee of being fruitful, as LAION is not the only dataset available to AI researchers.

It is also important to mention that an AI model does not need to be trained on a specific painting to be able to recreate its style. Artists unfamiliar with neural networks might wrongly assume that their art style is “safe” as long as their work is not used in any AI model.

While modern AI models were certainly trained on famous paintings and have a solid understanding of their styles, there are techniques which can simply replicate a work of art without it ever being included in the original training data. Back in 2015, Neural Style Transfer was already able to do this, as it could “transfer” the style of any given painting onto any other picture.

That is known as one-shot learning, and it is possible when an AI model has such a good understanding of the world that it can handle novel inputs (usually as combinations of known ones). In the case of the original Neural Style Transfer, one-shot learning was achieved by relying on an existing model for image recognition, known as VGG-19, which was trained on ImageNet, a large dataset of images.

The risk of having your style copied by others has always existed for artists, and always will. However, it would be disingenuous to ignore the fact that AI models are making this so easy that it will have an impact on many artists.

Technologies like Deepfakes can impersonate people without their consent

Diffusion models are concerning artists in much the same way deepfakes are concerning actors. As mentioned before, deepfakes have the capacity to clone someone’s likeness, allowing the creation of photorealistic images and videos without a person’s direct consent.

There is a clear difference, legally speaking, between using someone’s likeness and copying their art style. Copyright can hardly protect you from someone working in your own style, but most legislations prevent your image from being used without your consent. This is linked to the right of publicity.

California Civil Code, Section 3344, provides that it is unlawful, for the purpose of advertising or selling, to knowingly use another’s name, voice, signature, photograph, or likeness without that person’s prior consent.

(source)

This means that deepfakes can be considered unlawful when they are made without someone’s consent. In case you are interested, The Ethics of Deepfakes talks at length about how to create deepfakes, as well as when not to create them.

Celebrities are particularly vulnerable to deepfakes for a couple of reasons. First of all, they are public figures, which unfortunately means many might be keen to see them in compromising situations. Secondly, there are many videos and images featuring them, which provide plenty of content to train AI models, even on an average laptop.

⚰️ Digital resurrection

It does not help that deepfake technologies are often showcased using images of unwilling celebrities and political figures. After writing A Practical Tutorial for FakeApp, I have personally received dozens of requests to create such content myself. Some even came from well-known journals, which were interested in creating deepfakes of celebrities and politicians for shock value.

Researchers advancing deepfake-adjacent technologies also share a big responsibility, not just for what they create, but also for how they disseminate their findings. Many have tested their tools with videos of US Presidents Obama and Trump, pouring gasoline on an already complex subject for the sake of visibility.

Other researchers, like the ones working on VALL-E (a deepfake tool specifically developed to clone voices), showed a general disregard for the more nuanced ethical issues in their work:

Since VALL-E could synthesize speech that maintains speaker identity, it may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker. We conducted the experiments under the assumption that the user agree to be the target speaker in speech synthesis. When the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice and a synthesized speech detection model.

VALL-E Ethics Statement (source)

2.6: Implicit Bias

AI models can be incredibly good at making predictions and generating content. This often leads them to be used unsupervised, under the false impression that they are fair because machines are objective. That is absolutely false, and the problem of identifying and correcting for biases in AI models is one of the most active areas of research in the field of Machine Learning.

Yes, it is true that machines are “objective” in the sense that they always operate under a predefined set of rules. But no, this does not automatically imply that those rules are fair: consequently, it does not automatically imply that their decisions are fair, or even correct. This is because AI models are made by humans, trained by humans and used by humans to judge other humans.

One of the most common causes of bias in AI models is having a disproportionate amount of data for different categories. If your training data contains significantly more pictures of cats than pictures of dogs, the model will likely underperform on the latter. This can sometimes produce comical—if not grotesque—results, such as AI art failing to correctly represent hands. Another such case can be seen below: the model was asked to generate a “salmon in the river”, but had likely been trained on too few images of living salmon in their natural habitat.
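The effect is easy to reproduce on toy data. In the hedged sketch below (scikit-learn, synthetic data), a classifier trained on a 95/5 split looks accurate overall while underperforming on the rare class:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: 95% of samples in class 0 ("cats"), 5% in class 1 ("dogs").
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("overall accuracy:", model.score(X_te, y_te))
print("accuracy on the rare class:", model.score(X_te[y_te == 1], y_te[y_te == 1]))
```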

There is another, more insidious source of bias. AI models that are trained on large datasets usually inherit all the biases those datasets contain, both the explicit and the implicit ones. There is really no way around this, which is why it is important to raise awareness about the fact that machines are not necessarily objective, or correct.

A few years ago, a technique known as word embedding made it possible to encode words as vectors (i.e. lists of numbers), in a way that is compatible with modern neural networks. A famous paper from 2019 (Analogies Explained: Towards Understanding Word Embeddings) showed how an entire algebra can be constructed with those vectors, allowing for interesting operations such as king−man+woman=queen, or paris−france+poland=warsaw. The same technique was also able to reveal some implicit biases in the way we use language, for instance: doctor−man+woman=nurse. The model has learned to reproduce a gender bias because such bias was present in the text it was trained on. It is important to realise that there is no inherent problem with the model: quite the opposite, it correctly learned the subtle nuances of gender bias just from reading what appear to be fairly neutral books. The problem is that gender bias is embedded in our society at every level, across every medium; as a result, a sufficiently good model will inevitably capture and replicate it.
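These analogies are easy to try at home with pretrained embeddings. A hedged example using gensim and publicly available GloVe vectors (the exact nearest neighbours depend on the corpus the vectors were trained on):

```python
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # pretrained GloVe vectors

# king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# paris - france + poland ≈ warsaw
print(vectors.most_similar(positive=["paris", "poland"], negative=["france"], topn=1))

# The same arithmetic surfaces learned biases:
# doctor - man + woman ≈ nurse (for these particular vectors)
print(vectors.most_similar(positive=["doctor", "woman"], negative=["man"], topn=1))
```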

Failing to understand this simple fact will result in Machine Learning becoming an oppressive tool that reinforces the status quo, under the false pretense of objectiveness. A very popular example of this is COMPAS, a Machine Learning tool used by U.S. courts to assess the likelihood of a defendant becoming a recidivist. Having been trained on previous rulings, the system exhibited strong racial bias against black people.

Most companies working on AI models have disclaimers that warn people about the limitations and biases of their systems.

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases. Stable Diffusion v2 was primarily trained on subsets of LAION-2B(en), which consists of images that are limited to English descriptions. Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for. This affects the overall output of the model, as white and western cultures are often set as the default. Further, the ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts. Stable Diffusion v2 mirrors and exacerbates biases to such a degree that viewer discretion must be advised irrespective of the input or its intent.

Bias statement for Stable Diffusion (source)

🧪 Sources of Bias

On the other hand, virtually all AI companies are working on ways to tackle biases and to filter what their models can generate. Most AI models come with strict guidelines on what content people should or should not create. Midjourney, for instance, forbids adult content and gore (source), while OpenAI has more precise guidelines on what DALL·E 2 and ChatGPT are not allowed to generate (source):

  • Hate: hateful symbols, negative stereotypes, comparing certain groups to animals/objects, or otherwise expressing or promoting hate based on identity.
  • Harassment: mocking, threatening, or bullying an individual.
  • Violence: violent acts and the suffering or humiliation of others.
  • Self-harm: suicide, cutting, eating disorders, and other attempts at harming oneself.
  • Sexual: nudity, sexual acts, sexual services, or content otherwise meant to arouse sexual excitement.
  • Shocking: bodily fluids, obscene gestures, or other profane subjects that may shock or disgust.
  • Illegal activity: drug use, theft, vandalism, and other illegal activities.
  • Deception: major conspiracies or events related to major ongoing geopolitical events.
  • Political: politicians, ballot-boxes, protests, or other content that may be used to influence the political process or to campaign.
  • Public and personal health: the treatment, prevention, diagnosis, or transmission of diseases, or people experiencing health ailments.
  • Spam: unsolicited bulk content.

At first glance, such constraints appear fair and objective, guided by the clear intent of seeing these technologies “used for good”. For instance, both deception and spam are clearly done with malicious intent, while using ChatGPT for medical diagnosis without any expert supervision can put people at serious risk.

But it does not take long to realise that those guidelines are incredibly problematic and strongly dependent on a single perspective. What does it mean for something to be considered violent? Generating images of “roasted chicken” is allowed, while “roasted dogs” would be flagged as animal abuse; yet, both effectively depict animal corpses. While intentionally facetious, this example clearly highlights a biased asymmetry, ultimately caused by the fact that it is virtually impossible to objectively decide what content is “violent” or “shocking” outside of a cultural context.

Defining what content is “sexual” or “meant to arouse sexual excitement” is even more problematic, and reveals a lot about the society we live in. For instance, deciding that depictions of shirtless men are acceptable while braless women are not is a typical example of how Western sensitivities and misogynistic legislation are pushed into AI models.

On that note, something rather interesting is that the offline version of Stable Diffusion is capable of generating NSFW content, while the online one is not. This clearly means that the model has indeed been trained on pictures of naked men and women, but that the online version filters and restricts its output.
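This is easy to verify in the open-source release, where the NSFW filter is a separate component that inspects images after they have been generated. Below is a minimal sketch using Hugging Face’s diffusers library; the specific model name and the GPU requirement are assumptions of this example:

```python
# A minimal sketch with Hugging Face's `diffusers` library, assuming a GPU
# and the Stable Diffusion v1.5 weights. Illustrative only.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

result = pipe("a photograph of an astronaut riding a horse")
image = result.images[0]

# The safety checker runs *after* generation: it inspects the finished
# image and blanks it out if deemed NSFW. The underlying model is unchanged,
# which is why a local copy with the checker removed appears unrestricted.
print(result.nsfw_content_detected)  # e.g. [False]
```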

It is not hard to see that deciding what content is allowed and what is not is, by itself, a severe form of bias that is not objective, but culturally motivated. This can have severe consequences on the type and quality of decisions that AI models make, exacerbating existing biases and injustices.

🔄 The problem of self-poisoning…

2.8: The Human Cost of AI

Most AI models that rely on neural networks require large datasets to learn from. Any programmer who has had to deal with data scraping can tell you how difficult it is to “clean” data that has simply been scraped from the Internet. As humans, we are pretty good at classifying objects and filtering out the noise; naïve models do not have the same capacity.

A picture of a cat, whose facial features have been manually tagged (source)

Ensuring that data is properly pre-processed is a labour-intensive task that can take a significant portion of a developer’s time. It is usually fairly simple, although often tedious, work. Several services exist that employ large numbers of underpaid workers, often from developing countries, to do it. One such service is Amazon Mechanical Turk (MTurk), a platform that matches developers posting tasks with workers willing to do them. While MTurk has been an invaluable tool, it has raised several ethical questions about labour rights. MTurk workers generally receive low pay and are considered contractors rather than employees: an important difference that excludes them from many rights and benefits that traditional employees have. There is also the issue that, despite their massive contribution to AI models, they receive no share of the systems they help to create.

More importantly, there have been several reports of workers being exposed to large amounts of unfiltered content, often depicting violence or sexual acts. This was the case with OpenAI, which relied on the San Francisco-based firm Sama, which employs some 50,000 workers in Kenya, to flag dangerous content for ChatGPT. Time wrote a detailed article explaining how this had a severe impact on the mental well-being of the workers: Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic.

It should be pretty clear that for an AI model to be considered truly ethical, it should be built and trained in a way that respects workers’ rights and well-being. As it stands, unfortunately, a lot of the work behind the most commonly used AI models relies heavily on a form of labour that is at best ethically dubious, and at worst plainly exploitative.


Part 3: What’s Next?

If there is one thing that is clear about AI Art, it is that it is here to stay. There is sufficient know-how and data available that even a single person can build their very own diffusion model in just a few days. This is not something that can realistically be stopped. Quite the opposite: it is very likely that AI, in its many forms, will play a progressively larger role in our daily lives. So what would such a future look like?

AI-Powered Tools

Contrary to what some fear, a future dominated by AI Art does not mean the end of human artists, or their reduction to mere prompters. It means they will have access to many more tools, and that the existing ones will progressively integrate more AI-powered features.

This has already been going on for several years, with hundreds of applications and programs using AI to enhance image editing.

NVIDIA has been researching and supporting AI for many years, for instance through its AI Upscaling feature, which can be used to significantly enhance old movies and even video games. The company also supports some more cutting-edge features, including the possibility of retargeting people’s eyes for direct eye contact while reading from a teleprompter.

NVIDIA Eye Contact feature (source)

It is likely that programs such as Premiere Pro and Photoshop will include more and more of these features, as they have already done in the past. Edge detection, face detection, object tracking and noise suppression are just some of the many features they already have which are, one way or another, applications of AI technologies.

Thanks to AI models such as ChatGPT, it might soon become common to see progressively more human-friendly interfaces, able to perform tasks through chat rather than traditional menus.

GPT-powered tools also have the unique ability to understand text. This means they can assist with tasks that would otherwise be very difficult or require expert knowledge. From Google’s Talk to Books (which allows one to query a book’s content) to the possibility of reviewing contracts or Terms of Service for dodgy or unusual clauses, the possibilities are endless.
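As a sketch of how such a review could be automated: the prompt wording below is my own, and the openai Python package (in its early-2023 form) plus an API key are assumed.

```python
# A sketch of automated Terms-of-Service review with the `openai` package
# (0.x-era API). The prompt wording is my own invention.
import openai

openai.api_key = "sk-..."  # your API key

tos_text = open("terms_of_service.txt").read()

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You are a careful assistant that reviews legal documents. "
                    "You are not a lawyer and your output is not legal advice."},
        {"role": "user",
         "content": "List any unusual, one-sided or potentially abusive clauses "
                    "in the following Terms of Service:\n\n" + tos_text},
    ],
)

print(response.choices[0].message.content)
```

Needless to say, and for the very reasons discussed earlier, such output should still be reviewed by a human expert before being relied upon.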

When it comes to AI, things are moving very rapidly, and it is not unexpected for many people to feel concerned or even afraid. It does not help that many articles written about AI make outrageous claims, often for the sake of clicks. But the field of AI is flourishing in many different directions, not all of which are necessarily appreciated or even seen by the general audience. I would strongly suggest that everyone interested in these topics follow channels such as Dr Károly Zsolnai-Fehér’s Two Minute Papers, which beautifully covers recent progress in the fields of computer graphics and computer vision without unnecessary hype or clickbait titles.

A New Art Movement

What I am mostly interested in is discovering how art communities will embrace AI technologies in order to bring to life new ideas and art styles. After all, it should be clear by now that technology has always played an integral role in the progress and development of Art.

And it is really heartwarming to see so many people online sharing their stories and their work, and freeing their creativity in a way that would have been unimaginable just ten years ago.

If on one side many digital artists are relying heavily on AI for their creations, on the other many are fighting these new technologies, pushing their work in the opposite direction. Regardless of your feelings towards AI, it is undeniable that we are watching the birth of a new art movement; one which the next generation of artists will soon study in school.

The AI Witch Hunt

The conversation around AI Art is complex and multifaceted. While there are definitely both legal and ethical issues that need to be addressed, that is not an excuse to spew hate toward the entire community of researchers, engineers, artists and AI enthusiasts who are either making an honest living, or just approaching this field with genuine curiosity. It is important to recognise how toxic the conversation around AI Art can sometimes be, and how dangerously “you’re not a real artist” resonates with other well-known forms of gatekeeping (“Pokémon and Animal Crossing are not real games”, “visual coding is not real coding”, “real men don’t cry”, …).

There are already countless reports of digital artists being harassed, or even excluded from social media groups, because their work resembles images generated by AI models, as recently reported by Vice in an article titled Artist Banned from r/Art Because Mods Thought They Used AI. More and more artists are being asked to prove the originality of their work, for instance by sharing timelapses or their original 3D models.

It is understandable that many artists are concerned about their jobs and their work, but reducing such a complex topic to over-simplistic Twitter slogans like “AI Art is theft” is not just toxic, but clearly abusive. And it comes with an even more serious risk: the concrete possibility of pushing for legislation that would heavily affect not just the field of AI Art, but the progress of AI research in general. That would cause severe setbacks in many areas where AI is showing significant progress, from Medicine to Climate Change.

How to improve things?

Rather than shaming and gatekeeping the people who are learning how to best use these tools to support their creative endeavours, we should focus our attention on the companies working on those AI models, demanding more clarity, accountability, and stronger ethical guidelines. Many of them are well aware of the risks their technologies bring, but very few are actually doing something about it. The creators of VALL-E, for instance, recognise the need for security protocols that prevent speech synthesis from being used to impersonate specific people:

Since VALL-E could synthesize speech that maintains speaker identity, it may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker. […] When the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice and a synthesized speech detection model.

VALL-E Ethics Statement (source)

but their work does not seem to live up to their very own standards.

For this reason, I would personally suggest a few guidelines which might help us move towards a more ethical and informed use of AI:

  • Transparency: there should be as much transparency as possible, with a clear indication of which datasets are used in each AI model. This would provide better clarity not just on copyright issues, but also on bias detection and avoidance. Whenever possible, the architecture of AI models that are publicly deployed should also be disclosed.
  • Ethical sourcing: companies and researchers interested in the development of AI models should prefer ethical sources for their datasets. While opt-ins are likely unfeasible at this stage, there should definitely be a way for artists and creators to request that their work be excluded.
    • Art communities (ArtStation, Reddit, Twitter, …) might add tags to encourage or discourage data mining and AI training (a sketch of how a scraper could honour such tags follows this list)
    • Culturally-relevant content that is protected by copyright might still have a place in those models, in a similar way to what happens with the right to privacy for public figures
  • Explicit labelling: everything made by an AI model needs to be clearly labelled as such. This also applies to chatbots, which should never pretend to be human.
  • Source attribution and Confidence measurement: AI models should, whenever possible, indicate the source materials used for their outputs. This is important both for copyright-related issues with diffusion models, and to understand whether the answers provided by ChatGPT come from reliable sources. Likewise, models should indicate how confident they are that their output answers the prompt.
  • No impersonation: there should be protections to prevent AI from impersonating people without their explicit consent. Researchers working on deepfake-adjacent technologies should refrain from the ethically dubious practice of using politicians and public figures as examples for their work.
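On the tagging point above: DeviantArt has already proposed “noai” and “noimageai” meta directives that pages can use to opt out of AI training. Here is a minimal sketch of a scraper that honours them; the tag convention is DeviantArt’s, but the code itself is only my illustration, built on the requests and beautifulsoup4 packages:

```python
# A sketch of a crawler that honours "noai"-style opt-out meta tags.
# The tag names follow DeviantArt's proposed convention; the code is
# illustrative only. Requires the `requests` and `beautifulsoup4` packages.
import requests
from bs4 import BeautifulSoup

OPT_OUT_DIRECTIVES = {"noai", "noimageai"}

def may_use_for_training(url: str) -> bool:
    """Return False if the page opts out of AI training via meta tags."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for meta in soup.find_all("meta", attrs={"name": "robots"}):
        directives = {d.strip().lower() for d in meta.get("content", "").split(",")}
        if directives & OPT_OUT_DIRECTIVES:
            return False  # the page explicitly opted out
    return True

if may_use_for_training("https://example.com/artwork"):
    ...  # only now is it acceptable to add this page to a training set
```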

With many people deeply involved in the field of AI Ethics, there are some significant moves in the right direction. For instance, the start-up Chroma recently launched Stable Attribution, a tool that promises to find which training images most likely influenced a given image generated by Stable Diffusion.

Stable Attribution attempts to find which training images influenced an image generated by Stable Diffusion (source)

Neural Networks are often seen as inscrutable black boxes; more research is needed to better understand how to clearly communicate to the general audience not just their inner workings, but also their potential and limitations.

Comments

5 responses to “The Rise of AI Art”

  1. Very very interesting mix of all information. It must have taken a million hours to write this article and we very much appreciate it. Thanks Alan for your hard work.

    I must say I’ve followed two minute papers for years and his titles are pretty clickbaity got me, tbh, but what he shows is then interesting anyway. He only publishes stuff about AI nowadays…

    I’m very worried about the part of using generated art to replace artists. The rest is very nice, use in medicine, tools that do stuff that is not very creative. But the particular use on creating whole images just because makes me sick and your article made me think about my view… I don’t really know what to think but your suggestions are very very good. Hopefully somebody applies them, please keep mentioning those solutions.

    Thanks!

    (Oops sorry..I answered in the wrong place. You can delete the other comment if you want. I don’t know how…)

  2. Rafael

    I thought this was a fairly balanced view of the issue although I feel like the future of these AI tools is incredibly uncertain and scary.
    Thanks for the write-up.

  3. Typical tech bro trying to spin the narrative. You make me sick.

    1. It would be more effective if you could elaborate a bit more about which points you disagree with, rather than just insulting me. 😅
