If you use Twitter, you may be familiar with this tweet:
Because of the ethical issues surrounding the creation, training and use of generative AI, many people see it a as a real life Torment Nexus. Here follows a list of some of the ethical issues surrounding generative AI. These ethical issues don't make generative AI bad for society, but it is useful to be aware of the context of the development and use of generative AI models.
Generative AI tools are often "black boxes" - complex systems whose internal workings are hidden from sight or not readily understood. It's difficult to engender trust in a system when you don't know how it works or when it's being used (e.g website chat tools - are you talking to a human or a bot?). How can you be confident of a decision-making process when you don't know how decisions are made?
A big issue in information technology, which will perpetuate into generative AI is algorithmic bias, defined by Wikipedia as "systematic and repeatable errors in a computer system that create "unfair" outcomes, such as "privileging" one category over another in ways different from the intended function of the algorithm". Many books have been written about how algorithmic bias is baked in by lack of minority groups being involved in the creation and testing of algorithms. Many tech products have been noticed to not recognise black skin for example.. Other examples:
To mitigate algorithmic bias, AI models need diverse training datasets. However, proprietary AI models do not have their source code or training data publicly available, so it's impossible to know what the models are being trained on or how the algorithms work.
There are three aspects to this:
A lot of data used to train generative AI tools comes from the open Internet. Have you ever given personal data to a website or created website content? That personal information and the content you created may have been/is being/may be in the future used to train or generate AI. Did you consent to this?
How do you feel about your data being used in ways you didn't anticipate? Even if you did technically consent to it through vague wording in the terms and conditions that you never read but ticked the agreement box anyway, It still might feel wrong.
Additionally, data shared during prompts and conversations may be personally sensitive as well (like with search engine queries and social media use). A learner asking for resources around gender and sexuality or abortion would wish to keep these details confidential. Alas, because of a bug, ChatGPT user chat history was leaked in early 2023 and it is not infeasible to think that data breaches could happen in the future
Essentially, any interaction you have with an AI tool is likely to be used to further train the AI and be used as the company sees fit. Open AI, for example will use personal information to better train ChatGPT, although they do say they "do not and will not use any personal information in training information to build profiles about people, to contact them, to advertise to them, to try to sell them anything, or to sell the information itself." One day they may change their mind, or there may be other AI developers who are more laissez faire about data use.
Generative AI tools can only be as good as the data they are trained on. They can't differentiate fact from fiction and they don't recognise bias. In 2016 Microsoft released an AI Twitter chatbot named Tay and withdrew the service after a mere 16 hours because malicious users "trained" Tay to tweet racist, misogynist and other hateful remarks. Generative AI tools have mostly been created by English speakers and will be trained with mostly English language data. The tools will work best in English and reflect a Western point of view. Research has found that AI detection tools are biased against non-native English speakers, often wrongly flagging their work as being generated by AI.
Generative AI can amplify bias and negative stereotyping and pass these on to its users or reinforce a user's existing strong opinions on contentious issues
Companies working in Generative AI have, like other tech companies, adopted a "move fast and break things" approach to product development, valuing speed and experimentation over caution and analysis. One of the things that has been broken in generative AI development is copyright. Generative AI tools have been been trained on copyrighted material and there are author lawsuits pending against generative AI developers. AI companies might argue "fair use" but copyright holders will disagree. OpenAI, the creators of ChatGPT have stated that "it would be impossible to train today’s leading AI models without using copyrighted materials". There is some merit in this argument in that often the best quality information is hidden behind paywalls while what is freely available on the web is poor quality. There is no doubt that the Chat GPT's output would benefit from using better quality data, but the nature of copyright will dictate that the company must come to some kind of licencing arrangement with copyright holders. There's also another argument that people have made; is generative AI of sufficient benefit to everyone to justify its existence (and therefore its use of copyrighted material)?
This section has discussed text-based generative AI but there are generative AI tools that specialise in creating images from prompts, the best known perhaps being Stable Diffusion, MidJourney and DALL-E. These have been trained on images, nearly all of which have created by human artists, illustrators and photographers. Again, this is regarded by many to be copyright infringement and there are lawsuits pending against the companies that have developed them.
Copyright is a hot topic in discussion of generative AI; many people feel it is wrong for companies to profit from using the copyrighted work of human creators. In other contexts (games, films, music), wouldn't this be called piracy?
That's the input, what about the output? Who holds copyright over generative AI-created content? A good question and the answer currently seems to be "it depends". In March 2023, The US Copyright Office issued guidelines indicating that whether AI-generated content was copyrightable would depend upon the amount of human input. Because generative AI produces "complex written, visual, or musical works," the "'traditional elements of authorship' are determined and executed by the technology,". As yet, material produce by machines cannot be copyrighted and so this generated content cannot be copyrighted. However, according to the Office, a human "may select or arrange AI-generated material in a sufficiently creative way" so as to create an original work, which can then be copyrighted by the human. What qualifies as "sufficiently creative" will keep lawyers' pockets well-lined.
Generative AI tools can provide an unfair advantage to learners who use them to complete assignments. Using these tools a negation of learners agency; they are not in control of the essay writing process; having ceded that to a machine. Instead of using *someone's* work, learners are using *something's* work. Either way it's not their own work.
The creator of ChatGPT, OpenAI, recognises the unreliability of the tool stating "While tools like ChatGPT can often generate answers that sound reasonable they can not be relied upon to be accurate consistently or across every domain.". This statement was taken from a page on Open AI website which was deleted sometime after June 2023. Thus, for the learner, using Generative AI may be no more useful than plagiarising from the web.
It's also possible, under certain conditions to induce ChatGPT to output training data verbatim. It may be possible for ChatGPT to do this in response to ordinary prompting. Including this - inadvertently - in learner work would also qualify as plagiarism.
Generative AI systems need high-end computers and servers and require a significant amount of electricity to run. Training these systems uses computational resources and therefore power. Google’s CO2 emissions have increased 48% over the last five years due to heavy emphasis on embedding AI in its products. In the global warming era, is this a necessary or worthwhile use of energy? Water is used to generate electricity and cool servers. ChatGPT uses a 500 ml bottle of water in answering 20-50 questions. Is this a worthwhile use of water?
AI-Generated content is being used on fake news sites, content farms and product reviews. Generative AI can contribute to the spread of disinformation and fake news. AI-generated news articles have the potential to influence public opinion and can be targeted to manipulate voters in an election. Similarly, AI-generated fake news stories can be used to attack the reputation of individuals or organisations, or to disseminate false information. Deepfakes are AI-generated videos or images that can be used for malicious purposes - such as providing video evidence supporting disinformation.
Generative AI is a very powerful tool. But currently, companies developing them are free to do as they like in the absence of any regulatory measures. Regulation would help address ethical issues such as transparency, bias, data governance and privacy. In December 2023, EU officials agreed the content of the content of an AI regulatory law, adopted in 2024 and taking effect in 2025. Other jurisdictions (e.g. the UK, the US and China) are looking to introduce their own regulations, but perhaps generative AI needs a global regulatory response
Automation was long thought to potentially release us from drudgery to be more creative but the opposite seems to be be happening with generative AI. which has the potential to replace human creators. MSN (Microsoft's Internet Portal) G/O Media (online media platform), CNET (technology and consumer electronics website) Bild (German newspaper), Duolingo (a popular language learning app) and other companies have replaced human writers with generative AI tools, even though the AI-generated content might be quite egregious.
AI companies have used low-paid workers in developing countries to help train AI tools to recognise and avoid toxic content, including hate speech and violence. Repeated exposure has been traumatising to the workers who can be paid as little as $2 per hour
Additionally, the use of AI tools in hiring processes is another application that raises concerns regarding the lack of transparency shown, and biases introduced, by these algorithms.
Many of these tools are free but as the tools become more sophisticated, the creators of generative AI tools will look to monetise them. ChatGPT is currently free, but the more advanced (and with more modern training data) ChatGPT Plus requires a subscription. Many of the more specialised tools now appearing that are based on ChatGPT require payment too. Assuming AI does work out to be beneficial to learners, then there is a possibility of AI becoming part of the digital divide: some learners being able to afford access to AI tools but others unable.
As well as using generative AI to replace human workers, generative AI is being used to manipulate search engines to improve website rankings in their results.. This Twitter thread describes a particularly egregious use of AI to gain page views at the expense of another site. The procedure is as follows:
This has reduced the number of page views of the competitor website, Exceljet, a high quality Excel help site, It's probably also reducing the quality of information available to people who search the web for assistance with using Excel. The new site's articles contain factual errors, including references to Excel features and commands that simply do not exist. It is very likely that this technique will be used by many other SEO specialists to improve the ranking of their website in search engine results. Using generative AI for SEO purposes will fuel fears of eventual model collapse.
In addition to reducing the quality and utility of the web as an information source, the use of poor quality AI-generated content may lead to what is described as "model collapse" AI models are trained on data scraped from the web. As the amount of substandard AI-generated web content increases, this will reduce the quality and diversity of future AI model outputs as they will have be trained on more and more AI-created content, affecting in turn future AI models.
A way of mitigating this might to have AI models trained on copyrighted data that otherwise might be hidden behind paywalls. This will require licencing agreements with copyright holders and that will cost money.
Not an ethical issue in of itself, but a grave cause for concern is the fact that since generative AI took off, several tech companies have laid off or reduced the size of their AI ethics teams. In December 2020, Google terminated the employment of Timnit Gebru, a a widely respected leader in AI ethics research and the co-lead of Google's AI team, after she co-authored a paper detailing some of the risks involved in using large language models (LLMs). This behaviour by big tech companies is not conducive to addressing ongoing AI ethical issues.
Image sources:
Torment Nexus: Alec Blechman on Twitter
I wanr AI to do my laundry @AuthorJMac on Twtter