Generative AI software is complex and sophisticated. Programs in use have passed the Turing Test. However, they're not intelligent; they can merely fool people into thinking they are. Generative AI doesn't "know" anything other than which words need to follow each other to assemble coherent, grammatically correct sentences, paragraphs and articles. That's pretty much it; you and I can do that, and do it better than generative AI. Generative AI wins out in the breadth of material it can draw upon and the speed at which it can generate new content. Consider the following sentence:
There are fifty planets in the solar system.
All the words are in the right order and it looks and sounds like a proper sentence. If you didn't know the number of planets in the solar system (eight), then this could be a plausible statement. Plausible is not the same as right. ChatGPT and other programs don't know how many planets there are in the solar system (remember, they don't know anything), but they can string together plausible-sounding sentences. Here's another example.
(source: Emergent Mind - Internet Archive)
Here, ChatGPT has generated a coherent, grammatically correct response which also happens to be absurd. The snippet above appears as the first result for the Google search query country in Africa that starts with the letter "k" (there's an archived version for when Google resolves this). A reasonable hypothesis is that ChatGPT's training data includes sites where this appears as a joke, so when it's prompted with "Did you know that there is no country in Africa that starts with the letter "k"?", it knows what usually follows and outputs that. It makes no difference that what usually follows is incorrect.
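To make that "knows what usually follows" idea concrete, here's a minimal sketch in Python of a toy bigram model. The tiny training text is invented for illustration, and real systems like ChatGPT use vastly larger neural networks rather than word-pair counts, but the underlying limitation is the same: the model only learns which words tend to follow which, not whether the resulting claims are true.

```python
import random
from collections import defaultdict

# Toy bigram model: an illustration of "plausible next word" generation.
# This is NOT how ChatGPT works internally - real models are huge neural
# networks - but the core limitation is the same: the model learns which
# words tend to follow which, with no notion of whether a claim is true.
training_text = (
    "there is no country in africa that starts with the letter k "
    "kenya is a country in africa "
    "there is no country in africa that starts with the letter k"
)

# Count how often each word follows each other word.
follows = defaultdict(lambda: defaultdict(int))
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def generate(start_word, max_words=12):
    """Generate text by repeatedly choosing a statistically likely next word."""
    output = [start_word]
    for _ in range(max_words):
        candidates = follows.get(output[-1])
        if not candidates:
            break
        next_word = random.choices(
            list(candidates), weights=list(candidates.values())
        )[0]
        output.append(next_word)
    return " ".join(output)

print(generate("there"))
# Typical output: "there is no country in africa that starts with the letter k"
# Fluent and grammatical, because the word order is statistically plausible,
# and wrong, because truth never enters into it.
```

Fed a corpus that repeats the joke, the toy model cheerfully reproduces it; fluency and truth are entirely separate properties here.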
Generative AI is trained on text, but that text has been written by humans. Humans are biased, they joke and they lie. Generative AI software can't (yet!) distinguish between fact and fiction, truth and lies, objectivity and bias when generating text. The old computer science acronym GIGO ("garbage in, garbage out") often applies.
No, AI chatbots don't see pink elephants, but they do make stuff up. They don't do this deliberately; it's just a by-product of how they work, stringing words together in a plausible sequence. Generative AI sometimes struggles to demonstrate understanding of context and common-sense reasoning, leading to outputs that are contextually incorrect or nonsensical. When AI tools confidently present statements that are incorrect or nonsensical, they are said to be hallucinating. Obviously, if you're looking for accurate information this is a problem, and it will grow as the amount of web content produced by generative AI increases. Some hallucination examples:
In the last example, the human expert in the field was so impressed by the sophisticated and detailed explanation provided by ChatGPT that they went back and checked the literature to see if they had missed something. Despite the plausibility of ChatGPT's response, the phenomenon was confirmed not to exist. Generative AI can sound so confident and so plausible yet come up with complete nonsense! If you're using generative AI, it's very important to verify the accuracy of its responses.
AI hallucinations are probably not that common, and sometimes people must phrase their conversational prompts just so to trigger one, but in this respect generative AI is worth comparing to Wikipedia: potentially useful, but with no guarantee that what you're reading is actually correct. OpenAI seem to be good at responding to feedback and addressing the causes of hallucinations. The "cycloidal inverted electromagnetron" definition comes from ChatGPT3. Asking for a definition in ChatGPT3.5 returns "As of my last knowledge update in January 2022, there isn't a widely recognized or established term known as "cycloidal inverted electromagnetron" in the scientific or technical literature". However, if enough people write about the "cycloidal inverted electromagnetron" so that it starts to appear in future ChatGPT training data, the model might once again start to provide definitions of a concept that doesn't actually exist.
In November 2022, Meta (formerly Facebook) launched a public demo of their own generative AI chatbot, Galactica, which was trained on 48 million scientific articles, websites, textbooks, lecture notes, and encyclopaedias. Meta promoted Galactica as a tool for researchers, teachers and students to, among other things, summarise and write academic papers and generate Wikipedia articles. Galactica was withdrawn a mere three days after its launch after generating biased and incorrect output; the hallucination mentioned above, the research article on adding crushed glass to farm animals, was created by Galactica.
Meta "chose to remove the demo from public availability" after constant ridicule and criticism. The fallout from this is quite interesting. At the time, Michael Black, the Director of the Max Planck Institute for Intelligent Systems tweeted "This could usher in an era of deep scientific fakes" while Galactica was described by a senior and well-respected software developer as "little more than statistical nonsense at scale. Amusing. Dangerous. And IMHO unethical." On the other hand, one of Galactica's creators asserted that "the anti-Galactica commentary was really stupid", while blaming users for treating the Galactica demo "like a product...We definitely did not view it as a product! It was a base model demo.".
While some effort was needed to make Galactica generate garbage output, and its developers were understandably frustrated by this abuse, it's worrying when AI developers blame users for the poor performance of their model. If users can't use Galactica properly, why release it? Why develop something that is so easily abused? Attacking users for pointing out the potentially harmful nature of an AI model does not inspire confidence in either the model or the company behind it.
Ironically, earlier in 2022, Meta disbanded its "Responsible Innovation Team". This was a group intended to address "potential harms to society" created by Meta's products. Perhaps they might have urged a little more caution with Galactica.
AI models can write plausible statements, but they do not think or reason. They don't understand context and, since they can't think, they certainly can't think outside the box. ChatGPT3/3.5 is quite poor at solving original puzzles that involve reasoning skills and is hit and miss with lateral thinking puzzles. However, 3.5 is better than 3, and ChatGPT4 (subscription only at the time of writing) is better than both. Future iterations of ChatGPT may be even better at solving puzzles. Because training data is likely to include puzzles, AI models may "know" the answers simply because they have seen the text of the questions and answers, much like the "country in Africa that starts with the letter "k"" example above. In real life, "memorising" answers doesn't make you clever, it just means you have a good memory.
Generative AI models lack a true understanding of the world. They generate responses based on patterns learned during training but do not comprehend concepts in the same way that humans do.
Generative AI models can inherit biases from the data they are trained on, which can result in unfair or discriminatory outputs. Addressing bias in AI remains a significant challenge.
While generative AI can produce impressive content, it lacks true creativity and the ability to think outside predefined patterns.
Generative AI tools rely heavily on the quality and quantity of their training data. They may produce inaccurate or biased results if the training data is insufficient or unrepresentative. They can also struggle to generalise to data that differs significantly from their training data: they are very good at generating content similar to what they've seen before, but not so good at creating content that's very different from it. It's like a chef who learned to cook by only making pizzas: they might be very good at making different types of pizza, but they would struggle to make a casserole because casseroles were never part of their training. The toy sketch below makes the same point in code.
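Continuing the same toy bigram idea used earlier (an illustrative assumption, not how production models are built), this sketch shows why a model trained only on pizza text has literally nothing to say about casseroles:

```python
from collections import defaultdict

# Toy illustration of the generalisation problem: this "model" has only ever
# seen text about pizza, so it has no statistics at all for unseen words.
pizza_corpus = "knead the pizza dough then add tomato sauce and cheese to the pizza"

follows = defaultdict(list)
words = pizza_corpus.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word].append(next_word)

print(follows["pizza"])      # ['dough'] - the model can continue pizza sentences
print(follows["casserole"])  # [] - "casserole" never appeared in training,
                             # so there is nothing the model can say about it
```

A real model wouldn't fail quite this bluntly, but the principle holds: content far outside the training data has little or nothing for the model to draw on.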
Generative AI tools have been trained on a vast corpus of training data, but it's not necessarily current. As of January 2024, the training data for ChatGPT's free version doesn't go beyond January 2022. This will change as new training data is fed to ChatGPT and more current versions of the platform are made available to all users, not just those who pay for it. There are several problems with using out-of-date training data:
Before using generative AI, always read and understand what it is you're actually dealing with.