Would you trust anything you were told by someone who doesn't know how many r's there are in "strawberry" (there are three, by the way)?
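For comparison, ordinary software gets this right every single time, because it counts the letters rather than predicting which words sound plausible. Here's a tiny Python sketch, purely as an illustration:

```python
# Ordinary code counts letters directly - no guessing involved.
word = "strawberry"
print(word.count("r"))  # prints 3
```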
Generative AI is advanced software that can pass the Turing Test by making people believe it's human. However, it isn't truly intelligent - it simply puts words together in the right order to create sentences, paragraphs, and articles that sound correct. In essence, it doesn't understand the content; it just follows patterns. While humans still produce better and more meaningful writing, generative AI can generate a vast amount of text very quickly by drawing on a huge pool of information.
Consider the sentence: "There are fifty planets in the solar system."
All the words are in the right order, and it looks and sounds like a proper sentence. If you didn't know the number of planets in the solar system (eight), then this could be a plausible statement. But plausible is not the same as right. ChatGPT and other programs don't know how many planets there are in the solar system (remember, they don't know anything), but they can string together sentences that sound like they might be right. Here's another example.
[Screenshot of a ChatGPT response claiming there is no country in Africa that starts with the letter "k"] (source: Emergent Mind - Internet Archive)
Here, ChatGPT has generated a coherent, grammatically correct response which also happens to be absurd (Kenya is a country in Africa that starts with the letter K). The snippet above appears as the first result for the Google search query country in Africa that starts with the letter "k" (and an archived version for when Google resolves this). It's plausible (that word again!) that ChatGPT's training data includes sites where this appears as a joke, so when it's prompted with "Did you know that there is no country in Africa that starts with the letter "k"?", it knows what usually follows and outputs that. It makes no difference that what usually follows is incorrect.
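To see what "knowing what usually follows" looks like in practice, here's a deliberately tiny sketch of pattern-based text generation in Python. It is not how ChatGPT actually works (real models are far more sophisticated neural networks), and the "training data" below is made up for illustration, but it shows the core problem: the program reproduces whatever usually follows in its data, and truth never enters into it.

```python
from collections import defaultdict, Counter

# Toy "training data": the model only ever sees these sentences,
# including the joke claim, and has no notion of whether they are true.
training_text = (
    "there is no country in africa that starts with the letter k "
    "kenya starts with the letter k "
    "there is no country in africa that starts with the letter k"
)

# Count which word follows which (a simple bigram table).
follows = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def continue_text(start, length=12):
    """Generate text by always choosing the most common next word."""
    output = [start]
    for _ in range(length):
        if output[-1] not in follows:
            break
        output.append(follows[output[-1]].most_common(1)[0][0])
    return " ".join(output)

# Typically prints: "there is no country in africa that starts with the letter k ..."
# The program repeats what usually follows in its data, true or not.
print(continue_text("there"))
```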
Generative AI is trained on text, but that text has been written by humans. Humans are biased; they joke and they lie. Generative AI software can't distinguish between fact and fiction, truth and lies, objectivity and bias when generating text.
No, AI chatbots don't see pink elephants, but they do make stuff up. They don't do this deliberately; it's just a by-product of how they work, stringing words together in a plausible sequence. Generative AI sometimes struggles with context and common-sense reasoning, leading to outputs that are contextually incorrect or nonsensical. When AI tools confidently present statements that are incorrect or nonsensical, they are said to be hallucinating. Obviously, if you're looking for accurate information, this is a problem, and it will only grow as web content produced by generative AI increases. Some hallucination examples:
In the last example, the human expert in the field was so impressed by the sophisticated and detailed explanation provided by ChatGPT that they went back and checked the literature to see if they had missed something. Despite the plausibility of ChatGPT's response, the phenomenon was confirmed not to exist. Generative AI can sound so confident and so plausible yet come up with complete nonsense! If you're using generative AI, then it's very important to verify the accuracy of its responses.
AI hallucinations seem to be becoming more of a problem. A recent analysis of ChatGPT examined the citations it produced when asked to identify the source of 200 quotes taken from various news sources. The researchers found that "ChatGPT returned partially or entirely incorrect responses on a hundred and fifty-three occasions" - more than three-quarters of the 200 quotes.
Generative AI is worth comparing to Wikipedia: potentially useful, but there's no guarantee that what you're reading is actually correct. OpenAI seems to be good at responding to feedback and addressing the causes of hallucinations. The "cycloidal inverted electromagnon" definition comes from ChatGPT3. Asking for a definition in ChatGPT3.5 returns "As of my last knowledge update in January 2022, there isn't a widely recognized or established term known as 'cycloidal inverted electromagnon' in the scientific or technical literature". However, if enough people write about the "cycloidal inverted electromagnon" so that it starts to appear in future ChatGPT training data, the model might once again start to provide definitions of a concept that doesn't actually exist.
It's very important to remember that when you ask ChatGPT a question, you'll get a nice, smooth, plausible-sounding response, but the answer might be nonsense. If you include this in your coursework without checking, then you might end up looking stupid, and your teacher is not going to be very impressed! If you do use ChatGPT (or other, similar tools), always check that what it says is true!
AI models can write plausible statements, but they do not think or reason. They don't understand context, and since they can't think, they certainly can't think outside the box. ChatGPT is quite poor at solving original puzzles that involve reasoning skills and is hit-and-miss with lateral thinking puzzles. Future iterations of ChatGPT may do better at solving puzzles, but because training data is likely to include well-known puzzles, AI models may simply "know" the answers from having seen the text of the questions and answers, much like the "country in Africa that starts with the letter "k"" example above. In real life, "memorising" answers doesn't make you clever; it just means you have a good memory.
Generative AI can produce different outputs even when given the same input multiple times - in simpler language, you might get different answers if you ask the same question over again.
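A simplified way to picture why this happens (not a description of any particular product): instead of always picking the single most likely next word, these models sample from a probability distribution over possible next words, so the same prompt can lead down different paths. The probabilities in this Python sketch are invented purely for illustration:

```python
import random

# A made-up probability distribution over possible next words after the
# prompt "The capital of Australia is" - the numbers are illustrative only.
next_word_probabilities = {
    "Canberra": 0.6,
    "Sydney": 0.3,     # wrong, but plausible-sounding
    "Melbourne": 0.1,  # also wrong
}

def sample_next_word():
    """Pick a next word at random, weighted by its probability."""
    words = list(next_word_probabilities)
    weights = list(next_word_probabilities.values())
    return random.choices(words, weights=weights, k=1)[0]

# Asking the same "question" five times can give different answers.
for attempt in range(5):
    print("The capital of Australia is", sample_next_word())
```

Run it a few times and the answer changes - and notice that the wrong-but-plausible answers come out some of the time, too.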
Generative AI models lack true understanding of the world (everything is just words to generative AI). They generate responses based on patterns learned during training but do not understand ideas in the same way that we do.
Generative AI models can inherit biases from the data they are trained on, which can result in unfair or discriminatory outputs. Addressing bias in AI remains a significant challenge.
While generative AI can produce impressive content, it lacks true creativity and the ability to "think outside the box".
Generative AI tools rely heavily on the quality and quantity of their training data. They may produce inaccurate or biased results if the training data is insufficient or unrepresentative. They may also struggle to generalise to data that differs significantly from their training data - they are very good at generating content that's similar to what they've seen before, but not so good at creating content that's very different from it. It's like a chef who learned to cook by only making pizzas: they might be very good at making different types of pizza, but they might struggle to make a casserole, because casseroles were never part of their training.
Generative AI tools have been trained on a vast body of training data, but it's not necessarily current. As of March 2025, the training data for ChatGPT's free version doesn't go beyond June 2024. This will change as new training data is fed to ChatGPT and more current versions of the platform are made available to all users, not just those who pay for it. There are several problems with using out-of-date training data.
Older training data might lead the AI to provide incorrect or outdated information, especially in areas that change quickly. It may not be aware of recent events, so its answers can miss important context. As language and cultural norms evolve, models trained on old data might misunderstand new ways of speaking or current social attitudes. They can also repeat old biases that were present in the original data. Outdated models might not recognise new computer security threats or best practices, making them less safe to use. Lastly, models based on old information may struggle with tasks that require up-to-date knowledge, which can make them less helpful overall.
Before using generative AI, always read and understand what it is you're actually dealing with.