First, a clarification: LLMs don’t lie. Lying implies intent. LLMs hallucinate, but this too is a bit of a misnomer. The end result, however, is the same: LLMs may produce inaccurate output, sometimes in ways that are hilariously obvious, and sometimes in ways that are subtle and potentially dangerous. In this article we’ll explore what hallucinations are, why they occur, and how to minimize them.
In order to understand hallucinations, we must first understand how LLMs learn and later generate text. During training, LLMs digest a vast amount of textual data. The LLM learns the relationships in the language: not just between words, but between contexts and even concepts. This is the true power of transformer-based AI. These relationships are used to find patterns in user input and generate plausible-sounding responses based on probability, not true understanding in the human sense.
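To make “based on probability” concrete, here is a deliberately simplified sketch in Python. The tokens and probabilities below are invented for illustration; a real LLM computes a distribution over tens of thousands of tokens with a trained transformer, but the selection step works in the same spirit: the model samples a plausible continuation rather than looking up a stored fact.

```python
import random

# Toy next-token distribution for the prompt "The capital of France is".
# These numbers are made up for demonstration; a real model computes them
# from its learned parameters, not a hand-written table.
next_token_probs = {
    "Paris": 0.72,   # most likely continuation
    "Lyon": 0.11,
    "France": 0.09,
    "the": 0.05,
    "Berlin": 0.03,  # plausible-sounding but wrong: a potential hallucination
}

def sample_next_token(probs: dict[str, float]) -> str:
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

prompt = "The capital of France is"
print(prompt, sample_next_token(next_token_probs))
```

Run it a few times and you will occasionally get “Berlin”: a low-probability but nonzero continuation is exactly how a confident-sounding error can slip into otherwise fluent output.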
While this understanding gives LLMs a lot of power for certain language tasks, such as summarizing text, categorizing documents, translating between languages, and more advanced tasks like the sentiment analysis used in marketing, they cannot reliably return source data in its original form. The exactness of that data is lost in favor of generalizations. In other words, LLMs do not store rote, curated, verbatim knowledge, but rather concepts.
It’s probably obvious that an LLM trained on such a vast amount of data might make mistakes simply because the data itself contains mistakes, especially since much of the training data for many of today’s models comes from unverified sources like social media, chat transcripts, and works of fiction. However, even if an LLM could be trained on 100% verified, correct information, it would still make mistakes, and that’s the focus here.
The Telephone Game Analogy
Let’s consider the game of Telephone. In this exercise, a large number of people stand in line. The first in line is given a paragraph of text and must whisper it to the next person, who whispers it to the next, and so on. The last person then recites the paragraph out loud, and the result is compared with the original text. More often than not, the message changes along the way. Even discounting intentional changes, subtle shifts creep in based on how well each person hears, understands, and relays the message to the next.
The parallel is not perfect, but LLMs similarly learn by generalizing language over multiple passes, working on chunks or batches of information at a time. During this process the verbatim source text is not stored, and some information is inevitably lost. Later, when a user queries an LLM, it answers not from stored language but from learned concepts. This is where the parallel breaks down: the generated text is based on the LLM’s interpretation of concepts learned during training, not an attempt to repeat data it has seen verbatim.
The Missing Fingers Problem
To see how this generation from concepts leads to errors, let’s take a look at generative image AI. One of the biggest complaints about image models like DALL-E, Midjourney, and Stable Diffusion is their tendency to generate the wrong number of fingers. This is a visual demonstration of hallucination, which is more accurately an artifact of an AI’s failure to properly generalize information. In essence, these images are generated from learned concepts, because the model no longer has access to the original, unaltered source data.
If you’ve experimented with image AI, you’ve probably noticed glaring errors, like extra limbs. You might have missed less obvious ones, like missing fingers. What people rarely notice are far subtler errors, like inconsistencies in facial geometry. The same goes for text output from LLMs: some errors are obvious, while others are subtle and nuanced. These are all examples of hallucination: inaccurate output that arises from the way AI generates content from learned concepts rather than from truly understanding knowledge in a human sense.
Managing Hallucinations
Hallucinations are a natural part of generative AI and can’t be avoided, but they can be managed for some tasks:
- Validate everything: Most importantly, you must understand that no output from generative AI can be completely trusted without validation. Make sure you truly understand and have verified all output.
- Provide quality reference data: The volume and accuracy of the information a model has access to directly affect the quality of its output. Rather than relying only on its training data, provide your own validated data. Data supplied in the prompt is weighted more heavily than the model’s training data and tips the probability of accurate output in your favor (see the sketch after this list).
- Create proper workflows: Use AI output as a starting point for projects. Create a workflow where you obtain your own data, explore that data with AI, and then validate the output through further research. You can then circle back to AI to help proofread and summarize your work, but you remain responsible for accuracy.
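To illustrate the reference-data point above, here is a minimal sketch of grounding a prompt in your own validated sources. The build_grounded_prompt helper and the call_llm placeholder are hypothetical names used only for illustration, not part of any specific library; swap in whichever model client you actually use, and keep validating the answers it returns.

```python
def build_grounded_prompt(question: str, reference_docs: list[str]) -> str:
    """Embed verified reference text directly in the prompt and instruct the
    model to answer only from it. This reduces, but does not eliminate,
    hallucination."""
    context = "\n\n".join(
        f"[Source {i + 1}]\n{doc}" for i, doc in enumerate(reference_docs)
    )
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model client (OpenAI, Anthropic, a
    # local model, etc.); replace with your provider's API call.
    raise NotImplementedError

docs = [
    "Product X was released in March 2023 and exports reports to CSV and JSON.",
]
prompt = build_grounded_prompt("What export formats does Product X support?", docs)
print(prompt)
# answer = call_llm(prompt)  # the answer should still be validated by a human
```

Because the prompt tells the model to answer only from the supplied sources and to admit when they are insufficient, the output is pulled toward your verified data, but it still belongs in a workflow where a human checks the result.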
In short, the shortcomings of LLMs stem from the same strengths that make them powerful: their ability to generalize knowledge and concepts. Understanding both sides will ensure AI remains a useful, effective, and, above all, safe tool.




