Generative AI Models Are Built to Hallucinate: The Question is How to Control Them

From industry and academic conferences to media reports and everyday chatter, a debate is growing around how to avoid or prevent hallucinations in artificial intelligence (AI).

Simply put, generative AI models are designed and trained to hallucinate, so hallucinations are a common product of any generative model. Instead of trying to prevent generative AI models from hallucinating, however, we should be designing AI systems that can control hallucinations. Hallucinations are indeed a problem, and a big one, but one that an AI system that includes a generative model as a component can control.

Building a better performing AI system

Before setting out to design a better performing AI system, we need a few definitions. A generative AI model (or simply “generative model”) is a mathematical abstraction, implemented by a computational procedure, that can synthesize data that resembles the general statistical properties of a training (data) set without replicating any data within. A model that just replicates (or “overfits”) the training data is worthless as a generative model. The job of a generative model is to generate data that is realistic or distributionally equivalent to the training data, yet different from actual data used for training. Generative models such as Large Language Models (LLMs) and their conversational interfaces (AI bots) are just one component of a more complex AI system.
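As a toy illustration of this definition, the sketch below fits a one-dimensional Gaussian to a training set and then samples from it. The "heights" data, the function names, and the choice of a Gaussian are all assumptions made for illustration; real generative models are far richer, but the principle is the same: the samples share the statistics of the training data without copying any of it.

```python
import random
import statistics

def fit_gaussian(training_data):
    """Estimate the mean and standard deviation of the training set."""
    return statistics.mean(training_data), statistics.stdev(training_data)

def sample(model, n, rng):
    """Draw n new values from the fitted Gaussian."""
    mean, std = model
    return [rng.gauss(mean, std) for _ in range(n)]

rng = random.Random(0)

# Hypothetical training set: 1,000 made-up "heights" in centimeters.
training_data = [rng.gauss(170.0, 10.0) for _ in range(1000)]

model = fit_gaussian(training_data)
synthetic = sample(model, 1000, rng)

# Values that appear in both sets would be replicated training data.
overlap = set(synthetic) & set(training_data)

print("training mean: ", round(statistics.mean(training_data), 2))
print("synthetic mean:", round(statistics.mean(synthetic), 2))
print("replicated training values:", len(overlap))
```

Every synthetic value here is, in the sense used in this article, a hallucination: realistic, but not an actual data point.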

Hallucination, on the other hand, is the term used to describe synthetic data that is different from, or factually inconsistent with, actual data, yet realistic. In other words, hallucinations are the product of generative models. Hallucinations are plausible, but the models that produce them do not know about “facts” or “truth,” even if these were present in the training data.

For business leaders, it’s important to remember that generative models shouldn’t be treated as a source of truth or factual knowledge. They surely can answer some questions correctly, but this is not what they are designed and trained for. It would be like using a racehorse to haul cargo: it’s possible, but not its intended purpose.

Generative AI models are typically used as one component of a more complex AI system that includes data processing, orchestration, access to databases or knowledge graphs (as in retrieval-augmented generation, or RAG), and more. While generative AI models hallucinate, AI systems can be designed to detect and mitigate the effects of hallucination when it is undesired (e.g., in retrieval or search), and to enable it when it is desired (e.g., in creative content generation).
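A minimal sketch of such a system follows. It is not a description of any particular product: the retriever, the word-overlap support score, the threshold, the router documents, and the stub generators standing in for an LLM are all hypothetical and deliberately crude. The point is only the structure: retrieve evidence, let the model generate, then check the draft against the evidence when factuality is required and skip the check when creativity is the goal.

```python
import re

def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=1):
    """Toy retriever: rank documents by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def support_score(answer, passages):
    """Fraction of the answer's words that also occur in the passages."""
    a = tokens(answer)
    if not a:
        return 0.0
    evidence = set().union(*map(tokens, passages))
    return len(a & evidence) / len(a)

FALLBACK = "I could not find a supported answer in the available sources."

def answer_with_control(query, documents, generate, factual=True, threshold=0.8):
    """Wrap a generator (any callable) with retrieval and a grounding check."""
    passages = retrieve(query, documents)
    draft = generate(query, passages)
    if not factual:
        return draft  # creative use: hallucination is allowed
    if support_score(draft, passages) >= threshold:
        return draft  # draft is (crudely) supported by the evidence
    return FALLBACK   # unsupported draft is withheld

# Hypothetical knowledge base and stub generators standing in for an LLM.
documents = [
    "The warranty for the X100 router lasts two years.",
    "The X100 router supports wifi 6.",
]
query = "How long is the warranty for the X100 router?"

def grounded(query, passages):
    return "The warranty lasts two years."

def hallucinating(query, passages):
    return "The warranty lasts ten years and covers water damage."

print(answer_with_control(query, documents, grounded))
print(answer_with_control(query, documents, hallucinating))
print(answer_with_control(query, documents, hallucinating, factual=False))
```

In a production system the overlap score would be replaced by a much stronger verification step, but the division of labor is the same: the model generates, and the system around it decides what reaches the user.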

From research to the hands of customers

LLMs are a form of stochastic dynamical system, a class of systems for which the notion of controllability has been studied for decades in the field of control and dynamical systems.
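One way to make this view concrete, with notation chosen here for illustration rather than taken from any specific paper, is to treat the tokens currently in the model's context as the state $x_t$, the sampled next token as $y_t$, and any tokens supplied from outside (by a user or by the surrounding system) as the input $u_t$:

```latex
\begin{aligned}
y_t &\sim p_\theta(\,\cdot \mid x_t), \\
x_{t+1} &= f(x_t, y_t, u_t).
\end{aligned}
```

Here $p_\theta$ is the next-token distribution defined by the trained model and $f$ is the update that appends new tokens to the context (and drops old ones once the context is full). Informally, asking whether the system is controllable means asking whether a suitable choice of inputs $u_t$ can steer the state toward a desired set of states, or keep it away from an undesired one.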

As we have shown in a recent preprint paper, LLMs are controllable. That means that an adversary could take control, but that also means that a properly designed AI system can manage hallucination and maintain safe operation. In 1996, I was the first doctoral student to be awarded a PhD from the newly formed Department of Control and Dynamical Systems at the California Institute of Technology. Little did I know that, more than a quarter century later, I would be using these concepts in the context of chatbots trained by reading text on the web.

At Amazon, I sit alongside thousands of AI, machine learning, and data science experts working on cutting-edge ways to apply science to real-world problems at scale. Now that we know that AI bots can be controlled, our focus at Amazon is to design systems that control them. Services like Amazon Kendra, which reduces hallucination issues by augmenting LLMs to provide accurate and verifiable information to the end user, are an example of our innovation at work, bringing together research and product teams who are working to rapidly put this technology into the hands of customers.

Considerations to control hallucinations

Once you’ve established the need to control hallucinations, there are a few key things to keep in mind:

Exposure is beneficial: Generative AI models need to be trained with the broadest possible exposure to data, including data you do not want the generative AI system to hallucinate.
Don’t leave humans behind: AI systems need to be designed with specific training and instructions from humans on what are acceptable behaviors, concepts, or functions. Depending on the application, some systems must follow certain protocols or abide by a certain style and level of formality and factuality. An AI system needs to be designed and trained to ensure that the generative AI model stays compliant with the style and function for which it is designed.
Stay vigilant: Adversarial users (including intentional adversaries, as in “red-teams”) will try to trick or hijack the system to behave in ways that are not compliant with its desired behavior. To mitigate risk, constant vigilance is necessary when designing AI systems. Processes must be put in place to constantly monitor and challenge the models, with rapid correction where deviations from desired behavior arise.

Today, we’re still at the early stages of generative AI. Hallucinations can be a polarizing topic, but recognizing that they can be controlled is an important step toward better utilizing the technology poised to change our world.

About the Author

Stefano Soatto is a Professor of Computer Science at the University of California, Los Angeles and a Vice President at Amazon Web Services, where he leads the AI Labs. He received his Ph.D. in Control and Dynamical Systems from the California Institute of Technology in 1996. Prior to joining UCLA he was Associate Professor of Biomedical and Electrical Engineering at Washington University in St. Louis, Assistant Professor of Mathematics at the University of Udine, and Postdoctoral Scholar in Applied Science at Harvard University. Before discovering the joy of engineering at the University of Padova under the guidance of Giorgio Picci, Soatto studied classics, participated in the Certamen Ciceronianum, co-founded the Jazz Fusion quintet Primigenia, skied competitively and rowed single-scull for the Italian National Rowing Team. Many broken bones later, he now considers a daily run around the block an achievement.

At Amazon, Soatto is now responsible for the research and development leading to products such as Amazon Kendra (search), Amazon Lex (conversational bots), Amazon Personalize (recommendation), Amazon Textract (document analysis), Amazon Rekognition (computer vision), Amazon Transcribe (speech recognition), Amazon Forecast (time series), Amazon CodeWhisperer (code generation), and most recently Amazon Bedrock (Foundational Models as a service) and Titan (GenAI). Prior to joining AWS, he was Senior Advisor of NuTonomy, the first to launch an autonomous taxi service in Singapore (now Motional), and a consultant for Qualcomm since the inception of its AR/VR efforts. In 2004-5, he co-led the UCLA/Golem Team in the second DARPA Grand Challenge (with Emilio Frazzoli and Amnon Shashua).

