Hallucinations in Generative AI Models

Ivan Sysoev

September 12, 2023

Creating an "Absolute Truth Machine" remains an elusive goal in AI, and the mathematics of generative models indicates that their hallucinations are inevitable.

Tracing the Roots: Generative AI Models

Today's Large Language Models (LLMs) are built on the Transformer architecture, introduced in 2017 in the paper aptly titled "Attention Is All You Need". The journey of generative AI began much earlier, however, and is rooted in advances in discriminative models used for tasks like classification and natural language processing. A pivotal step came in 2014, when Ian Goodfellow and his co-authors introduced Generative Adversarial Nets. In this setup, a generator tries to turn random noise into plausible outputs, while a discriminator learns to tell generated examples from real ones and is rewarded for its accuracy. To understand how these models work, one has to look at their mathematical foundations, which provide a mapping from real-world data into a multi-dimensional representation known as the latent space.

Deciphering the Math Behind AI Hallucinations

Modern generative models are massive (GPT-3, with about 175 billion parameters, is often cited as needing around 800GB for parameter storage), yet the range of text they can generate seems practically limitless. Even so, the parameter set looks small next to the roughly 45 TB of raw text from which GPT-3's training data was drawn. This disparity raises a pertinent question: how does a model condense insights from 45 TB of text into about 800GB, a size reduction of roughly 98%? The answer is that it does not store the text itself. Instead, it captures relationships between concept vectors in latent space encodings. By retaining only the parameters of the transformations from real data to latent space and back, we keep a roadmap for moving through this intermediate space rather than a copy of the data. Every trained model is therefore an approximation, the best one its iterative training process could reach. The latent space can be remarkably expressive, yet it remains confined by mathematical boundaries.
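The 98% figure follows directly from the two sizes quoted above. As a quick check, the arithmetic can be sketched in Python, taking the 800GB and 45 TB figures at face value and assuming decimal units (1 TB = 1000 GB). The function name `size_reduction` is only illustrative.

```python
# Sizes as quoted in the article; decimal units are an assumption.
PARAMETERS_GB = 800
TRAINING_TEXT_GB = 45 * 1000  # 45 TB expressed in GB


def size_reduction(stored_gb: float, source_gb: float) -> float:
    """Return the fraction of the source size that is not kept in the stored form."""
    return 1.0 - stored_gb / source_gb


stored_fraction = PARAMETERS_GB / TRAINING_TEXT_GB
reduction = size_reduction(PARAMETERS_GB, TRAINING_TEXT_GB)

print(f"stored fraction: {stored_fraction:.2%}")  # stored fraction: 1.78%
print(f"size reduction: {reduction:.2%}")         # size reduction: 98.22%
```

A reduction of this size is far beyond what lossless text compression usually achieves, which is consistent with the point that the parameters hold relationships learned from the text and not the text itself.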

The latent space inherits constraints from the model's architecture and design. Its dimensionality is fixed in advance, which limits how much complexity it can represent. The model's nonlinear operations, indispensable for capturing complex patterns, introduce approximations of their own, so complete reconstruction of the original data is generally out of reach. Activation functions such as the sigmoid or ReLU reshape the information passing through them, for example by squashing values into a bounded range or by zeroing out negative ones. The mapping from data to the latent space and back is carried out through matrix operations, biases, and activation functions.
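Both limits described here, the fixed dimensionality and the approximations introduced by activations, can be seen in a toy encoder. The sketch below is only an illustration under stated assumptions: the function `encode`, the hand-picked `WEIGHTS` and `BIAS`, and the choice of a 2-D input with a 1-D latent code are all hypothetical and not taken from any real model.

```python
def relu(value: float) -> float:
    """ReLU activation: negative values become zero."""
    return max(0.0, value)


def encode(point, weights, bias):
    """Map a 2-D point to a 1-D latent code: linear map, bias, then ReLU."""
    pre_activation = sum(w * x for w, x in zip(weights, point)) + bias
    return relu(pre_activation)


# Hypothetical parameters, chosen by hand for illustration.
WEIGHTS = (0.5, 0.5)
BIAS = -1.0

# Limit 1: the latent space has fewer dimensions than the input,
# so distinct inputs can land on the same code.
code_a = encode((1.0, 2.0), WEIGHTS, BIAS)
code_b = encode((0.0, 3.0), WEIGHTS, BIAS)

# Limit 2: the activation discards how negative a value was.
code_c = encode((0.0, 1.0), WEIGHTS, BIAS)
code_d = encode((0.0, 0.0), WEIGHTS, BIAS)

print(code_a, code_b)  # 0.5 0.5
print(code_c, code_d)  # 0.0 0.0
```

Because each pair of inputs shares a single latent code, no decoder can recover both members of the pair. Whatever it outputs for that code is wrong for at least one of them.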

Each neural network layer applies a linear mapping followed by a nonlinear activation. Stacking such layers allows models to emulate the multifaceted relationships present in real-world data. Training methods like backpropagation and gradient descent continually refine these mappings, aiming for a balance between retaining detail and avoiding overfitting. Yet these approximations and transformations, while capable of producing imaginative outputs, can diverge from factual precision. This gives rise to the phenomenon we label 'hallucinations': creative output that is occasionally detached from the real world.
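The way training refines a mapping without ever making it exact can be sketched with plain gradient descent on a deliberately tiny model. This is a toy under stated assumptions, not how a large model is trained: the data, the single-weight model `w * x + b`, the learning rate, and the step count are all chosen for illustration, and the hand-derived gradients stand in for backpropagation.

```python
# Toy data from y = x^2, which a straight line cannot represent exactly.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [x * x for x in XS]


def mean_squared_error(w: float, b: float) -> float:
    """Average squared gap between the line w * x + b and the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def train(steps: int = 2000, learning_rate: float = 0.05):
    """Fit w and b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(XS)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(XS, YS)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, XS)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
loss = mean_squared_error(w, b)
print(f"w={w:.3f}, b={b:.3f}, loss={loss:.3f}")  # w=3.000, b=-1.000, loss=1.000
```

Training converges to the best line available, yet the loss settles at 1.0 and not at zero. The model's outputs are the closest approximation its form allows, and they still miss every data point by one unit.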

Generative Models and Their Place in Digital Humanities

The ethos of Digital Humanities lies in combining technology with the rigorous methodologies traditionally employed in the study of human culture and history. Yet the tendency of generative AI models to produce 'hallucinations' is at odds with the exacting standards upheld in disciplines such as history, geography, and cartography. This tension underscores an imperative: Machine Learning methodologies need to evolve. To truly serve the Digital Humanities, we must prioritize absolute accuracy and precision, even if it means reining in the unbridled creativity characteristic of generative AI.

Ivan Sysoev

Exploring the landscape of machine learning concepts one gradient descent step at a time by demystifying the mathematics.
