Avoiding the Risks of LLM Hallucinations in SQL Generation

AI is changing how we work, and data analysis is one area that’s getting a boost. Companies are looking for ways to get more out of their information, and AI tools that analyze data could be the answer. But there’s a catch: even cutting-edge, fine-tuned LLMs still make mistakes that lead to wrong numbers and confusing results. Worse, those mistakes are often not obvious, so businesses can end up making decisions based on incorrect results.

At Unsupervised, we use LLMs to talk to people, but we don’t use LLMs to directly access data. Instead, we use a different form of AI that can natively traverse and understand data. That allows us to get answers quickly and more accurately.

To show the difference, let’s look at two different AI analysis tools, one from Databricks and one from Unsupervised, as they work on the same problem: using a telecom company’s records, find out how many customers receive special price cuts because of state or federal subsidies.

We’ve made two videos showing the two tools in action.

Video 1: Databricks IQ

The first video shows the Databricks tool trying to write a SQL query to count these customers. It gets the tables mixed up and ends up counting something completely different. A small, early mistake in the table name leads the AI to fix the wrong problem in its code and lose track of the question it was trying to answer. The AI presents its answer as correct, but it isn’t.

 

Video 2: Unsupervised

Our AI Analyst breaks the user request into multiple jobs and assigns them to appropriate AI agents. These include data querying agents that don’t use an LLM; instead, they use our linguistic data layer to directly access and understand the data fields across multiple tables in the schema. The AI Analyst doesn’t just spit out code; it understands what’s being asked at a deeper level. It double-checks its work and then gives us the right number, all by itself, in just over a minute and a half.

 

The risks of using LLMs to access and analyze data

The key difference between these two approaches is whether you’re using an LLM to manage the interface with the data. Databricks has built an impressive copilot that can help an analyst write SQL queries faster. But because the LLM manages the interface to the data without understanding the underlying data, it is fundamentally error-prone and relies on humans to correct its mistakes. The query here is very simple, relying on only a single table, yet the problems with this approach are already evident.
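To make this kind of invisible mistake concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the `customers` and `invoices` tables, their columns, the rows, and both queries are hypothetical and are not taken from the videos or from either product. It shows that a query against the wrong table can run without any error and still answer a different question, and that a simple check only catches it when a human has already said which tables the question should touch.

```python
import re
import sqlite3

# Hypothetical telecom schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, subsidy_type TEXT);
    CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, discount REAL);
    INSERT INTO customers VALUES (1, 'state'), (2, NULL), (3, 'federal');
    INSERT INTO invoices VALUES (10, 1, 5.0), (11, 1, 5.0), (12, 2, 2.0), (13, 3, 5.0);
    """
)

# The question: how many customers receive a state or federal subsidy?
intended_sql = "SELECT COUNT(*) FROM customers WHERE subsidy_type IN ('state', 'federal')"

# A plausible-looking generated query that picked the wrong table.
# It counts discounted invoices, not subsidized customers.
generated_sql = "SELECT COUNT(*) FROM invoices WHERE discount > 0"


def run_count(sql):
    """Run a single-value COUNT query and return the number."""
    return conn.execute(sql).fetchone()[0]


def referenced_tables(sql):
    """Naively extract table names after FROM/JOIN (a sketch, not a SQL parser)."""
    pattern = r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)"
    return {name.lower() for name in re.findall(pattern, sql, flags=re.IGNORECASE)}


def uses_only_expected_tables(sql, expected):
    """True if every table the query touches is in the reviewer-supplied set."""
    return referenced_tables(sql) <= set(expected)


intended_count = run_count(intended_sql)    # subsidized customers
generated_count = run_count(generated_sql)  # discounted invoices, a different quantity

# Both queries execute cleanly, so a successful run says nothing about correctness.
print(intended_count, generated_count)

# The check flags the generated query only because a person supplied {"customers"}.
print(uses_only_expected_tables(generated_sql, {"customers"}))
```

With this sample data the two counts differ, and nothing in the database signals which one answers the question. The table check is deliberately naive, and it only works because someone who knows the data provided the expected table set, which is exactly the human correction step described above.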

Why does this matter? Well, in 2024 we know that AI tools are going to help us do our jobs better. But we also need to be careful to pick the right tools that don’t make these invisible mistakes. We hope our comparison helps you see the importance of choosing the right AI when it comes to data.

Want to give Unsupervised a try with your data? Reach out to our team. And keep an eye out for new features coming soon, including the Data Analytics Benchmark, which will let you see public metrics and detailed videos for these types of tests on AI Analysts. Sign up for our newsletter to get notified about these launches.
