When Google introduced a similar chatbot a few weeks later, it began spouting false claims about the James Webb Space Telescope. The next day, Microsoft’s Bing chatbot offered up all sorts of false information about the Gap, Mexican nightlife, and the singer Billie Eilish. In March, ChatGPT cited six nonexistent court cases in a 10-page legal brief that a lawyer submitted to a federal judge in Manhattan.
A new startup called Vectara, led by former Google employees, is now working to determine how frequently chatbots provide inaccurate information. The company’s research suggests that even in situations designed to prevent it, chatbots invent false information at least 3% of the time, and as often as 27%.
This chatbot behavior, which experts call “hallucination,” can be a serious problem for anyone using the technology with court documents, medical information, or sensitive business data. Because chatbots can respond to the same request in countless ways, it is nearly impossible to determine definitively how often they “hallucinate.” According to Simon Hughes of Vectara, you would have to look at all of the world’s information to do so.
Research from Vectara revealed that these hallucination rates vary widely among leading AI companies. OpenAI’s technologies had the lowest rate, at around 3%, while systems from Meta, which owns Facebook and Instagram, averaged around 5%. A Google system, Palm chat, had the highest rate at 27%. Vectara hopes that its publicly shared methods will drive industry-wide efforts to decrease hallucination rates.
Nonetheless, reducing the problem is challenging. These chatbots are driven by a form of AI called large language models (LLMs), which learn by analyzing enormous amounts of text and generate responses by predicting, word by word, what is most probable. Because they rely on probabilities rather than a store of verified facts, they can serve users information that is inaccurate or entirely made up.
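To make the probability point concrete, here is a minimal, illustrative sketch of next-word sampling. The prompt, candidate words, and probabilities are invented for the example and do not come from any real model.

```python
import random

# Toy illustration: a language model scores candidate next words and samples
# from that probability distribution. The words and probabilities below are
# invented for this example; no real model uses a four-word vocabulary.
next_word_probs = {
    "1990": 0.45,  # the factually correct continuation
    "1985": 0.30,  # plausible but wrong
    "2003": 0.15,  # plausible but wrong
    "Mars": 0.10,  # unlikely, yet still assigned some probability
}

prompt = "The Hubble Space Telescope was launched in"

# Sample one continuation in proportion to the model's probabilities.
words = list(next_word_probs)
weights = list(next_word_probs.values())
completion = random.choices(words, weights=weights, k=1)[0]

print(prompt, completion)
# A bit more than half the time, the sampled completion is simply wrong:
# the sampler has no notion of truth, only of which words are likely.
```

Real systems choose among tens of thousands of tokens at every step, but the failure mode is the same: a fluent, confident-sounding continuation can be selected even when it is false.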
In light of these findings, the team at Vectara warns people to be wary of information that comes from chatbots, including the kind of technology that companies such as Vectara itself sell for business use.
Despite efforts to minimize the problem, researchers caution that chatbot hallucination is a complex issue. Even the language models used to check the accuracy of summaries can make mistakes, because they, too, operate on probabilities.
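As an illustration of the kind of automated check described above, the sketch below scores whether a source passage supports a summary sentence using an off-the-shelf natural-language-inference model. The model name is an assumption (any publicly available NLI cross-encoder could stand in), and this is not presented as Vectara’s actual method.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint: a general-purpose NLI cross-encoder from the Hugging Face hub.
MODEL_NAME = "cross-encoder/nli-deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def support_score(source: str, summary_sentence: str) -> float:
    """Return the model's probability that the source entails the summary sentence."""
    inputs = tokenizer(source, summary_sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    # Label order depends on the checkpoint, so look up the "entailment" index
    # instead of hard-coding it.
    entail_idx = next(int(i) for i, name in model.config.id2label.items()
                      if "entail" in name.lower())
    return probs[entail_idx].item()

source = "The James Webb Space Telescope was launched in December 2021."
print(support_score(source, "The telescope was launched in 2021."))  # high score expected
print(support_score(source, "The telescope was launched in 2007."))  # low score expected
```

Because the checker itself only produces probabilities, a summary sentence can receive a confident score and still be wrong, which is exactly the caveat the researchers raise.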
In conclusion, Vectara’s research suggests that while chatbots offer many benefits, the companies building them still need to find ways to reduce hallucinations and limit the spread of false information.