Curb Your Hallucination: Open Source Vector Search for AI
Here’s why vector search is increasingly critical to AI success, and how open source can fit into that equation.
- By Ben Bromhead
- October 18, 2024
AI hallucination is a mortal threat to AI-powered applications, and vector search capabilities are an increasingly critical asset as enterprises race to meet fast-moving AI objectives. OK, maybe not as mortal a threat as mass misinformation, rising energy and water consumption, deepfakes, and, if you buy into it…our impending doom at the hands of AGI. But setting those risks aside, hallucinations are the fastest way to undermine the significant, high-pressure investments companies are making to develop and deploy LLMs. Instead of improving customer experiences and supporting reliable decision-making, hallucinating AI systems can produce false, embarrassing, or even dangerous responses to user queries.
A retrieval-augmented generation (RAG) architecture leveraging vector-search-capable databases can directly address AI hallucination while enhancing the efficiency and performance of AI applications. Even better: many robust and established open source technologies now include effective vector search capabilities. Even, even better: you are probably already using one or more of those open source solutions.
The Vector Search Opportunity
Enterprises enlist data scientists and engineers to deploy, manage, and iterate on the data-layer infrastructure that supports AI models. In most cases, enterprises don’t train their own models (it can be prohibitively expensive) and instead rely on third-party foundation models that aren’t specific to the tasks at hand. The teams that do train LLMs use massive stores of data that aren’t limited to their own company, their AI product, or any specialist domain. Those data sets also represent a snapshot in time, with no awareness of the current scenario and context an AI application is intended to serve. As a result, LLMs can all too easily hallucinate, providing inaccurate information whenever they lack sufficient contextual understanding of the queries posed to them.
Databases with vector search capabilities store embedding vectors: arrays of numbers that act as coordinates in a high-dimensional space. Each piece of training data (whether text, an image, audio, or something else) gets a specific location in that space, and items that are similar in certain respects receive coordinates that place them closer together along those dimensions. Comparing these locations reveals relationships that help an LLM achieve contextual understanding and, therefore, greater accuracy. Vector search can thus empower applications to deliver more precise generative AI answers, search results, and other capabilities, because those responses are drawn from a smaller set of more contextually appropriate data sources, close by in the vector neighborhood.
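To make the geometry concrete, here’s a minimal sketch in Python. The four-dimensional toy vectors are illustrative placeholders (a real embedding model produces hundreds or thousands of dimensions); the point is how cosine similarity ranks items by proximity in embedding space:

```python
import numpy as np

# Toy 4-dimensional embeddings; a real model would produce
# hundreds or thousands of dimensions.
embeddings = {
    "puppy":       np.array([0.9, 0.8, 0.1, 0.0]),
    "dog":         np.array([1.0, 0.7, 0.2, 0.1]),
    "kitten":      np.array([0.8, 0.1, 0.9, 0.0]),
    "spreadsheet": np.array([0.0, 0.1, 0.1, 1.0]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """1.0 means identical direction; near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = embeddings["dog"]
ranked = sorted(
    ((word, cosine_similarity(query, vec)) for word, vec in embeddings.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
# "puppy" lands closest to "dog"; "spreadsheet" lands farthest away.
```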
Vector Search vs. Traditional Search
Enterprise LLMs implemented without a RAG architecture backed by vector search must fall back on keyword searches using traditional search engine technology, which retrieves data with little semantic comprehension of context. Traditional methods are accurate at finding specific items (an exact-match keyword, for example) but routinely miss related concepts expressed in different words. That places a tremendous drag on LLM performance, either by omitting relevant context or by supplying misleading context with minimal semantic relevance. It creates an environment ripe for hallucinations and poor results, since the LLM has no systematic assurance that keyword search has located information that’s even in the same ballpark as the query’s contextual meaning.
Vector search, especially as part of a RAG approach built on vector data stores, offers a stark alternative. Instead of matching keywords, vector search resolves queries by comparing numerical embeddings, so each search examines a smaller set of more contextually relevant data. The results: better performance from more efficient use of massive data sets, and a greatly reduced risk of AI hallucination. The more accurate answers that vector-search-backed AI applications provide in turn enhance the outcomes and value those solutions deliver.
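Here’s a minimal sketch of the retrieval step in a RAG pipeline. The `embed()` function is a stand-in for a real embedding model (it returns pseudo-random unit vectors so the example runs end to end), and the in-memory document list stands in for a vector-capable database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model: a pseudo-random unit vector.
    Real semantic retrieval requires a real model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(128)
    return vec / np.linalg.norm(vec)

# A production system would store these vectors in a vector-capable database.
documents = [
    "Our refund policy allows returns within 30 days.",
    "Support hours are 9am-5pm Eastern, Monday through Friday.",
    "Enterprise plans include a dedicated account manager.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents closest to the query in embedding space."""
    scores = doc_vectors @ embed(query)   # dot product of unit vectors = cosine similarity
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "When can I get my money back?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# The grounded prompt is what gets sent to the LLM, instead of the bare question.
```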
Combining vector and traditional search methods into hybrid queries gives you the best of both worlds: vector search covers semantically related context, while traditional search provides the specificity required for critical constraints (e.g., restricting results to a date, time, geographical area, or unique identifier), as the sketch below shows.
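As an illustration, a hybrid query in PostgreSQL with the pgvector extension might combine an exact date filter with nearest-neighbor ordering. The table, columns, connection string, and query vector here are hypothetical placeholders:

```python
import psycopg2  # assumes PostgreSQL with the pgvector extension enabled

# In practice this comes from an embedding model; its dimensions must
# match the table's vector column.
query_vector = [0.12, -0.03, 0.54, 0.27]
vector_literal = "[" + ",".join(str(x) for x in query_vector) + "]"

conn = psycopg2.connect("dbname=docs")  # hypothetical connection string
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT id, body
        FROM support_articles               -- hypothetical table
        WHERE published_at >= %s            -- exact, traditional filter
        ORDER BY embedding <=> %s::vector   -- pgvector cosine-distance ranking
        LIMIT 5
        """,
        ("2024-01-01", vector_literal),
    )
    for doc_id, body in cur.fetchall():
        print(doc_id, body[:60])
```

The WHERE clause narrows the candidate set with an exact predicate; the vector ordering then ranks what remains by semantic similarity.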
Leveraging Open Source Vector Search Options
Several open source technologies offer an easy on-ramp to vector search capabilities, and a path free from proprietary expenses, inflexibility, and vendor lock-in risks. Apache Cassandra 5.0, PostgreSQL (with the pgvector extension), and OpenSearch, for example, all now offer enterprise-ready vector search capabilities on top of data infrastructure well suited to AI projects at scale. The broad talent base of experts and managed providers working with these technologies (and the projects’ large, supportive communities) enables enterprises to tap into the full advantages of open source with immediacy and assurance.
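As one illustration, here’s a sketch of Cassandra 5.0’s vector search from the DataStax Python driver. The keyspace, table, and the literal query vector are hypothetical placeholders:

```python
from cassandra.cluster import Cluster  # DataStax Python driver for Apache Cassandra

# Hypothetical local Cassandra 5.0 node and keyspace.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("demo")

# A vector column plus a storage-attached index enables ANN queries in CQL.
session.execute("""
    CREATE TABLE IF NOT EXISTS articles (
        id int PRIMARY KEY,
        body text,
        embedding vector<float, 4>
    )
""")
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS articles_ann
    ON articles (embedding) USING 'StorageAttachedIndex'
""")

# Approximate-nearest-neighbor query: the rows whose embeddings sit
# closest to the query vector. (The literal stands in for model output.)
rows = session.execute(
    "SELECT id, body FROM articles "
    "ORDER BY embedding ANN OF [0.12, -0.03, 0.54, 0.27] LIMIT 5"
)
for row in rows:
    print(row.id, row.body)
```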
Ground AI in Reality
Some of the same open source technologies already popular with enterprises are ready to offer even more benefits in the AI era. Teams looking for AI infrastructure with proven scalability, availability, security, efficient data management, and simplified operations should explore their open source options and the advantages of vector search. Using open source vector search to harness data sets with greater reliability and performance is a clear path to more successful AI projects, more accurate AI-powered experiences, and, ultimately, more enterprise wins in the marketplace.
About the Author
Ben Bromhead is the chief technology officer at NetApp Instaclustr. Prior to co-founding Instaclustr in 2012 (acquired by NetApp in 2022), Ben was an independent consultant developing NoSQL solutions for enterprises.