AI Meets Databases: Vector Search & Retrieval-Augmented Generation (RAG)


Artificial Intelligence thrives on context. Large Language Models (LLMs) like GPT-4 can generate text, but without relevant data from your company, ERP system, or domain knowledge base, their answers may be vague or even wrong.

This is where vector search and Retrieval-Augmented Generation (RAG) come in. By storing embeddings of your documents in a database and retrieving the most relevant pieces at query time, you can give LLMs the context they need to respond with accurate, domain-specific insights.

Think of it as giving AI a memory layer — one that’s flexible, searchable, and tightly coupled with your enterprise systems.


Why Vector Search Matters
#

Traditional search uses keywords. If a user searches for “employee onboarding process,” keyword search looks for documents containing those exact words.

But humans don’t think in keywords — we think in concepts. Vector search solves this by using embeddings (dense numerical representations of text). Instead of looking for keywords, it looks for semantic similarity.

  • "How do we train new hires?" → should match documents about onboarding.
  • "What’s the induction procedure?" → should also match onboarding docs, even without shared keywords.

quadrantChart
    title "Keyword Search vs. Vector Search"
    x-axis "Low Conceptual Match" --> "High Conceptual Match"
    y-axis "Low Keyword Match" --> "High Keyword Match"
    quadrant-1 "Both Methods Work"
    quadrant-2 "Keyword Search Domain"
    quadrant-3 "Irrelevant"
    quadrant-4 "Vector Search Domain"
    "Point A": [0.1, 0.9]
    "Point B": [0.9, 0.1]
    "Point C": [0.8, 0.8]

How to interpret this visual:
#

  • Point C (Top Right): a query like “onboarding process” for the document “Employee onboarding process and training”. It has high keyword match AND high conceptual match, so both search methods find it easily.

  • Point A (Top Left): a query that shares words with a document without sharing its meaning, giving a high keyword match but a low conceptual match. Keyword search surfaces it anyway; vector search correctly ranks it low.

  • Point B (Bottom Right): a query like “What’s the induction procedure?” for the same onboarding document. It has a very high conceptual match but almost no keyword overlap. This is where vector search excels.

The bottom-left quadrant would be for irrelevant documents that match on neither dimension.

This is the foundation of semantic search and the engine that powers RAG.
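
To see the idea in numbers, here is a minimal sketch (assuming the sentence-transformers package; util.cos_sim computes cosine similarity, and the model is the same one used in the demos below):

from sentence_transformers import SentenceTransformer, util

# Embed one query and two documents, then compare cosine similarities.
model = SentenceTransformer("all-MiniLM-L6-v2")
query = model.encode("What's the induction procedure?")
onboarding = model.encode("Employee onboarding process and training")
finance = model.encode("Quarterly financial report for 2024 Q2")

print(util.cos_sim(query, onboarding))  # high: same concept, no shared keywords
print(util.cos_sim(query, finance))     # low: unrelated concept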


RAG in Action
#

flowchart TD
    A[User Query] --> B[Generate Query Embedding]
    B --> C[Vector Similarity Search]
    C --> D[Retrieve Top-K Relevant Documents]

    subgraph KnowledgeBase [Knowledge Base]
        direction LR
        E[Document 1] --embed--> F[Vector 1]
        G[Document 2] --embed--> H[Vector 2]
        I[Document N] --embed--> J[Vector N]
    end

    KnowledgeBase -.-> C
    D --> K[Augment LLM Prompt with Context]
    K --> L[LLM Generates Context-Aware Answer]
    L --> M[Final Response to User]

Retrieval-Augmented Generation works in two steps:

  1. Retrieve: The system searches for the most relevant documents (via vector search).
  2. Augment + Generate: These documents are fed into the LLM prompt, enriching it with context before generating an answer.
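
In code, the augment step is just prompt construction. A minimal sketch (retrieval is assumed to have already returned a list of document strings; the resulting prompt goes to whatever LLM client you use):

def build_augmented_prompt(query: str, retrieved_docs: list[str]) -> str:
    # Join the retrieved context into the prompt so the LLM answers
    # from your data instead of guessing.
    context = "\n\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_augmented_prompt(
    "How do we train new hires?",
    ["Employee onboarding process and training"],  # top hit from vector search
)
# prompt is now sent to the LLM of your choice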

This means your ERP AI assistant can answer:

  • “What was the revenue for Q2 2024?”
  • “Who are our top 5 suppliers by order volume?”

…by pulling live data from your ERP database and documents before letting the LLM respond.
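
For structured questions like these, retrieval can be a plain SQL query whose results are placed into the prompt context. A sketch (the orders table and its columns are hypothetical; adapt them to your ERP schema):

import psycopg2

# Hypothetical ERP schema: an orders table with supplier_name and volume.
conn = psycopg2.connect("dbname=erp_db user=postgres password=secret")
cur = conn.cursor()
cur.execute("""
    SELECT supplier_name, SUM(volume) AS total_volume
    FROM orders
    GROUP BY supplier_name
    ORDER BY total_volume DESC
    LIMIT 5
""")
context = "\n".join(f"{name}: {vol}" for name, vol in cur.fetchall())
# context now holds the top 5 suppliers and goes into the LLM prompt,
# exactly like the document snippets from vector search.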


Tech Stack Choices
#

There are two dominant paths to implementing vector search:

  1. FAISS (Facebook AI Similarity Search)

    • High-performance vector search library in C++.
    • Integrates well with Python apps.
    • Great for prototypes and local apps.
  2. PostgreSQL with pgvector extension

    • Native database support for vector embeddings.
    • Ideal for enterprise settings where data must remain in your ERP or corporate DB.
    • Brings vector search inside SQL.

The two options serve different purposes: FAISS for speed and experimentation, pgvector for production and ERP integration.


Python + FAISS Demo
#

Let’s implement a mini semantic search engine with FAISS.

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Step 1: Create embeddings
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "Employee onboarding process and training",
    "Quarterly financial report for 2024 Q2",
    "Supplier performance and delivery times",
    "Leave policy and attendance rules"
]
doc_embeddings = model.encode(docs)

# Step 2: Build FAISS index (IndexFlatL2 = exact L2-distance search)
dimension = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(np.array(doc_embeddings, dtype=np.float32))  # FAISS expects float32

# Step 3: Query the index
query = "How do we train new hires?"
query_vector = model.encode([query])

D, I = index.search(query_vector, k=2)  # top 2 results
print("Query:", query)
for idx in I[0]:
    print("Match:", docs[idx])


Output:
Query: How do we train new hires?
Match: Employee onboarding process and training
Match: Leave policy and attendance rules

The system understood that “train new hires” relates to “onboarding process”. That’s semantic search in action. (The second match is simply the next-nearest neighbor; with k=2 the index always returns two results.)
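
One detail worth knowing: IndexFlatL2 ranks by Euclidean distance, while sentence embeddings are often compared by cosine similarity. In FAISS you get cosine by normalizing the vectors and using an inner-product index. A sketch, continuing from the demo above:

# Cosine-similarity variant: normalize, then inner product == cosine.
emb = np.array(doc_embeddings, dtype=np.float32)
faiss.normalize_L2(emb)                    # in-place L2 normalization
index_ip = faiss.IndexFlatIP(emb.shape[1])
index_ip.add(emb)

q = np.array(model.encode([query]), dtype=np.float32)
faiss.normalize_L2(q)
scores, ids = index_ip.search(q, 2)        # scores are now cosine similarities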


PostgreSQL + pgvector Demo
#

For enterprise use cases, embedding vectors inside PostgreSQL makes sense.

Enable the extension (the pgvector package must already be installed on your PostgreSQL server):

CREATE EXTENSION IF NOT EXISTS vector;

Create a table with vectors:

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding VECTOR(384)  -- dimension matches all-MiniLM-L6-v2
);

Insert embeddings (via Python):
import psycopg2
from pgvector.psycopg2 import register_vector
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
conn = psycopg2.connect("dbname=ai_db user=postgres password=secret")
register_vector(conn)  # lets psycopg2 send numpy arrays as pgvector values
cur = conn.cursor()

docs = ["Onboarding guide", "Financial report Q2", "Supplier contracts"]
for doc in docs:
    emb = model.encode(doc)  # numpy array, 384 dimensions
    cur.execute("INSERT INTO documents (content, embedding) VALUES (%s, %s)", (doc, emb))
conn.commit()

Search with similarity:
SELECT id, content
FROM documents
ORDER BY embedding <-> '[0.123, 0.456, ...]'::vector
LIMIT 3;

The <-> operator ranks results by Euclidean (L2) distance; no keywords required.
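
Continuing the Python snippet from the insert step, the same search can be run with a freshly embedded query; the register_vector() call lets psycopg2 pass the numpy array directly:

# Embed the user query and let PostgreSQL rank documents by distance.
query_emb = model.encode("How do we train new hires?")
cur.execute(
    "SELECT content FROM documents ORDER BY embedding <-> %s LIMIT 3",
    (query_emb,),
)
for (content,) in cur.fetchall():
    print(content)

As the corpus grows, recent pgvector versions can also build approximate indexes (for example, CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops)) to keep searches fast.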


ERP Integration Angle
#

Now imagine connecting this directly with your ERP system:

  • HR module: semantic search over policies, training manuals, onboarding docs.
  • Finance module: query quarterly reports or invoices semantically.
  • Supply chain module: retrieve supplier performance reports or contracts without exact keywords.

By combining RAG + ERP data, enterprises can finally have AI copilots that understand their business language.

---
config:
  theme: forest
---
flowchart LR
    A[Enterprise Data Sources] --> B[ERP System<br>SAP, NetSuite, etc.]
    B -- Data Ingestion &<br>Embedding Generation --> C[Vector Database<br>with pgvector]
    D[User Query<br>e.g., HR, Finance, SCM] --> E[AI Copilot Interface]
    E -- Semantic Search --> C
    C -- Returns Relevant<br>Context --> E
    E -- Sends Augmented<br>Prompt --> F[Large Language Model<br>e.g., GPT-4, Llama 3]
    F -- Sends Informed<br>Response --> E
    E --> G[User]

Conclusion
#

AI without compute is weak, and AI without context is blind.

Vector search and Retrieval-Augmented Generation are the technologies bridging this gap — making AI systems domain-aware, reliable, and enterprise-ready.

Whether you start with FAISS for a prototype or PostgreSQL with pgvector for ERP integration, learning this stack sets you apart in a world where most people still think search = keywords.

This is not just about building apps; it’s about building the future of enterprise AI.