
PageIndex vs Vector Database: Why Vectorless RAG Might Be the Smarter Choice in 2026

PageIndex hit 98.7% accuracy on FinanceBench while traditional vector RAG sits at 60-80%. We break down the architecture differences and benchmark numbers, and offer a practical decision framework for choosing the right retrieval approach.

March 15, 2026 · 20 min read

What if the entire foundation of your RAG pipeline — the chunking, the embeddings, the vector similarity search — is the reason your AI keeps hallucinating on complex documents?

If you've spent any time building retrieval-augmented generation systems, you know the pain. You chunk a 200-page SEC filing into neat little pieces, embed them into Pinecone or Qdrant, fire off a query, and get back... fragments that miss the actual answer by three sections. Traditional vector databases have carried RAG this far, but a new open-source project called PageIndex just posted 98.7% accuracy on FinanceBench — while vector-based RAG systems typically land somewhere between 60% and 80% on the same benchmark. That gap is hard to ignore.

In this article, we'll break down how PageIndex actually works, look at where vector databases still shine, and help you figure out which approach fits your use case. We'll cover the core architecture differences, real benchmark numbers, cost and complexity tradeoffs, and practical scenarios where each approach wins.

[Figure: Vector RAG vs. PageIndex. Vector RAG reduces a 10-K filing to disconnected chunks ("context lost"); PageIndex preserves the filing's hierarchy — Financials, Operations, Risk Factors, Income Statement, Balance Sheet, Revenue, Operating Expenses ("structure kept").]

How Vector Database RAG Actually Works (And Where It Breaks)

Before we talk about what PageIndex does differently, let's be honest about what vector database RAG does well — and where it falls apart.

The standard RAG pipeline goes like this: take your documents, split them into chunks (usually 500-1000 tokens), run each chunk through an embedding model to get a numerical vector, store those vectors in a database like Pinecone, Weaviate, Qdrant, or Milvus, and then at query time, embed the user's question and find the most "similar" chunks by cosine distance.
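The steps above can be sketched end to end in a few lines. Everything here is a toy stand-in — `embed` is a bag-of-words counter rather than a real embedding model, and retrieval is a linear scan rather than an ANN index — but the shape of the pipeline is the same:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline calls an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text: str, size: int = 50) -> list[str]:
    # Fixed-size chunking by word count.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# "Index time": chunk the document and embed each chunk.
document = "... long filing text ..."
index = [(c, embed(c)) for c in chunk(document)]

# "Query time": embed the question, rank chunks by similarity.
def retrieve(query: str, k: int = 3) -> list[str]:
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]
```

Swap in a real embedding model and a vector store and you have the standard pipeline — which is exactly why its failure modes are architectural rather than implementation bugs.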

This works surprisingly well for simple lookups. Need to find a paragraph that discusses "employee benefits policy"? Vector similarity nails it. The embedding captures semantic meaning, and the closest vectors usually contain relevant text.

But here's where things get ugly. Take a financial analyst asking: "What was the year-over-year change in operating margin for Q3 2025 compared to Q3 2024?" To answer that, your system needs data from two different sections of a 10-K filing, needs to understand the relationship between those sections, and needs to perform a calculation. A vector similarity search doesn't reason — it just finds text that looks similar to the question. And "looks similar" and "contains the answer" are two very different things.

Marco, a machine learning engineer at a fintech startup in Berlin, learned this the hard way. His team spent four months building a RAG pipeline with Qdrant for analyzing earnings reports. The retrieval accuracy on simple factual questions was around 78%. But when analysts asked comparative or multi-step questions — the kind that actually matter in financial analysis — accuracy dropped to roughly 40%. They tried recursive chunking, semantic chunking, hybrid search with BM25. Each tweak moved the needle by a few percentage points. Nothing came close to the reliability their compliance team needed.

[Figure: The vector RAG pipeline — Document → Chunking → Embedding → Vector DB → Similarity → LLM → Answer — with two failure points: chunking destroys structure (tables, hierarchy, cross-references lost), and semantic ≠ correct ("looks similar" misses answers).]

The Chunking Problem Nobody Wants to Talk About

The dirty secret of vector database RAG is that chunking destroys document structure. A 10-K filing has a carefully designed hierarchy: sections, subsections, tables, footnotes, cross-references. When you chop it into 512-token pieces, you lose all of that.

Stack Overflow's engineering blog put it bluntly: "Breaking up is hard to do." There's no single best chunking strategy. Fixed-size chunking is fast but context-blind. Semantic chunking is smarter but expensive. Recursive chunking tries to respect structure but still fragments tables and multi-page narratives.
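To see the failure concretely, here's a naive fixed-size splitter cutting through a small margin table. This is a minimal illustration, not any particular library's chunker:

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    # Naive fixed-size chunking by character count, blind to structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

table = (
    "Operating margin table\n"
    "Q3 2024 | 18.2%\n"
    "Q3 2025 | 21.7%\n"
)
chunks = fixed_size_chunks(table, 30)
# At size 30 the Q3 2024 row is cut in half: no single chunk contains the
# complete "Q3 2024 | 18.2%" line, so single-chunk retrieval can never
# return that figure together with its label.
```

Larger chunks just move the cut somewhere else; any fixed boundary will eventually land mid-row, mid-footnote, or mid-sentence.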

The result? Your vector database faithfully stores thousands of decontextualized fragments. It retrieves the ones that are semantically closest to your query. But "closest" doesn't mean "correct" — especially when the answer requires understanding how pieces of the document relate to each other.

How PageIndex Works: Trees Instead of Vectors

PageIndex takes a fundamentally different approach. Instead of breaking documents into chunks and embedding them, it builds a hierarchical tree index — essentially a machine-readable table of contents — and uses LLM reasoning to navigate that tree.

The process has two phases.

Phase 1 — Index Generation. PageIndex reads your document and constructs a tree structure that mirrors the document's natural organization. Headers become parent nodes. Subsections become children. Each node gets metadata: a title, page range, summary, and unique identifier. Think of it like a librarian building a card catalog, except the catalog preserves the full structure of every book.
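A minimal version of that index-generation phase might look like the following. The `Node` fields mirror the metadata described above (title, page range, summary, identifier), but the schema and the stack-based parsing are illustrative sketches, not PageIndex's actual implementation:

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    title: str
    node_id: str
    page_range: tuple[int, int] | None = None
    summary: str = ""
    children: list[Node] = field(default_factory=list)

def build_tree(root_title: str, sections: list[tuple[int, str]]) -> Node:
    # sections: (depth, title) pairs in document order, e.g. parsed headings.
    root = Node(root_title, "0000")
    stack = [(0, root)]
    for i, (depth, title) in enumerate(sections, start=1):
        # Pop back up until the node on top of the stack is this
        # section's parent (i.e. strictly shallower).
        while stack and stack[-1][0] >= depth:
            stack.pop()
        node = Node(title, f"{i:04d}")
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root

tree = build_tree("SEC 10-K", [
    (1, "Financial Statements"),
    (2, "Income Statement"),
    (3, "Revenue"),
    (3, "Operating Expenses"),
    (2, "Balance Sheet"),
    (1, "Risk Factors"),
])
```

The result is the machine-readable table of contents: `tree.children` holds the top-level sections, and each node keeps its place in the hierarchy rather than floating free as a chunk.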

Phase 2 — Reasoning-Based Retrieval. When a query comes in, instead of doing a nearest-neighbor search in vector space, PageIndex sends the tree index to an LLM and asks it to reason about which sections are relevant. The LLM traverses the tree top-down, making decisions at each level: "This question is about operating margins, so I should look under Financial Statements → Income Statement → Operating Expenses." It navigates the document the same way a human expert would.

No embeddings. No vectors. No chunking. Just structured reasoning over document architecture.
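The Phase 2 traversal can be mocked without any model at all. In the sketch below, `relevant` stands in for the LLM call — PageIndex would show the model each node's title and summary, whereas here a node's "summary" is approximated by pooling the title words of its subtree, purely to make the walk runnable:

```python
# Minimal tree: each node is (title, children).
TREE = ("SEC 10-K", [
    ("Financial Statements", [
        ("Income Statement", [
            ("Revenue", []),
            ("Operating Expenses", []),
        ]),
        ("Balance Sheet", []),
    ]),
    ("Risk Factors", []),
])

def subtree_words(node) -> set[str]:
    # All title words in this node's subtree: a crude proxy for the
    # node summary an LLM would actually read.
    title, children = node
    words = set(title.lower().split())
    for child in children:
        words |= subtree_words(child)
    return words

def relevant(question: str, node) -> bool:
    # Stand-in for the LLM relevance judgment: keyword overlap.
    return bool(set(question.lower().split()) & subtree_words(node))

def traverse(question: str, node) -> list[str]:
    # Top-down walk: descend only into branches judged relevant; the
    # surviving leaves are the sections handed to the answering LLM.
    title, children = node
    picked = [child for child in children if relevant(question, child)]
    if not picked:
        return [title]
    sections = []
    for child in picked:
        sections.extend(traverse(question, child))
    return sections
```

The keyword overlap is exactly where the real system earns its accuracy: an LLM reading titles and summaries can make the "operating margins live under the income statement" leap that no string match can.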

[Figure: The same SEC 10-K as 20 disconnected chunks versus a four-level tree index. Chunking leaves no relationships between fragments, breaks cross-references, splits table data across chunks, and loses the hierarchy entirely. The tree index preserves parent-child relationships (Financial Stmts → Income Stmt → Revenue/Op. Costs, Balance Sheet → Assets/Liabilities; MD&A → Liquidity; Risk Factors), maintains cross-references, keeps tables intact within nodes, and leaves the full document hierarchy navigable.]

Why This Matters: The FinanceBench Results

FinanceBench is the standard benchmark for evaluating how well AI systems answer questions about SEC filings. The questions range from simple factual lookups to multi-step calculations that require cross-referencing multiple sections.

Here's how the numbers stack up:

| System | FinanceBench Accuracy | Approach |
| --- | --- | --- |
| Mafin 2.5 (PageIndex) | 98.7% | Vectorless tree indexing |
| Traditional vector RAG | 60-80% | Chunking + embeddings |
| Perplexity | ~45% | General-purpose retrieval |
| GPT-4o (no RAG) | ~31% | Pure LLM knowledge |

That 98.7% isn't a cherry-picked number — Mafin 2.5 covered 100% of the benchmark questions. And the jump from Mafin 1.0 (38%) to Mafin 2.5 (98.7%) shows how much the tree-indexing approach has matured in a short time.

Want to test PageIndex on your own documents? The open-source repo includes cookbooks and a Colab notebook to get started in under 10 minutes.

Where Vector Databases Still Win

Let's not pretend PageIndex kills vector databases. It doesn't. There are real scenarios where vector search is the better tool.

Scale and Speed

If you're searching across millions of documents — not pages within a single document, but millions of separate documents — vector databases are built for that. Pinecone handles sub-50ms queries at scale. Qdrant, written in Rust, delivers exceptional throughput for on-premises deployments. Milvus can store billions of vectors on NVMe SSDs instead of RAM, cutting infrastructure costs by as much as 10x at massive scale.

PageIndex's LLM reasoning step is inherently slower per query. It's making API calls to GPT-4o (or whatever model you configure) for each retrieval. For a single complex document, that's fine. For a real-time search across a million product descriptions? Vector databases win by a landslide.

Unstructured, Short-Form Content

PageIndex's tree structure shines on long, hierarchical documents — financial filings, legal contracts, technical manuals, academic papers. But if your corpus is a collection of Slack messages, support tickets, or product reviews, there's no inherent hierarchy to exploit. Vector similarity search over short text snippets is still the most practical approach for that kind of data.

Mature Ecosystem and Tooling

Vector databases have years of ecosystem development behind them. Pinecone, Weaviate, Qdrant, Milvus, and Chroma all have robust client libraries, integrations with LangChain and LlamaIndex, managed cloud offerings, monitoring dashboards, and battle-tested production deployments. PageIndex is open-source and growing fast, but it's newer. If you need enterprise support contracts and SLA guarantees today, the vector database ecosystem is further along.

[Figure: Decision matrix. PageIndex better: long structured docs (10-K, legal), multi-step reasoning questions, regulated industry compliance, audit trail for answers. Vector DB better: millions of short documents, sub-50ms latency requirements, Slack / support ticket search.]

Real-World Decision Framework: Which One Should You Use?

Let's cut through the hype and get practical. Here's a framework for choosing.

Choose PageIndex when:

  • Your documents are long and structured (financial reports, legal filings, regulatory documents, technical manuals)
  • Accuracy on complex, multi-step questions is non-negotiable
  • You need an audit trail showing exactly how the system found its answer
  • Your corpus is measured in hundreds or thousands of documents, not millions
  • You're working in regulated industries where "close enough" retrieval isn't acceptable

Choose a vector database when:

  • You're searching across millions of short-form documents or records
  • Latency matters more than per-query accuracy (sub-50ms requirements)
  • Your content lacks clear hierarchical structure
  • You need real-time semantic search at massive scale
  • Your team already has vector database expertise and infrastructure

Consider a hybrid approach when:

  • You need both: fast initial retrieval across a large corpus (vector search) followed by deep reasoning within retrieved documents (PageIndex)
  • Your pipeline handles mixed content types — some structured, some not
  • You're building a system that routes queries to different retrieval strategies based on complexity

Elena, an AI architect at a compliance SaaS company in Amsterdam, landed on exactly this hybrid setup. Her team uses Weaviate to quickly surface the 5-10 most relevant documents from a library of 50,000 regulatory filings. Then PageIndex takes over, building a tree index of each retrieved document and reasoning through it to extract precise answers. The combination gives her team both the breadth of vector search and the depth of structured reasoning. Retrieval accuracy on their internal benchmark went from 71% (vector-only) to 94% (hybrid).
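A first cut at the routing logic in a setup like this can be a plain heuristic. The signal list below is illustrative, and a production router might instead ask a small LLM to classify each question:

```python
MULTI_STEP_SIGNALS = (
    "compare", "year-over-year", "change in", "versus", "difference between",
)

def looks_multi_step(question: str) -> bool:
    # Comparative, delta, and cross-period questions tend to need
    # reasoning across document sections.
    q = question.lower()
    return any(signal in q for signal in MULTI_STEP_SIGNALS)

def route(question: str) -> str:
    # Simple lookups go to cheap vector search; multi-step questions go
    # to tree-based reasoning over the documents vector search surfaced.
    return "pageindex" if looks_multi_step(question) else "vector"
```

Even a crude router like this keeps the expensive reasoning path reserved for the queries that actually need it.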

Cost and Complexity: What Nobody Mentions

There's a practical tradeoff most comparison articles skip: cost.

Vector databases have infrastructure costs that scale with your data. Pinecone's managed service starts affordable but climbs as you add vectors. Weaviate and Qdrant can run self-hosted, but you're paying for compute and memory — and vector search is memory-hungry. At enterprise scale with billions of vectors, you're looking at significant monthly bills.

PageIndex shifts the cost from infrastructure to inference. You're not paying for vector storage or embedding generation. But you are paying for LLM API calls on every retrieval. With GPT-4o, that's roughly $2.50 per million input tokens and $10 per million output tokens. For a corpus of a few thousand documents queried a few hundred times a day, this is often cheaper than running a managed vector database. For millions of queries per day, the LLM costs add up fast.
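As a back-of-envelope check on those numbers — the per-query token counts below are illustrative assumptions, not measurements:

```python
def llm_cost_per_query(input_tokens: int, output_tokens: int,
                       in_price: float = 2.50, out_price: float = 10.00) -> float:
    # Prices in USD per million tokens (the GPT-4o rates quoted above).
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Assume each tree-traversal retrieval sends ~8k tokens of index plus the
# question and gets ~500 tokens back (illustrative numbers).
per_query = llm_cost_per_query(8_000, 500)   # $0.025 per retrieval
monthly = per_query * 300 * 30               # 300 queries/day ≈ $225/month
```

At a few hundred queries a day that undercuts most managed vector database plans; at millions of queries a day the same arithmetic flips decisively the other way.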

The sweet spot? PageIndex tends to be more cost-effective for low-to-medium query volume on high-complexity documents. Vector databases tend to be more cost-effective for high-volume queries on large, simple corpora.

| Factor | PageIndex | Vector Database |
| --- | --- | --- |
| Infrastructure | Minimal (just LLM API) | Significant (compute + memory + storage) |
| Per-query cost | Higher (LLM inference) | Lower (vector similarity is cheap) |
| Setup complexity | Simple (pip install + API key) | Moderate to high (infra, embeddings, tuning) |
| Scaling cost curve | Linear with queries | Linear with data volume |
| Best economics | Low volume, high accuracy needs | High volume, large corpus |

Getting Started with PageIndex

If you want to try PageIndex, the setup is minimal. Install from pip, point it at your documents, and let it build the tree index. The GitHub repo has a cookbook with a simple RAG notebook you can run in Google Colab right now.

For production use, PageIndex offers MCP (Model Context Protocol) support, a cloud chat platform at chat.pageindex.ai, and an API in beta. Enterprise on-premises deployment is available if your data can't leave your infrastructure.

Start with one document. Take your most problematic PDF — the one your current RAG pipeline keeps getting wrong — and run it through PageIndex. Compare the answers side by side. That single test will tell you more than any benchmark table.

The Bottom Line

PageIndex and vector databases solve retrieval differently, and the right choice depends on what you're actually building.

Vector databases remain the best option for high-volume semantic search across large, unstructured corpora where sub-50ms latency matters. They've earned their place in the stack.

But if your RAG pipeline struggles with long, structured documents — if accuracy on complex questions matters more than raw query speed — PageIndex's vectorless approach delivers results that vector similarity search simply can't match. That 98.7% on FinanceBench isn't theoretical. It's a benchmark number on real financial filings, answering real analyst questions.

The smartest teams in 2026 aren't picking sides. They're using vector search for breadth and PageIndex for depth, routing queries to the right tool based on complexity. That hybrid architecture is where the field is heading.

Ready to test it yourself? Clone the PageIndex repo, run the Colab cookbook on your own documents, and see the difference firsthand. No credit card, no signup — just open source.

Pawel Owerczuk

AI Agent & RAG Developer

AI Agent & RAG Developer with 10+ years of software engineering experience. Specialized in intelligent AI solutions for enterprises in the DACH & Nordic region.