Features¶
Core Capabilities¶
Vectorless Retrieval¶
The defining feature. Instead of embedding documents into vector spaces, the system builds a hierarchical tree index and uses LLM reasoning to find relevant sections. This means:
- No embedding model to choose or manage
- No vector database to deploy or scale
- No chunk size tuning -- the document's natural structure is used
- Full explainability -- you can read the LLM's reasoning for every retrieval decision
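The descent over the tree can be sketched in a few lines. This is an illustrative sketch, not the project's actual code: the `Node` type and the `ask_llm` callable (an LLM wrapper that returns the ids of relevant children) are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One section of the document's natural structure (heading + content)."""
    id: str
    title: str
    summary: str
    children: list["Node"] = field(default_factory=list)

def tree_search(node: Node, question: str, ask_llm) -> list[Node]:
    """Descend the tree, letting the LLM pick relevant children at each level.

    `ask_llm(question, candidates)` receives (id, title, summary) tuples and
    returns the ids of relevant children; leaves that survive the descent
    become the retrieval result.
    """
    if not node.children:
        return [node]
    candidates = [(c.id, c.title, c.summary) for c in node.children]
    chosen = set(ask_llm(question, candidates))
    selected = []
    for child in node.children:
        if child.id in chosen:
            selected.extend(tree_search(child, question, ask_llm))
    return selected
```

Because the LLM sees only titles and summaries at each level, cost grows with tree depth rather than document length, and each selection step yields a readable justification.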
Multi-Document Support¶
Upload multiple documents to a workspace and ask questions that span them:

- Intelligent routing -- an LLM examines document summaries and selects 1-3 most relevant documents
- Per-document RAG -- each selected document goes through the full tree search pipeline
- Answer merging -- multiple per-document answers are synthesized into a single coherent response with cross-document citations
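The route-then-merge pipeline can be outlined as follows; a minimal sketch, assuming hypothetical `route_llm`, `rag_answer`, and `merge_llm` callables standing in for the actual routing, tree-search, and synthesis steps:

```python
def multi_doc_answer(question, documents, route_llm, rag_answer, merge_llm):
    """Route to 1-3 documents, run per-document RAG, then merge the answers."""
    # Routing: the LLM sees only (doc_id, summary) pairs and picks the
    # most relevant documents (capped at 3).
    summaries = [(d["id"], d["summary"]) for d in documents]
    selected_ids = route_llm(question, summaries)[:3]
    # Per-document RAG: each selected document runs the full tree search.
    partials = {doc_id: rag_answer(doc_id, question) for doc_id in selected_ids}
    # Merging: a single-document answer passes through unchanged; multiple
    # answers are synthesized into one response with per-document citations.
    if len(partials) == 1:
        return next(iter(partials.values()))
    return merge_llm(question, partials)
```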
Multimodal Processing¶
Documents aren't just text. The system extracts and processes images:
- PDF image extraction -- pages containing images are rendered and extracted
- Image-aware summaries -- tree node summaries can incorporate visual content
- Multimodal answer generation -- the LLM receives both text context and images, enabling answers like "As shown in the diagram on Page 12..."
- Configurable limits: max images per section, max images in context
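One common way to hand the LLM both text and images is the OpenAI-style content-parts message shape, sketched below. This is an assumption about the wire format, not the project's confirmed implementation (Anthropic's image blocks differ); the function name and `max_images` parameter mirror the configurable limit above but are illustrative.

```python
import base64

def build_multimodal_message(question: str, context: str, images: list[bytes],
                             max_images: int = 4) -> dict:
    """Assemble one user message carrying text context plus section images.

    Images are base64-encoded as data URLs; `max_images` enforces the
    "max images in context" limit.
    """
    parts = [{"type": "text", "text": f"{context}\n\nQuestion: {question}"}]
    for img in images[:max_images]:
        b64 = base64.b64encode(img).decode("ascii")
        parts.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"},
        })
    return {"role": "user", "content": parts}
```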
Streaming Responses¶
Chat responses stream token-by-token via Server-Sent Events (SSE):
- Real-time feedback as the answer is generated
- Works through the Nginx reverse proxy with buffering disabled
- Supported in both the React UI and the OpenAI-compatible API endpoint
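The proxy side of this comes down to a handful of Nginx directives. A minimal sketch; the location path and upstream name here are assumptions, not taken from the project's actual config:

```nginx
location /api/ {
    proxy_pass http://backend:8100;   # FastAPI backend
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;              # flush each SSE token immediately
    proxy_cache off;
    proxy_read_timeout 3600s;         # keep long-lived streams open
}
```

With `proxy_buffering` left on, Nginx would hold tokens until its buffer fills, defeating the token-by-token stream.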
Document Management¶
Workspace Organization¶
- Create workspaces to group related documents
- Rename and describe workspaces with inline editing
- Delete workspaces with cascade cleanup (all documents and cached indices)
- Per-user isolation -- workspaces are scoped to usernames
Document Operations¶
| Operation | Description |
|---|---|
| Upload | Drag-and-drop or file picker, with real-time progress tracking |
| Edit Metadata | Rename documents and update titles without re-indexing |
| Replace (Version) | Upload a new version of a document -- re-indexes while keeping the same document ID |
| Delete | Remove document and its cached tree index |
| Deduplication | MD5 hash check prevents uploading the same file twice to a workspace |
Supported File Types¶
| Format | Extensions | Parser | Features |
|---|---|---|---|
| PDF | .pdf | pypdfium2 | Heuristic heading detection, image extraction, page-by-page fallback |
| Markdown | .md, .markdown | Built-in | Section splitting on headings |
| Word | .docx | python-docx | Paragraphs and tables |
| PowerPoint | .pptx | python-pptx | Slide-by-slide extraction |
| Plain Text | .txt | Built-in | Line-by-line parsing |
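Parser selection by extension can be sketched as a simple dispatch table. The mapping below is taken from the table above; the function name and the string labels for the built-in parsers are illustrative, not the project's actual identifiers:

```python
from pathlib import Path

# Extension-to-parser mapping from the supported file types table.
PARSERS = {
    ".pdf": "pypdfium2",
    ".md": "builtin-markdown",
    ".markdown": "builtin-markdown",
    ".docx": "python-docx",
    ".pptx": "python-pptx",
    ".txt": "builtin-text",
}

def pick_parser(filename: str) -> str:
    """Return the parser label for a filename, case-insensitively."""
    ext = Path(filename).suffix.lower()
    try:
        return PARSERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext}")
```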
User Interface¶
Chat Interface¶
- Clean, modern chat UI with message history
- Markdown rendering with GitHub-Flavored Markdown support
- Message timestamps and clear visual distinction between user and AI messages
- Conversation clearing and management
RAG Explorer Panel¶
The standout UI feature -- a four-tab panel that shows exactly how the system found each answer:
- Tree -- displays the complete hierarchical document structure. Click any node to see its full details, including text, images, and metadata. Nodes selected for the current answer are highlighted.
- Reasoning -- shows the LLM's explanation for why specific sections were selected. This is the key to debugging and understanding retrieval decisions.
- Context -- displays the exact text that was assembled and sent to the answer-generation LLM. What you see is what the LLM saw -- complete transparency.
- Images -- shows any images extracted from the selected document sections. Useful for verifying that visual content (charts, diagrams) was included in the LLM's context.
Settings Panel¶
- LLM Provider toggle (Anthropic / OpenAI)
- Temperature control for response creativity
- Max Tokens configuration
- Quick Index toggle (skip LLM summaries during upload)
- RAG Panel visibility toggle
API Compatibility¶
OpenAI-Compatible Endpoint¶
The system exposes an OpenAI-compatible /v1/chat/completions endpoint, making it a drop-in replacement for any client that speaks the OpenAI protocol:
```bash
curl http://localhost:8100/v1/chat/completions \
  -H "Authorization: Bearer pageindex-secret-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pageindex-ws-1",
    "messages": [{"role": "user", "content": "What is the main finding?"}],
    "stream": true
  }'
```
This means you can use Vectorless RAG with:
- Open WebUI -- as a custom model backend
- ChatBox -- or any OpenAI-compatible desktop client
- Custom applications -- any app that uses the OpenAI SDK
Rich Chat Endpoint¶
For the React frontend, a dedicated /api/chat/query endpoint returns full RAG metadata alongside the answer:
- Selected node IDs
- LLM reasoning for section selection
- Assembled context text
- Extracted images
- Document routing information (for multi-doc queries)
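The payload roughly corresponds to a structure like the one below. Every field name here is a hypothetical illustration of the five items above, not the endpoint's documented schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ChatQueryResponse:
    """Hypothetical shape of a /api/chat/query response (field names assumed)."""
    answer: str
    selected_node_ids: list[str]     # tree nodes used for this answer
    reasoning: str                   # LLM's section-selection explanation
    context: str                     # exact text sent to the answer LLM
    images: list[str] = field(default_factory=list)  # base64-encoded
    routing: Optional[dict] = None   # present only for multi-document queries
```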
Indexing Options¶
Full Index (with LLM Summaries)¶
- Generates a concise summary (up to 50 words) for every tree node
- Bottom-up: leaf nodes summarized from text, parent nodes from children's summaries
- Multimodal summaries: nodes with images use `generate_multimodal()` to incorporate visual content
- Produces the highest-quality tree for search
- Takes longer (one LLM call per node)
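The bottom-up pass can be sketched as a short recursion; this is an illustrative outline with nodes as plain dicts and a hypothetical `summarize_llm` callable, not the project's actual indexer:

```python
def summarize_tree(node: dict, summarize_llm, max_words: int = 50) -> str:
    """Bottom-up summarization: leaves are summarized from their own text,
    parents from the concatenation of their children's summaries."""
    if node.get("children"):
        child_summaries = [summarize_tree(c, summarize_llm, max_words)
                           for c in node["children"]]
        source = "\n".join(child_summaries)
    else:
        source = node["text"]
    node["summary"] = summarize_llm(source, max_words)  # one LLM call per node
    return node["summary"]
```

Quick Index amounts to replacing `summarize_llm` with a cheap text-snippet truncation, which is why it is nearly instant but slightly less accurate.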
Quick Index¶
- Skips LLM summary generation
- Uses text snippets as node summaries instead
- Nearly instant indexing
- Slightly lower search accuracy (summaries are less informative)
- Great for rapid prototyping or when LLM costs matter
Caching & Performance¶
Tree Index Caching¶
- Built trees are serialized to JSON and stored on disk
- Cache key: `{username}/{file_hash}.json`
- Subsequent queries load the cached tree instantly
- Document replacement clears the old cache and creates a new one
- Docker volume (`index_data`) persists cache across container restarts
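The load-or-build flow can be sketched as below; the function names and the `build_tree` callable are hypothetical, while the `{username}/{file_hash}.json` key layout and MD5 hashing come from this page:

```python
import hashlib
import json
from pathlib import Path

def cache_path(root: Path, username: str, file_bytes: bytes) -> Path:
    """{username}/{file_hash}.json, keyed by the file's MD5 digest."""
    file_hash = hashlib.md5(file_bytes).hexdigest()
    return root / username / f"{file_hash}.json"

def load_or_build(root: Path, username: str, file_bytes: bytes, build_tree):
    path = cache_path(root, username, file_bytes)
    if path.exists():                  # cache hit: skip indexing entirely
        return json.loads(path.read_text())
    tree = build_tree(file_bytes)      # cache miss: full (slow) indexing
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(tree))
    return tree
```

Because the key is content-derived, replacing a document with different bytes naturally produces a new cache entry.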
Content Deduplication¶
- Every uploaded file is MD5-hashed
- If a file with the same hash already exists in the workspace, the upload is rejected
- Prevents wasted indexing time and storage
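The check itself is small; a sketch, assuming the workspace's known digests are available as a set (the function name and signature are illustrative):

```python
import hashlib

def is_duplicate(file_bytes: bytes, workspace_hashes: set[str]) -> tuple[bool, str]:
    """Check an upload against the MD5 digests already stored for a workspace.

    Returns (is_duplicate, digest); the caller rejects the upload when the
    first element is True and records the digest otherwise.
    """
    digest = hashlib.md5(file_bytes).hexdigest()
    return digest in workspace_hashes, digest
```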
CI/CD & Deployment¶
Docker Compose¶
Full stack deployment with three containers:
- PostgreSQL 16 -- persistent storage
- FastAPI Backend -- Python 3.11, all RAG logic
- React Frontend -- Nginx + static build
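A Compose file for this stack might look roughly like the following. The service names, build paths, mount points, and ports are assumptions for illustration; `postgres:16`, the `index_data` volume, and port 8100 come from this page:

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data
  backend:
    build: ./backend             # Python 3.11 / FastAPI, all RAG logic
    ports:
      - "8100:8100"
    depends_on:
      - db
    volumes:
      - index_data:/data/index   # cached tree indices survive restarts
  frontend:
    build: ./frontend            # React static build served by Nginx
    ports:
      - "80:80"

volumes:
  db_data:
  index_data:
```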
GitHub Actions Pipelines¶
Automated Docker image builds on every push to main:
- Backend pipeline -- triggered by changes to `backend/`, `config/`, `indexer/`, `llm/`, `parsers/`, `retriever/`
- Frontend pipeline -- triggered by changes to `frontend/`
- Images pushed to GitHub Container Registry (`ghcr.io`)
- Pull request builds validate without pushing
- Docker Buildx with GitHub Actions cache for fast builds