Vectorless is an ultra-performant reasoning-native document intelligence engine for AI, with the core written in Rust. It transforms documents into rich semantic trees and uses LLMs to intelligently traverse the hierarchy — retrieving the most relevant content through structural reasoning and deep contextual understanding.
⭐ Drop a star to help us grow!
```
Technical Manual (root)
├── Chapter 1: Introduction
├── Chapter 2: Architecture
│   ├── 2.1 System Design
│   └── 2.2 Implementation
└── Chapter 3: API Reference
```
Each node gets an AI-generated summary, enabling fast navigation.
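The tree above can be modeled in a few lines. This is an illustrative sketch only, not the actual vectorless data structure — the `Node` class and its fields are assumptions for the sake of the example:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical semantic-tree node; every node carries an LLM summary."""
    title: str
    summary: str = ""    # AI-generated summary used for navigation
    content: str = ""    # raw section text (leaves)
    children: list["Node"] = field(default_factory=list)

root = Node("Technical Manual", children=[
    Node("Chapter 1: Introduction"),
    Node("Chapter 2: Architecture", children=[
        Node("2.1 System Design"),
        Node("2.2 Implementation"),
    ]),
    Node("Chapter 3: API Reference"),
])
```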
When you ask "How do I reset the device?":
- Analyze — Understand query intent and complexity
- Navigate — LLM guides tree traversal
- Retrieve — Return the exact section with context
- Verify — Check if more information is needed
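The navigate/retrieve steps above amount to an LLM-guided descent of the tree. Here is a minimal sketch of that loop, using plain dict nodes and a hypothetical `pick_child` callback standing in for the LLM call — none of these names come from the vectorless API:

```python
def retrieve(node, query, pick_child):
    """Descend the tree until a leaf, then return its content and path."""
    path = [node["title"]]
    while node["children"]:
        # Navigate: the LLM picks the child whose summary best fits the query
        node = pick_child(node["children"], query)
        path.append(node["title"])
    # Retrieve: the leaf section plus the navigation path taken
    return node["content"], " > ".join(path)

tree = {
    "title": "Manual", "content": "", "children": [
        {"title": "Ch 4", "content": "", "children": [
            {"title": "4.2 Reset Procedure",
             "content": "Hold the power button for 10 seconds...",
             "children": []},
        ]},
    ],
}

# Stand-in for the LLM: each level here has one child, so pick the first
answer, source = retrieve(tree, "How do I reset the device?",
                          lambda kids, q: kids[0])
# source == "Manual > Ch 4 > 4.2 Reset Procedure"
```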
| Aspect | Traditional RAG | Vectorless |
|---|---|---|
| Infrastructure | Vector DB + Embedding Model | Just LLM API |
| Document Structure | Lost in chunking | Preserved |
| Context | Fragment only | Section + surrounding context |
| Setup Time | Hours to Days | Minutes |
| Best For | Unstructured text | Structured documents |
```
Input:
  Document: 100-page technical manual (PDF)
  Query: "How do I reset the device?"

Output:
  Answer: "To reset the device, hold the power button for 10 seconds
           until the LED flashes blue, then release..."
  Source: Chapter 4 > Section 4.2 > Reset Procedure
```
✅ Good fit:
- Technical documentation
- Manuals and guides
- Structured reports
- Policy documents
- Any document with clear hierarchy
❌ Not ideal:
- Unstructured text (tweets, chat logs)
- Very short documents (< 1 page)
- Pure Q&A datasets without structure
Python

```bash
pip install vectorless
```

```python
from vectorless import Engine, IndexContext

# Create engine (uses OPENAI_API_KEY env var)
engine = Engine(workspace="./data")

# Index a document
ctx = IndexContext.from_file("./report.pdf")
doc_id = engine.index(ctx)

# Query
result = engine.query(doc_id, "What is the total revenue?")
print(f"Answer: {result.content}")
```

Rust
```toml
[dependencies]
vectorless = "0.1"
```

```bash
cp vectorless.example.toml ./vectorless.toml
```

```rust
use vectorless::Engine;

#[tokio::main]
async fn main() -> vectorless::Result<()> {
    let client = Engine::builder()
        .with_workspace("./workspace")
        .build()?;

    let doc_id = client.index("./document.pdf").await?;
    let result = client.query(&doc_id,
        "What are the system requirements?").await?;

    println!("Answer: {}", result.content);
    println!("Source: {}", result.path);
    Ok(())
}
```

| Feature | Description |
|---|---|
| Zero Infrastructure | No vector DB, no embedding model — just an LLM API |
| Multi-format Support | PDF, Markdown, DOCX, HTML out of the box |
| Incremental Updates | Add/remove documents without full re-index |
| Traceable Results | See the exact navigation path taken |
| Feedback Learning | Improves from user feedback over time |
| Multi-turn Queries | Handles complex questions with decomposition |
Just set `OPENAI_API_KEY` and you're ready to go:

```bash
export OPENAI_API_KEY="sk-..."
```

Python

```python
from vectorless import Engine

# Uses OPENAI_API_KEY from environment
engine = Engine(workspace="./data")
```

Rust

```rust
use vectorless::Engine;

let client = Engine::builder()
    .with_workspace("./workspace")
    .build().await?;
```

| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | LLM API key |
| `VECTORLESS_MODEL` | Default model (e.g., `gpt-4o-mini`) |
| `VECTORLESS_ENDPOINT` | API endpoint URL |
| `VECTORLESS_WORKSPACE` | Workspace directory |
For fine-grained control, use a config file:

```bash
cp config.toml ./vectorless.toml
```

Python

```python
from vectorless import Engine

# Use full configuration file
engine = Engine(config_path="./vectorless.toml")

# Or override specific settings
engine = Engine(
    config_path="./vectorless.toml",
    model="gpt-4o",  # Override model from config
)
```

Rust
```rust
use vectorless::Engine;

// Use full configuration file
let client = Engine::builder()
    .with_config_path("./vectorless.toml")
    .build().await?;

// Or override specific settings
let client = Engine::builder()
    .with_config_path("./vectorless.toml")
    .with_model("gpt-4o", None) // Override model
    .build().await?;
```

Later overrides earlier:
1. Default configuration
2. Auto-detected config file (`vectorless.toml`, `config.toml`, `.vectorless.toml`)
3. Explicit config file (`config_path` / `with_config_path`)
4. Environment variables
5. Constructor/builder parameters (highest priority)
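The "later overrides earlier" layering can be sketched as a simple left-to-right merge. This is illustrative only — the values and the `resolve` helper are made up, not part of the vectorless codebase:

```python
def resolve(*layers):
    """Merge config layers; later layers override earlier ones."""
    cfg = {}
    for layer in layers:
        # Skip unset (None) values so they don't clobber earlier layers
        cfg.update({k: v for k, v in layer.items() if v is not None})
    return cfg

defaults = {"model": "gpt-4o-mini", "workspace": "./data"}
env_vars = {"model": "gpt-4o"}            # e.g. from VECTORLESS_MODEL
explicit = {"workspace": "./workspace"}   # e.g. builder parameter

merged = resolve(defaults, env_vars, explicit)
# {'model': 'gpt-4o', 'workspace': './workspace'}
```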
- Index Pipeline — Parses documents, builds tree, generates summaries
- Retrieval Pipeline — Analyzes query, navigates tree, returns results
- Pilot — LLM-powered navigator that guides retrieval decisions
- Metrics Hub — Unified observability for LLM calls, retrieval, and feedback
See the `examples/` directory for more usage patterns.
Contributions welcome! If you find this useful, please ⭐ the repo — it helps others discover it.
Apache License 2.0