The three LLM tiers
| Task | Env var | Volume | Recommended model |
|---|---|---|---|
| Semantic diff — summarizes what changed in a commit or PR | LLM_PROVIDER | Low — once per commit | Claude Sonnet or GPT-4o |
| Ticket analysis — relevance scoring, criteria mapping | ANALYSIS_LLM_PROVIDER | Medium — once per analysis run | GPT-4o-mini or Claude Haiku |
| Symbol summarization — one call per function/class at index time | SYMBOL_LLM_PROVIDER | High — thousands of calls for a typical repo | Claude Haiku or GPT-4o-mini |
SYMBOL_LLM_PROVIDER falls back to ANALYSIS_LLM_PROVIDER, which falls back to LLM_PROVIDER. You can start with just LLM_PROVIDER to get everything working, then add the cheaper tiers later to reduce costs.
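The fallback chain behaves like shell parameter defaults. A sketch of the resolution order (the variable names are the documented ones; the resolution logic is paraphrased from this section, not Waterline's actual code):

```shell
# Only LLM_PROVIDER is set here, so all three tiers resolve to it.
unset ANALYSIS_LLM_PROVIDER SYMBOL_LLM_PROVIDER
LLM_PROVIDER=anthropic

diff_provider="$LLM_PROVIDER"
analysis_provider="${ANALYSIS_LLM_PROVIDER:-$LLM_PROVIDER}"
symbol_provider="${SYMBOL_LLM_PROVIDER:-${ANALYSIS_LLM_PROVIDER:-$LLM_PROVIDER}}"

echo "$diff_provider $analysis_provider $symbol_provider"
# → anthropic anthropic anthropic
```

Setting `ANALYSIS_LLM_PROVIDER` later changes both the analysis and symbol tiers at once, until you also set `SYMBOL_LLM_PROVIDER` explicitly.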
Provider setup
- Anthropic
- OpenAI
- Ollama
Set up Waterline to use Claude for all three tiers. Using Claude Haiku for symbol summarization dramatically reduces your indexing cost compared to Sonnet.
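An all-Claude setup might look like the following. This is an illustrative `.env` fragment: the value `anthropic` and the `ANTHROPIC_API_KEY` variable are assumptions about Waterline's accepted settings, not confirmed syntax.

```shell
# Hypothetical all-Claude configuration (values are assumptions).
ANTHROPIC_API_KEY=your-key-here
LLM_PROVIDER=anthropic            # semantic diff (low volume)
ANALYSIS_LLM_PROVIDER=anthropic   # ticket analysis (medium volume)
SYMBOL_LLM_PROVIDER=anthropic     # symbol summarization (high volume; use Haiku here)
```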
Anthropic does not provide an embedding API. Even when you use Claude for all LLM tasks, you must also set OPENAI_API_KEY and configure the embedding variables.

Recommended split: Claude + OpenAI
This is the configuration used in production for the Waterline hosted version: Claude Haiku handles the high-volume symbol summarization at low cost, GPT-4o-mini handles analysis tasks, and Claude Sonnet handles the low-volume semantic diff work where quality matters most.

Cost optimization tips
- `SYMBOL_LLM_PROVIDER` drives the most LLM spending during a first-time index. Always point it at the cheapest capable model; Claude Haiku and GPT-4o-mini both work well.
- `LLM_PROVIDER` (semantic diff) runs infrequently. This is the right place to use a higher-quality model without worrying about cost.
- You can skip the analysis and symbol tier variables entirely when starting out. Set only `LLM_PROVIDER` and come back to split the tiers once your bill gives you a reason to.
- For `REPO_MAX_FILES` and `REPO_MAX_SYMBOLS`, see the environment variables reference; these limits cap how many LLM calls a single repo index can trigger.
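The recommended Claude + OpenAI split described above might be expressed like this. Again a hedged sketch: the provider values and the separate embedding key requirement follow this page, but the exact value syntax is assumed.

```shell
# Hypothetical .env for the recommended split (values are assumptions).
LLM_PROVIDER=anthropic            # semantic diff: Claude Sonnet, quality matters
ANALYSIS_LLM_PROVIDER=openai      # ticket analysis: GPT-4o-mini
SYMBOL_LLM_PROVIDER=anthropic     # symbol summarization: Claude Haiku, high volume
OPENAI_API_KEY=your-key-here      # still required for embeddings (Anthropic has no embedding API)
```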