Waterline produces a progress score by working through your ticket and codebase together — not by guessing from commit messages or PR titles. When you request an analysis, Waterline searches for the code most likely to implement your ticket’s acceptance criteria, evaluates it, and returns a score you can trace back to specific functions and files.

The analysis pipeline

Here is what happens from the moment you request a progress check to the moment you see a result:
1. Cache check

Before running any analysis, Waterline checks whether a recent result already exists for this ticket and repository combination. If a result is less than one hour old (configurable via PROGRESS_CACHE_TTL_HOURS), it’s returned immediately, with no LLM calls and no waiting. This means the second time you check the same ticket within an hour, the result is instant.
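The freshness check can be sketched in a few lines of Python. This is an illustration only: the `get_cached` and `is_fresh` helpers, the dictionary-based cache, and the `(ticket_key, repo)` key shape are assumptions made for this example, not Waterline's actual internals. Only the one-hour default comes from the description above.

```python
from datetime import datetime, timedelta, timezone

def is_fresh(analyzed_at, now, ttl_hours=1.0):
    """Return True if a cached result is younger than the TTL."""
    return now - analyzed_at < timedelta(hours=ttl_hours)

def get_cached(cache, ticket_key, repo, now, ttl_hours=1.0):
    """Look up a result by (ticket, repo); return None on a miss or a stale entry.

    `cache` is a plain dict here (hypothetical); a real deployment would
    likely use a database or key-value store.
    """
    entry = cache.get((ticket_key, repo))
    if entry is not None and is_fresh(entry["analyzed_at"], now, ttl_hours):
        return entry
    return None
```

A result analyzed 30 minutes ago would be returned from the cache under the default TTL; the same result 90 minutes after analysis would be treated as a miss and trigger the full pipeline.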
2. Fetch the ticket

Waterline retrieves the ticket’s summary, description, and any structured acceptance criteria from Jira or GitHub Issues. This full text is used in every subsequent step.
3. Semantic code search

The ticket text is converted to an embedding and used to search your indexed codebase for the symbols most likely to implement what the ticket describes. Waterline retrieves the top 30 nearest symbols by semantic similarity, then removes any that fall below the similarity threshold. If not enough symbols are found at the symbol level, Waterline falls back to file-level search automatically.
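As a rough sketch of this step, the following Python ranks indexed vectors by cosine similarity, keeps the top 30, drops anything below a threshold, and falls back to a file-level index when too few symbols survive. The threshold value (0.5), the `min_hits` cutoff, the use of cosine similarity, and the in-memory dict indexes are all assumptions for illustration; the documentation above does not state them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, index, top_k=30, threshold=0.5):
    """Return up to top_k (name, score) pairs at or above the threshold, best first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, score) for name, score in scored[:top_k] if score >= threshold]

def search_with_fallback(query_vec, symbol_index, file_index, min_hits=3, **kwargs):
    """Search symbols first; fall back to files if too few symbols qualify.

    min_hits is a hypothetical cutoff for "not enough symbols".
    """
    hits = search(query_vec, symbol_index, **kwargs)
    if len(hits) < min_hits:
        hits = search(query_vec, file_index, **kwargs)
    return hits
```

A production system would normally delegate the nearest-neighbor lookup to a vector database rather than scanning every entry, but the top-k, threshold, and fallback order are the same.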
4. LLM relevance scoring

Each candidate symbol is scored by the analysis LLM on a 0–1 scale for relevance to the ticket. Symbols that matched the embedding by coincidence, rather than because they’re actually about the ticket’s domain, are filtered out here. Only the top candidates (up to 20 by default) proceed to the next step.
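Once the LLM has produced a relevance score per symbol, the filtering itself is simple. The sketch below assumes the scores arrive as a dict of symbol reference to score and that a minimum score of 0.5 separates coincidental matches from relevant ones; both the data shape and that cutoff are hypothetical. Only the limit of 20 comes from the text above.

```python
def keep_relevant(scores, min_score=0.5, limit=20):
    """Keep the highest-scoring symbols, best first.

    scores: dict mapping a symbol reference to an LLM relevance score in [0, 1].
    min_score: hypothetical cutoff below which a match is treated as coincidental.
    limit: maximum number of symbols passed to the next step.
    """
    kept = [(symbol, score) for symbol, score in scores.items() if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [symbol for symbol, _ in kept[:limit]]
```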
5. Extract acceptance criteria

A separate LLM call reads the ticket description and extracts discrete, individually assessable acceptance criteria. For example, given a ticket description like:
The user should be able to reset their password via email. The reset link should expire after 24 hours. An error message should display if the link is expired.
Waterline extracts:
  1. User can reset password via email
  2. Reset link expires after 24 hours
  3. Error message shown for expired link
6. Map evidence to criteria

The high-scoring symbols from the relevance step are mapped to the extracted criteria. The LLM determines which symbols provide evidence for which criteria, and assigns a confidence score to each mapping.
7. Deterministic aggregation

Confidence scores are converted to a state for each criterion using fixed thresholds — no LLM judgment at this step:
Confidence      State
≥ 0.75          SATISFIED
0.40 – 0.74     PARTIAL
< 0.40          UNSATISFIED
The overall progress percentage is then:
progress % = (SATISFIED criteria / total criteria) × 100
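Because this step is purely rule-based, it is easy to express in code. The following is a minimal sketch of the thresholds and the progress formula as described here; the function names are illustrative, and three details are assumptions: confidences between 0.74 and 0.75 are treated as PARTIAL, the percentage is rounded to the nearest integer, and a ticket with no criteria scores 0.

```python
def state_for(confidence):
    """Map a confidence score in [0, 1] to a criterion state using fixed thresholds."""
    if confidence >= 0.75:
        return "SATISFIED"
    if confidence >= 0.40:
        return "PARTIAL"
    return "UNSATISFIED"

def progress_percent(confidences):
    """Share of criteria in the SATISFIED state, as a whole-number percentage."""
    if not confidences:
        return 0  # assumption: no criteria means no measurable progress
    satisfied = sum(1 for c in confidences if state_for(c) == "SATISFIED")
    return round(satisfied / len(confidences) * 100)
```

For three criteria with confidences 0.88, 0.61, and 0.12, the states are SATISFIED, PARTIAL, and UNSATISFIED, so `progress_percent([0.88, 0.61, 0.12])` returns 33. Note that PARTIAL criteria contribute nothing to the percentage under this formula.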

Fast path: pre-computed alignments

For tickets you’ve analyzed before, Waterline doesn’t wait for you to ask. Every time new code is pushed and the index updates, Waterline re-evaluates any previously analyzed tickets in the background and warms the cache. This means that for active tickets in repos with regular pushes, your analysis result is always fresh and available the moment you request it.
The fast path only applies to tickets that have been analyzed at least once before. The first analysis of a new ticket always runs the full pipeline.

Result structure

Every analysis returns a ProgressInference object. Here is an example:
{
  "ticket_key": "PROJ-123",
  "progress_percent": 33,
  "criteria_evaluations": [
    {
      "criterion": "User can reset password via email",
      "state": "SATISFIED",
      "confidence": 0.88,
      "evidence": ["auth/password_reset.py::send_reset_email", "auth/password_reset.py::create_reset_token"]
    },
    {
      "criterion": "Reset link expires after 24 hours",
      "state": "PARTIAL",
      "confidence": 0.61,
      "evidence": ["auth/password_reset.py::create_reset_token"]
    },
    {
      "criterion": "Error message shown for expired link",
      "state": "UNSATISFIED",
      "confidence": 0.12,
      "evidence": []
    }
  ],
  "uncertainty_level": "LOW",
  "analyzed_at": "2025-04-21T10:00:00Z"
}
Here is what each field tells you:
Field                   What it means for you
ticket_key              The ticket this result is for
progress_percent        The share of acceptance criteria that are fully satisfied by code in your codebase
criteria_evaluations    The per-criterion breakdown: what was found, what wasn’t, and what’s partially there
criterion               The acceptance criterion as extracted from your ticket description
state                   Whether this criterion is SATISFIED, PARTIAL, or UNSATISFIED
confidence              How strongly the evidence supports this criterion (0–1)
evidence                The specific functions and methods that support this criterion, as file::symbol references
uncertainty_level       How uncertain Waterline is about the overall analysis: LOW, MEDIUM, or HIGH
analyzed_at             When this result was computed
The evidence list is the most actionable part of the result. If a criterion shows UNSATISFIED with an empty evidence list, you know that code either hasn’t been written yet or hasn’t been indexed. If it shows PARTIAL, you can look at the listed symbols to understand what exists and what might be missing.
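To act on a result programmatically, you might pull out every criterion that is not yet SATISFIED and split its evidence references into file and symbol parts. The sketch below assumes the result has already been parsed from JSON into a Python dict with the fields shown above; `split_ref` and `open_items` are hypothetical helper names, not part of any Waterline client.

```python
def split_ref(ref):
    """Split a 'file::symbol' reference into (file, symbol)."""
    path, _, symbol = ref.partition("::")
    return path, symbol

def open_items(result):
    """Return (criterion, state, evidence) for every criterion not yet SATISFIED.

    Evidence is returned as a list of (file, symbol) tuples; an empty list
    means no supporting code was found in the index.
    """
    items = []
    for evaluation in result["criteria_evaluations"]:
        if evaluation["state"] != "SATISFIED":
            refs = [split_ref(ref) for ref in evaluation["evidence"]]
            items.append((evaluation["criterion"], evaluation["state"], refs))
    return items
```

For a PARTIAL criterion, the returned file and symbol pairs point to the code worth reviewing first; for an UNSATISFIED criterion with no evidence, the empty list signals missing or unindexed code.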