RAG Search — Function Reference

⚠ HERITAGE EXTRACTION — donor AMS RAG search (Wave 8 quarantine)

This file extracts the donor AMS RAG/embeddings pipeline from src/analysis/ (deleted R53). The 3-tool rag_* family, the analysis_* family, the OpenAI/local hash embedding fallback, and the mcp_rag_* SQLite tables are donor accretion. Phase 0 Colibri ships no RAG, no embeddings, no semantic search, and no δ Model Router — δ is deferred to Phase 1.5 per ADR-005, and RAG/embedding wiring rides with it. The Phase 0 tool surface is in ../mcp-tools-phase-0.md.

Read this file as donor genealogy only.

Core Algorithm

Three-phase pipeline: Index → Search → Rerank.

Index Phase (ragIndex / indexRagSourceDocument)

body text
  → chunkText(body, chunkSize, chunkOverlap)   # sliding window chunks
  → dbReplaceRagChunks(documentId, chunks)     # store chunks in SQLite
  → createEmbeddings(chunkTexts, options)       # OpenAI or local hash fallback
  → dbUpsertRagEmbedding(chunkId, model, dims, vector)
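
The sliding-window step above can be sketched as follows. This is a hypothetical reimplementation for genealogy purposes; the donor's actual chunkText (and its exact token counting) lived in src/analysis/rag-search.js, deleted in R53:

```javascript
// Sliding-window chunker: windows of chunkSize chars, advancing by
// (chunkSize - chunkOverlap) so consecutive chunks share an overlap region.
function chunkText(body, chunkSize = 1200, chunkOverlap = 200) {
  const chunks = [];
  const step = Math.max(1, chunkSize - chunkOverlap);
  for (let offset = 0, index = 0; offset < body.length; offset += step, index += 1) {
    const text = body.slice(offset, offset + chunkSize);
    chunks.push({
      chunkText: text,
      chunkIndex: index,
      charOffset: offset,
      tokenCount: text.split(/\s+/).filter(Boolean).length, // rough whitespace tokens
    });
    if (offset + chunkSize >= body.length) break; // last window reached the end
  }
  return chunks;
}
```

With the donor defaults (1200/200) a 3000-character body yields three chunks at offsets 0, 1000, and 2000.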

Search Phase (ragSearch)

query string
  → dbWaitForRagIndexQueue(waitMs)              # flush pending index writes
  → dbSearchRagChunks(query, candidateLimit)    # FTS5 BM25 candidates
  → createQueryEmbedding(query)                 # embed query
  → dbGetRagEmbeddingsByChunkIds(chunkIds)      # load stored vectors
  → score each candidate (FTS + cosine + overlap)
  → quality filter (minScore, minSemanticScore, minTokenOverlap)
  → applySourceDiversification(maxPerSource)
  → expandWithNeighborChunks(window)            # context window expansion
  → buildGraphPayload(results)                  # knowledge graph
  → applyGraphRerank(results, graph)            # graph-signal reranking

Exported Functions

ragSearch(options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Main search entry point. Hybrid FTS + semantic + graph reranking.
Algorithm:

  1. dbWaitForRagIndexQueue(waitForIndexMs) — up to 10000 ms.
  2. Fetch FTS candidates: dbSearchRagChunks(query, candidateLimit) (BM25 ranked).
  3. If mode ≠ "fts" and no candidates: fall back to dbListRagChunks(candidateLimit).
  4. Build keywordScores map: normalizeRankScore(rank, total) = 1 - rank/(total-1).
  5. If mode ≠ “fts”: createQueryEmbedding(query) → load stored embeddings.
  6. Score each candidate:
    • overlapScore = countTokenOverlap(queryTokens, candidate) / queryTokens.length
    • semanticScore = normalizeCosineScore(cosineSimilarity(queryVec, chunkVec))
    • FTS mode: finalScore = max(keywordScore, overlapScore)
    • Semantic mode: finalScore = 0.85×semantic + 0.15×overlap
    • Hybrid mode: finalScore = semanticWeight×semantic + keywordWeight×max(keyword, overlap×0.8)
  7. Filter: score >= minScore, semanticScore >= minSemanticScore, overlapTokens >= minTokenOverlap.
  8. applySourceDiversification(qualityFiltered, diversifyBySource, maxPerSource).
  9. Take top limit → seed results.
  10. expandWithNeighborChunks(seedResults, neighborWindow).
  11. buildGraphPayload(expansion.results, options).
  12. applyGraphRerank(expansion.results, graph, { enabled, weight }).

Parameters:

| Param | Default | Range |
|---|---|---|
| query | required | string |
| mode | "hybrid" | "fts" / "semantic" / "hybrid" |
| limit | 10 | 1–100 |
| candidate_limit | max(60, limit×6) | limit–500 |
| semantic_weight | 0.7 | 0–1 |
| neighbor_window | 0 | 0–5 |
| min_score | mode-dependent | float |
| min_semantic_score | mode-dependent | float |
| min_token_overlap | mode-dependent | int |
| diversify_by_source | true | bool |
| max_per_source | 3 | 1–20 |
| include_graph | true | bool |
| graph_rerank | false | bool |
| graph_rerank_weight | 0.2 | 0–0.75 |
| wait_for_index_ms | 1000 | 0–10000 |

Returns: { success, query, mode, limit, candidateLimit, weights, quality{…}, neighbor{…}, rerank{…}, degraded, qualityTier, embedding{…}, filters, count, results[], graph }
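
Step 8's diversification pass can be sketched as a per-source cap applied over the score-ordered candidates. A minimal sketch, assuming results carry sourceType/sourceId fields; the donor's applySourceDiversification may have grouped differently:

```javascript
// Cap how many results any single source document contributes, preserving
// the incoming (score-descending) order. Disabled pass-through included.
function applySourceDiversification(results, enabled = true, maxPerSource = 3) {
  if (!enabled) return results;
  const perSource = new Map();
  return results.filter((result) => {
    const key = `${result.sourceType}:${result.sourceId}`; // assumed composite key
    const seen = perSource.get(key) ?? 0;
    if (seen >= maxPerSource) return false; // source already at its cap
    perSource.set(key, seen + 1);
    return true;
  });
}
```
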


ragIndex(options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Batch indexing of context, action, and thought records.
Algorithm:

  1. Fetch rows per source_types (default: context, action, thought) up to limit (1–5000).
  2. For each row, call the appropriate index*Record() wrapper.
  3. Skip if content_unchanged (hash comparison) and not force=true.
  4. Chunk, embed, store.
  5. Aggregate embedding summary (providers, models, fallback count).
  6. Return quality tier: "high" (OpenAI, no fallback), "local" (local-only), "degraded" (fallback used).

Chunk config: chunk_size default 1200, overlap default 200, max chunk 6000, max overlap 1000. Parallelism: parallel_batches (1–32), checkpoint_every (1–10000).

indexRagSourceDocument(source, options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Index a single document with chunking and embedding.
Algorithm:

  1. Validate sourceType ∈ {"context", "action", "thought"}.
  2. normalizeSourceId(sourceId) → non-negative integer.
  3. normalizeToText(body) → string; skip if empty.
  4. sha256(body) → contentHash.
  5. dbGetRagDocumentBySource() → check existing; if unchanged and !force, return skipped.
  6. dbUpsertRagDocument() → upsert record.
  7. chunkText(body, chunkSize, chunkOverlap) → array of { chunkText, chunkIndex, tokenCount, charOffset }.
  8. dbReplaceRagChunks(documentId, sourceType, chunks).
  9. createEmbeddings(chunkTexts, options) → { vectors[], modelUsed, dimensions, providerUsed, fallbackUsed }.
  10. For each chunk: dbUpsertRagEmbedding(chunkId, model, dims, vector).

Returns: { indexed, skipped, reason?, indexedChunks, embedding{providerUsed, modelUsed, fallbackUsed, fallbackReason, dimensions} }

cosineSimilarity(vectorA, vectorB): number

File: src/analysis/rag-search.js
Purpose: Dot-product cosine similarity.
Algorithm: dot(A,B) / (|A| × |B|). Returns 0 on length mismatch or zero-norm vectors. Uses element-wise Number() coercion for safety.
Notes for rewrite: Pure math, no dependencies. Operates on flat numeric arrays.
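
A sketch matching that description, with the mismatch and zero-norm guards and Number() coercion:

```javascript
// Cosine similarity over flat numeric arrays: dot(A,B) / (|A| * |B|).
// Returns 0 on length mismatch or zero-norm vectors.
function cosineSimilarity(vectorA, vectorB) {
  if (!Array.isArray(vectorA) || !Array.isArray(vectorB)) return 0;
  if (vectorA.length !== vectorB.length || vectorA.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < vectorA.length; i += 1) {
    const a = Number(vectorA[i]) || 0; // coerce; NaN collapses to 0
    const b = Number(vectorB[i]) || 0;
    dot += a * b;
    normA += a * a;
    normB += b * b;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```
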


applyGraphRerank(results, graph, options): object

File: src/analysis/rag-search.js
Purpose: Boost result scores using graph edge signals.
Algorithm:

  1. For each graph edge: add relationWeight(relation) to from-node signal, weight×0.8 to to-node signal.
  2. Normalize signals by max signal.
  3. Rerank score: (1−weight)×baseScore + weight×normalizedSignal.
  4. Re-sort descending, count changedRanks.

Relation weights:

| Relation | Weight |
|---|---|
| action_context | 1.00 |
| thought_action | 0.85 |
| thought_context | 0.85 |
| context_member | 0.75 |
| thought_parent | 0.50 |
| thought_child | 0.50 |
| session_peer | 0.35 |
| default | 0.25 |

Notes for rewrite: DEFAULT_GRAPH_RERANK_WEIGHT = 0.2, max 0.75.
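
The four steps above can be sketched as one pass. The result shape (nodeKey, score) and the edge shape are assumptions for illustration, not the donor's exact signatures:

```javascript
// Graph-signal rerank: accumulate relation weights per node (0.8x on the
// to-side), normalize by the max signal, then blend into each base score.
const RELATION_WEIGHTS = {
  action_context: 1.0,
  thought_action: 0.85,
  thought_context: 0.85,
  context_member: 0.75,
  thought_parent: 0.5,
  thought_child: 0.5,
  session_peer: 0.35,
};
const relationWeight = (relation) => RELATION_WEIGHTS[relation] ?? 0.25;

function applyGraphRerank(results, graph, { enabled = false, weight = 0.2 } = {}) {
  if (!enabled || !graph || !Array.isArray(graph.edges) || graph.edges.length === 0) {
    return { results, changedRanks: 0 };
  }
  const signals = new Map();
  for (const edge of graph.edges) {
    const w = relationWeight(edge.relation);
    signals.set(edge.from, (signals.get(edge.from) ?? 0) + w);
    signals.set(edge.to, (signals.get(edge.to) ?? 0) + w * 0.8);
  }
  const maxSignal = Math.max(...signals.values());
  const reranked = results
    .map((result) => {
      const signal = (signals.get(result.nodeKey) ?? 0) / maxSignal;
      return { ...result, score: (1 - weight) * result.score + weight * signal };
    })
    .sort((a, b) => b.score - a.score);
  const changedRanks = reranked.filter((r, i) => r.nodeKey !== results[i].nodeKey).length;
  return { results: reranked, changedRanks };
}
```

Even at the default weight of 0.2, a strong action_context signal can flip two closely-scored results.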


buildGraphPayload(results, options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Build knowledge graph of linked nodes (actions ↔ contexts ↔ thoughts).
Algorithm:

  1. Seed nodes from top results.
  2. Expand edges:
    • action → context (action_context)
    • action → thoughts linked by action_id (thought_action)
    • thought → parent action, parent context, parent/child thoughts
    • context → member actions/thoughts
    • session peers (if include_session)
  3. Deduplicate edges by composite key from|to|relation.
  4. Stop expanding at nodeLimit (default 60, max 500).

Returns: { nodeCount, edgeCount, truncated, nodes[], edges[] }
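
The edge deduplication in step 3 can be sketched as a first-wins filter over a composite key (a sketch; the donor's key format beyond from|to|relation is assumed):

```javascript
// Deduplicate graph edges by "from|to|relation": only the first occurrence
// of each composite key survives.
function dedupeEdges(edges) {
  const seen = new Set();
  return edges.filter((edge) => {
    const key = `${edge.from}|${edge.to}|${edge.relation}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```
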

expandWithNeighborChunks(results, windowSize): Promise<object>

File: src/analysis/rag-search.js
Purpose: Add adjacent chunks to seed results for context continuity.
Algorithm:

  • For each seed chunk, fetch neighbors within window positions via dbGetNeighborRagChunks.
  • Neighbor score = seed.score × 0.85^distance.
  • Dedup by chunkId; keep highest score per chunk.
  • Re-sort descending.

Notes for rewrite: Max window = 5. Distance penalty 0.85 per step.
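
The decay and dedup rules above can be sketched as two small helpers (names are illustrative; the donor fetched neighbors via dbGetNeighborRagChunks):

```javascript
// Each step of distance from a seed chunk multiplies its score by 0.85.
function neighborScore(seedScore, distance, decay = 0.85) {
  return seedScore * decay ** Math.abs(distance);
}

// When a chunk appears more than once (seed and neighbor, or neighbor of two
// seeds), keep the highest-scoring copy, then re-sort descending.
function mergeByBestScore(chunks) {
  const best = new Map();
  for (const chunk of chunks) {
    const prev = best.get(chunk.chunkId);
    if (!prev || chunk.score > prev.score) best.set(chunk.chunkId, chunk);
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```
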

normalizeRankScore(rank, total): number

Returns 1 - rank/(total-1). First result (rank=0) → 1.0, last → 0.0.
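
As a sketch (the total <= 1 guard is an assumption to avoid dividing by zero when there is a single candidate; the donor's handling of that case is not documented here):

```javascript
// Normalize an FTS rank into [0, 1]: rank 0 maps to 1.0, the last rank to 0.0.
function normalizeRankScore(rank, total) {
  if (total <= 1) return 1; // lone candidate gets full score (assumed guard)
  return 1 - rank / (total - 1);
}
```
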

getSearchMode(mode): string

Validates to "fts" | "semantic" | "hybrid". Falls back to "hybrid".

buildFilters(options): object

Returns { sourceType, sessionId, contextId } from options, normalizing types.


Embedding Configuration

| Config key | Default | Source |
|---|---|---|
| DEFAULT_CHUNK_SIZE | 1200 chars | rag-constants.js |
| DEFAULT_CHUNK_OVERLAP | 200 chars | rag-constants.js |
| MAX_CHUNK_SIZE | 6000 | rag-constants.js |
| MAX_CHUNK_OVERLAP | 1000 | rag-constants.js |
| DEFAULT_SEARCH_MODE | "hybrid" | rag-constants.js |
| DEFAULT_SEMANTIC_WEIGHT | 0.7 | rag-constants.js |
| DEFAULT_CANDIDATE_LIMIT | 60 | rag-constants.js |
| MAX_SEARCH_CANDIDATES | 500 | rag-constants.js |
| MAX_NEIGHBOR_WINDOW | 5 | rag-constants.js |
| DEFAULT_GRAPH_RERANK_WEIGHT | 0.2 | rag-constants.js |
| MAX_GRAPH_RERANK_WEIGHT | 0.75 | rag-constants.js |
| LOCAL_EMBEDDING_MODEL | "ams-hash-v2-mp4" | rag-constants.js |
| LOCAL_EMBEDDING_PROJECTIONS | 4 magic constants | rag-constants.js |
| queryEmbeddingCache | LRU 500 entries, 10 min TTL, 16 MB max | rag-constants.js |
| AMS_EMBEDDING_PROVIDER | env var | config.js |
| AMS_EMBEDDING_MODEL | env var | config.js |
| AMS_OPENAI_API_KEY | env var | config.js |
| AMS_EMBEDDING_STRICT | env var | config.js |

Hybrid Scoring Formula

FTS mode:     finalScore = max(keywordScore, overlapScore)
Semantic mode: finalScore = 0.85 × semanticScore + 0.15 × overlapScore
Hybrid mode:   finalScore = semanticWeight × semanticScore
                          + (1−semanticWeight) × max(keywordScore, overlapScore×0.8)

Where:

  • keywordScore = normalized BM25 rank (1.0 = first FTS hit, 0.0 = last)
  • semanticScore = normalizeCosineScore(cosine(queryVec, chunkVec))
  • overlapScore = token overlap count / query token count
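
The three formulas above, folded into one scoring helper (a sketch; the parameter-object shape is an assumption, not the donor's exact signature):

```javascript
// Per-mode final score: FTS takes the better of keyword/overlap; semantic
// blends 85/15; hybrid weights semantic vs. the keyword/dampened-overlap max.
function finalScore(mode, { keywordScore = 0, semanticScore = 0, overlapScore = 0, semanticWeight = 0.7 } = {}) {
  switch (mode) {
    case "fts":
      return Math.max(keywordScore, overlapScore);
    case "semantic":
      return 0.85 * semanticScore + 0.15 * overlapScore;
    default: // "hybrid"
      return (
        semanticWeight * semanticScore +
        (1 - semanticWeight) * Math.max(keywordScore, overlapScore * 0.8)
      );
  }
}
```
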


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
