RAG Search — Function Reference

⚠ HERITAGE EXTRACTION — donor AMS RAG search (Wave 8 quarantine)

This file extracts the donor AMS RAG/embeddings pipeline from src/analysis/ (deleted R53). The 3-tool rag_* family, the analysis_* family, the OpenAI/local hash embedding fallback, and the mcp_rag_* SQLite tables are donor accretion. Phase 0 Colibri ships no RAG, no embeddings, no semantic search, and no δ Model Router — δ is deferred to Phase 1.5 per ADR-005, and RAG/embedding wiring rides with it. The Phase 0 tool surface is in ../mcp-tools-phase-0.md.

Read this file as donor genealogy only.

Core Algorithm

Three-phase pipeline: Index → Search → Rerank.

Index Phase (ragIndex / indexRagSourceDocument)

body text
  → chunkText(body, chunkSize, chunkOverlap)   # sliding window chunks
  → dbReplaceRagChunks(documentId, chunks)     # store chunks in SQLite
  → createEmbeddings(chunkTexts, options)       # OpenAI or local hash fallback
  → dbUpsertRagEmbedding(chunkId, model, dims, vector)
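
The sliding-window step above can be sketched as follows. This is a hypothetical reimplementation for genealogy purposes; the donor's actual chunkText (and its exact token counting) lived in src/analysis/rag-search.js, deleted in R53:

```javascript
// Sliding-window chunker: windows of chunkSize chars, advancing by
// (chunkSize - chunkOverlap) so consecutive chunks share an overlap region.
function chunkText(body, chunkSize = 1200, chunkOverlap = 200) {
  const chunks = [];
  const step = Math.max(1, chunkSize - chunkOverlap);
  for (let offset = 0, index = 0; offset < body.length; offset += step, index += 1) {
    const text = body.slice(offset, offset + chunkSize);
    chunks.push({
      chunkText: text,
      chunkIndex: index,
      charOffset: offset,
      tokenCount: text.split(/\s+/).filter(Boolean).length, // rough whitespace tokens
    });
    if (offset + chunkSize >= body.length) break; // last window reached the end
  }
  return chunks;
}
```

With the donor defaults (1200/200) a 3000-character body yields three chunks at offsets 0, 1000, and 2000.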

Search Phase (ragSearch)

query string
  → dbWaitForRagIndexQueue(waitMs)              # flush pending index writes
  → dbSearchRagChunks(query, candidateLimit)    # FTS5 BM25 candidates
  → createQueryEmbedding(query)                 # embed query
  → dbGetRagEmbeddingsByChunkIds(chunkIds)      # load stored vectors
  → score each candidate (FTS + cosine + overlap)
  → quality filter (minScore, minSemanticScore, minTokenOverlap)
  → applySourceDiversification(maxPerSource)
  → expandWithNeighborChunks(window)            # context window expansion
  → buildGraphPayload(results)                  # knowledge graph
  → applyGraphRerank(results, graph)            # graph-signal reranking

Exported Functions

ragSearch(options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Main search entry point. Hybrid FTS + semantic + graph reranking.
Algorithm:

  1. dbWaitForRagIndexQueue(waitForIndexMs) — up to 10000 ms.
  2. Fetch FTS candidates: dbSearchRagChunks(query, candidateLimit) (BM25 ranked).
  3. If mode ≠ "fts" and no candidates: fall back to dbListRagChunks(candidateLimit).
  4. Build keywordScores map: normalizeRankScore(rank, total) = 1 - rank/(total-1).
  5. If mode ≠ “fts”: createQueryEmbedding(query) → load stored embeddings.
  6. Score each candidate:
    • overlapScore = countTokenOverlap(queryTokens, candidate) / queryTokens.length
    • semanticScore = normalizeCosineScore(cosineSimilarity(queryVec, chunkVec))
    • FTS mode: finalScore = max(keywordScore, overlapScore)
    • Semantic mode: finalScore = 0.85×semantic + 0.15×overlap
    • Hybrid mode: finalScore = semanticWeight×semantic + keywordWeight×max(keyword, overlap×0.8)
  7. Filter: score >= minScore, semanticScore >= minSemanticScore, overlapTokens >= minTokenOverlap.
  8. applySourceDiversification(qualityFiltered, diversifyBySource, maxPerSource).
  9. Take top limit → seed results.
  10. expandWithNeighborChunks(seedResults, neighborWindow).
  11. buildGraphPayload(expansion.results, options).
  12. applyGraphRerank(expansion.results, graph, { enabled, weight }).

Parameters:

| Param | Default | Range |
|---|---|---|
| query | required | string |
| mode | "hybrid" | "fts" / "semantic" / "hybrid" |
| limit | 10 | 1–100 |
| candidate_limit | max(60, limit×6) | limit–500 |
| semantic_weight | 0.7 | 0–1 |
| neighbor_window | 0 | 0–5 |
| min_score | mode-dependent | float |
| min_semantic_score | mode-dependent | float |
| min_token_overlap | mode-dependent | int |
| diversify_by_source | true | bool |
| max_per_source | 3 | 1–20 |
| include_graph | true | bool |
| graph_rerank | false | bool |
| graph_rerank_weight | 0.2 | 0–0.75 |
| wait_for_index_ms | 1000 | 0–10000 |

Returns: { success, query, mode, limit, candidateLimit, weights, quality{…}, neighbor{…}, rerank{…}, degraded, qualityTier, embedding{…}, filters, count, results[], graph }
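
Step 8's diversification pass can be sketched as a per-source cap applied over the score-ordered candidates. A minimal sketch, assuming results carry sourceType/sourceId fields; the donor's applySourceDiversification may have grouped differently:

```javascript
// Cap how many results any single source document contributes, preserving
// the incoming (score-descending) order. Disabled pass-through included.
function applySourceDiversification(results, enabled = true, maxPerSource = 3) {
  if (!enabled) return results;
  const perSource = new Map();
  return results.filter((result) => {
    const key = `${result.sourceType}:${result.sourceId}`; // assumed composite key
    const seen = perSource.get(key) ?? 0;
    if (seen >= maxPerSource) return false; // source already at its cap
    perSource.set(key, seen + 1);
    return true;
  });
}
```
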


ragIndex(options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Batch indexing of context, action, and thought records.
Algorithm:

  1. Fetch rows per source_types (default: context, action, thought) up to limit (1–5000).
  2. For each row, call the appropriate index*Record() wrapper.
  3. Skip if content_unchanged (hash comparison) and not force=true.
  4. Chunk, embed, store.
  5. Aggregate embedding summary (providers, models, fallback count).
  6. Return quality tier: "high" (OpenAI, no fallback), "local" (local-only), "degraded" (fallback used).

Chunk config: chunk_size default 1200, overlap default 200, max chunk 6000, max overlap 1000. Parallelism: parallel_batches (1–32), checkpoint_every (1–10000).

indexRagSourceDocument(source, options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Index a single document with chunking and embedding.
Algorithm:

  1. Validate sourceType ∈ {"context", "action", "thought"}.
  2. normalizeSourceId(sourceId) → non-negative integer.
  3. normalizeToText(body) → string; skip if empty.
  4. sha256(body) → contentHash.
  5. dbGetRagDocumentBySource() → check existing; if unchanged and !force, return skipped.
  6. dbUpsertRagDocument() → upsert record.
  7. chunkText(body, chunkSize, chunkOverlap) → array of { chunkText, chunkIndex, tokenCount, charOffset }.
  8. dbReplaceRagChunks(documentId, sourceType, chunks).
  9. createEmbeddings(chunkTexts, options) → { vectors[], modelUsed, dimensions, providerUsed, fallbackUsed }.
  10. For each chunk: dbUpsertRagEmbedding(chunkId, model, dims, vector).

Returns: { indexed, skipped, reason?, indexedChunks, embedding{providerUsed, modelUsed, fallbackUsed, fallbackReason, dimensions} }

cosineSimilarity(vectorA, vectorB): number

File: src/analysis/rag-search.js
Purpose: Dot-product cosine similarity.
Algorithm: dot(A,B) / (|A| × |B|). Returns 0 on length mismatch or zero-norm vectors. Uses element-wise Number() coercion for safety.
Notes for rewrite: Pure math, no dependencies. Operates on flat numeric arrays.
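
A sketch matching that description, with the mismatch and zero-norm guards and Number() coercion:

```javascript
// Cosine similarity over flat numeric arrays: dot(A,B) / (|A| * |B|).
// Returns 0 on length mismatch or zero-norm vectors.
function cosineSimilarity(vectorA, vectorB) {
  if (!Array.isArray(vectorA) || !Array.isArray(vectorB)) return 0;
  if (vectorA.length !== vectorB.length || vectorA.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < vectorA.length; i += 1) {
    const a = Number(vectorA[i]) || 0; // coerce; NaN collapses to 0
    const b = Number(vectorB[i]) || 0;
    dot += a * b;
    normA += a * a;
    normB += b * b;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```
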


applyGraphRerank(results, graph, options): object

File: src/analysis/rag-search.js
Purpose: Boost result scores using graph edge signals.
Algorithm:

  1. For each graph edge: add relationWeight(relation) to from-node signal, weight×0.8 to to-node signal.
  2. Normalize signals by max signal.
  3. Rerank score: (1−weight)×baseScore + weight×normalizedSignal.
  4. Re-sort descending, count changedRanks.

Relation weights:

| Relation | Weight |
|---|---|
| action_context | 1.00 |
| thought_action | 0.85 |
| thought_context | 0.85 |
| context_member | 0.75 |
| thought_parent | 0.50 |
| thought_child | 0.50 |
| session_peer | 0.35 |
| default | 0.25 |

Notes for rewrite: DEFAULT_GRAPH_RERANK_WEIGHT = 0.2, max 0.75.
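
The four steps above can be sketched as one pass. The result shape (nodeKey, score) and the edge shape are assumptions for illustration, not the donor's exact signatures:

```javascript
// Graph-signal rerank: accumulate relation weights per node (0.8x on the
// to-side), normalize by the max signal, then blend into each base score.
const RELATION_WEIGHTS = {
  action_context: 1.0,
  thought_action: 0.85,
  thought_context: 0.85,
  context_member: 0.75,
  thought_parent: 0.5,
  thought_child: 0.5,
  session_peer: 0.35,
};
const relationWeight = (relation) => RELATION_WEIGHTS[relation] ?? 0.25;

function applyGraphRerank(results, graph, { enabled = false, weight = 0.2 } = {}) {
  if (!enabled || !graph || !Array.isArray(graph.edges) || graph.edges.length === 0) {
    return { results, changedRanks: 0 };
  }
  const signals = new Map();
  for (const edge of graph.edges) {
    const w = relationWeight(edge.relation);
    signals.set(edge.from, (signals.get(edge.from) ?? 0) + w);
    signals.set(edge.to, (signals.get(edge.to) ?? 0) + w * 0.8);
  }
  const maxSignal = Math.max(...signals.values());
  const reranked = results
    .map((result) => {
      const signal = (signals.get(result.nodeKey) ?? 0) / maxSignal;
      return { ...result, score: (1 - weight) * result.score + weight * signal };
    })
    .sort((a, b) => b.score - a.score);
  const changedRanks = reranked.filter((r, i) => r.nodeKey !== results[i].nodeKey).length;
  return { results: reranked, changedRanks };
}
```

Even at the default weight of 0.2, a strong action_context signal can flip two closely-scored results.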


buildGraphPayload(results, options): Promise<object>

File: src/analysis/rag-search.js
Purpose: Build knowledge graph of linked nodes (actions ↔ contexts ↔ thoughts).
Algorithm:

  1. Seed nodes from top results.
  2. Expand edges:
    • action → context (action_context)
    • action → thoughts linked by action_id (thought_action)
    • thought → parent action, parent context, parent/child thoughts
    • context → member actions/thoughts
    • session peers (if include_session)
  3. Deduplicate edges by composite key from|to|relation.
  4. Stop expanding at nodeLimit (default 60, max 500).

Returns: { nodeCount, edgeCount, truncated, nodes[], edges[] }
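
The edge deduplication in step 3 can be sketched as a first-wins filter over a composite key (a sketch; the donor's key format beyond from|to|relation is assumed):

```javascript
// Deduplicate graph edges by "from|to|relation": only the first occurrence
// of each composite key survives.
function dedupeEdges(edges) {
  const seen = new Set();
  return edges.filter((edge) => {
    const key = `${edge.from}|${edge.to}|${edge.relation}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```
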

expandWithNeighborChunks(results, windowSize): Promise<object>

File: src/analysis/rag-search.js
Purpose: Add adjacent chunks to seed results for context continuity.
Algorithm:

  • For each seed chunk, fetch neighbors within window positions via dbGetNeighborRagChunks.
  • Neighbor score = seed.score × 0.85^distance.
  • Dedup by chunkId; keep highest score per chunk.
  • Re-sort descending.

Notes for rewrite: Max window = 5. Distance penalty 0.85 per step.
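
The decay and dedup rules above can be sketched as two small helpers (names are illustrative; the donor fetched neighbors via dbGetNeighborRagChunks):

```javascript
// Each step of distance from a seed chunk multiplies its score by 0.85.
function neighborScore(seedScore, distance, decay = 0.85) {
  return seedScore * decay ** Math.abs(distance);
}

// When a chunk appears more than once (seed and neighbor, or neighbor of two
// seeds), keep the highest-scoring copy, then re-sort descending.
function mergeByBestScore(chunks) {
  const best = new Map();
  for (const chunk of chunks) {
    const prev = best.get(chunk.chunkId);
    if (!prev || chunk.score > prev.score) best.set(chunk.chunkId, chunk);
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```
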

normalizeRankScore(rank, total): number

Returns 1 - rank/(total-1). First result (rank=0) → 1.0, last → 0.0.
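
As a sketch (the total <= 1 guard is an assumption to avoid dividing by zero when there is a single candidate; the donor's handling of that case is not documented here):

```javascript
// Normalize an FTS rank into [0, 1]: rank 0 maps to 1.0, the last rank to 0.0.
function normalizeRankScore(rank, total) {
  if (total <= 1) return 1; // lone candidate gets full score (assumed guard)
  return 1 - rank / (total - 1);
}
```
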

getSearchMode(mode): string

Validates to "fts" | "semantic" | "hybrid". Falls back to "hybrid".

buildFilters(options): object

Returns { sourceType, sessionId, contextId } from options, normalizing types.


Embedding Configuration

| Config key | Default | Source |
|---|---|---|
| DEFAULT_CHUNK_SIZE | 1200 chars | rag-constants.js |
| DEFAULT_CHUNK_OVERLAP | 200 chars | rag-constants.js |
| MAX_CHUNK_SIZE | 6000 | rag-constants.js |
| MAX_CHUNK_OVERLAP | 1000 | rag-constants.js |
| DEFAULT_SEARCH_MODE | "hybrid" | rag-constants.js |
| DEFAULT_SEMANTIC_WEIGHT | 0.7 | rag-constants.js |
| DEFAULT_CANDIDATE_LIMIT | 60 | rag-constants.js |
| MAX_SEARCH_CANDIDATES | 500 | rag-constants.js |
| MAX_NEIGHBOR_WINDOW | 5 | rag-constants.js |
| DEFAULT_GRAPH_RERANK_WEIGHT | 0.2 | rag-constants.js |
| MAX_GRAPH_RERANK_WEIGHT | 0.75 | rag-constants.js |
| LOCAL_EMBEDDING_MODEL | "ams-hash-v2-mp4" | rag-constants.js |
| LOCAL_EMBEDDING_PROJECTIONS | 4 magic constants | rag-constants.js |
| queryEmbeddingCache | LRU 500 entries, 10 min TTL, 16 MB max | rag-constants.js |
| AMS_EMBEDDING_PROVIDER | env var | config.js |
| AMS_EMBEDDING_MODEL | env var | config.js |
| AMS_OPENAI_API_KEY | env var | config.js |
| AMS_EMBEDDING_STRICT | env var | config.js |

Hybrid Scoring Formula

FTS mode:     finalScore = max(keywordScore, overlapScore)
Semantic mode: finalScore = 0.85 × semanticScore + 0.15 × overlapScore
Hybrid mode:   finalScore = semanticWeight × semanticScore
                          + (1−semanticWeight) × max(keywordScore, overlapScore×0.8)

Where:

  • keywordScore = normalized BM25 rank (1.0 = first FTS hit, 0.0 = last)
  • semanticScore = normalizeCosineScore(cosine(queryVec, chunkVec))
  • overlapScore = token overlap count / query token count
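
The three formulas above, folded into one scoring helper (a sketch; the parameter-object shape is an assumption, not the donor's exact signature):

```javascript
// Per-mode final score: FTS takes the better of keyword/overlap; semantic
// blends 85/15; hybrid weights semantic vs. the keyword/dampened-overlap max.
function finalScore(mode, { keywordScore = 0, semanticScore = 0, overlapScore = 0, semanticWeight = 0.7 } = {}) {
  switch (mode) {
    case "fts":
      return Math.max(keywordScore, overlapScore);
    case "semantic":
      return 0.85 * semanticScore + 0.15 * overlapScore;
    default: // "hybrid"
      return (
        semanticWeight * semanticScore +
        (1 - semanticWeight) * Math.max(keywordScore, overlapScore * 0.8)
      );
  }
}
```
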


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
