RAG Search — Function Reference
⚠ HERITAGE EXTRACTION — donor AMS RAG search (Wave 8 quarantine)
This file extracts the donor AMS RAG/embeddings pipeline from src/analysis/ (deleted R53). The 3-tool `rag_*` family, the `analysis_*` family, the OpenAI/local hash embedding fallback, and the `mcp_rag_*` SQLite tables are donor accretion. Phase 0 Colibri ships no RAG, no embeddings, no semantic search, and no δ Model Router: δ is deferred to Phase 1.5 per ADR-005, and RAG/embedding wiring rides with it. The Phase 0 tool surface is in `../mcp-tools-phase-0.md`. Read this file as donor genealogy only.
Core Algorithm
Three-phase pipeline: Index → Search → Rerank.
Index Phase (ragIndex / indexRagSourceDocument)
```
body text
  → chunkText(body, chunkSize, chunkOverlap)        # sliding-window chunks
  → dbReplaceRagChunks(documentId, chunks)          # store chunks in SQLite
  → createEmbeddings(chunkTexts, options)           # OpenAI or local hash fallback
  → dbUpsertRagEmbedding(chunkId, model, dims, vector)
```
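The chunking step can be sketched as a character-based sliding window. This is an illustrative reimplementation under assumptions (character-based offsets, whitespace token counting), not the donor `chunkText`:

```javascript
// Sliding-window chunker sketch. Assumptions: chunks are measured in
// characters, overlap is subtracted from the stride, and tokenCount is a
// simple whitespace split (the donor's tokenizer is not documented here).
function chunkText(body, chunkSize = 1200, chunkOverlap = 200) {
  const chunks = [];
  const step = Math.max(1, chunkSize - chunkOverlap);
  for (let offset = 0, index = 0; offset < body.length; offset += step, index++) {
    const text = body.slice(offset, offset + chunkSize);
    chunks.push({
      chunkText: text,
      chunkIndex: index,
      charOffset: offset,
      tokenCount: text.split(/\s+/).filter(Boolean).length,
    });
    if (offset + chunkSize >= body.length) break; // last window reached the end
  }
  return chunks;
}

const chunks = chunkText("a".repeat(3000), 1200, 200);
```

With a 3000-character body, the 1200/200 defaults yield windows starting at offsets 0, 1000, and 2000, so consecutive chunks share 200 characters of context.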
Search Phase (ragSearch)
```
query string
  → dbWaitForRagIndexQueue(waitMs)                  # flush pending index writes
  → dbSearchRagChunks(query, candidateLimit)        # FTS5 BM25 candidates
  → createQueryEmbedding(query)                     # embed query
  → dbGetRagEmbeddingsByChunkIds(chunkIds)          # load stored vectors
  → score each candidate (FTS + cosine + overlap)
  → quality filter (minScore, minSemanticScore, minTokenOverlap)
  → applySourceDiversification(maxPerSource)
  → expandWithNeighborChunks(window)                # context window expansion
  → buildGraphPayload(results)                      # knowledge graph
  → applyGraphRerank(results, graph)                # graph-signal reranking
```
Exported Functions
ragSearch(options): Promise<object>
File: src/analysis/rag-search.js
Purpose: Main search entry point. Hybrid FTS + semantic + graph reranking.
Algorithm:
- `dbWaitForRagIndexQueue(waitForIndexMs)` (waits up to 10000 ms).
- Fetch FTS candidates: `dbSearchRagChunks(query, candidateLimit)` (BM25 ranked).
- If mode ≠ “fts” and no candidates: fall back to `dbListRagChunks(candidateLimit)`.
- Build `keywordScores` map: `normalizeRankScore(rank, total) = 1 - rank/(total-1)`.
- If mode ≠ “fts”: `createQueryEmbedding(query)` → load stored embeddings.
- Score each candidate:
  - `overlapScore = countTokenOverlap(queryTokens, candidate) / queryTokens.length`
  - `semanticScore = normalizeCosineScore(cosineSimilarity(queryVec, chunkVec))`
  - FTS mode: `finalScore = max(keywordScore, overlapScore)`
  - Semantic mode: `finalScore = 0.85 × semanticScore + 0.15 × overlapScore`
  - Hybrid mode: `finalScore = semanticWeight × semanticScore + keywordWeight × max(keywordScore, overlapScore × 0.8)`
- Filter: `score >= minScore`, `semanticScore >= minSemanticScore`, `overlapTokens >= minTokenOverlap`.
- `applySourceDiversification(qualityFiltered, diversifyBySource, maxPerSource)`.
- Take top `limit` → seed results.
- `expandWithNeighborChunks(seedResults, neighborWindow)`.
- `buildGraphPayload(expansion.results, options)`.
- `applyGraphRerank(expansion.results, graph, { enabled, weight })`.

Parameters:
| Param | Default | Range |
|---|---|---|
| `query` | required | string |
| `mode` | “hybrid” | “fts” / “semantic” / “hybrid” |
| `limit` | 10 | 1–100 |
| `candidate_limit` | max(60, limit×6) | limit–500 |
| `semantic_weight` | 0.7 | 0–1 |
| `neighbor_window` | 0 | 0–5 |
| `min_score` | mode-dependent | float |
| `min_semantic_score` | mode-dependent | float |
| `min_token_overlap` | mode-dependent | int |
| `diversify_by_source` | true | bool |
| `max_per_source` | 3 | 1–20 |
| `include_graph` | true | bool |
| `graph_rerank` | false | bool |
| `graph_rerank_weight` | 0.2 | 0–0.75 |
| `wait_for_index_ms` | 1000 | 0–10000 |
Returns: { success, query, mode, limit, candidateLimit, weights, quality{…}, neighbor{…}, rerank{…}, degraded, qualityTier, embedding{…}, filters, count, results[], graph }
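The `candidate_limit` default can be expressed directly. The upper clamp to 500 is an inference from `MAX_SEARCH_CANDIDATES` in the configuration table below, not confirmed donor behavior:

```javascript
// Default candidate pool size: max(60, limit×6), assumed to be capped at
// MAX_SEARCH_CANDIDATES (500). The cap is an assumption, not donor-verified.
function defaultCandidateLimit(limit) {
  return Math.min(500, Math.max(60, limit * 6));
}
```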
ragIndex(options): Promise<object>
File: src/analysis/rag-search.js
Purpose: Batch indexing of context, action, and thought records.
Algorithm:
- Fetch rows per `source_types` (default: context, action, thought) up to `limit` (1–5000).
- For each row, call the appropriate `index*Record()` wrapper.
- Skip if `content_unchanged` (hash comparison) and not `force=true`.
- Chunk, embed, store.
- Aggregate embedding summary (providers, models, fallback count).
- Return quality tier: “high” (OpenAI, no fallback), “local” (local-only), “degraded” (fallback used).

Chunk config: `chunk_size` default 1200, overlap default 200, max chunk 6000, max overlap 1000.
Parallelism: `parallel_batches` (1–32), `checkpoint_every` (1–10000).
indexRagSourceDocument(source, options): Promise<object>
File: src/analysis/rag-search.js
Purpose: Index a single document with chunking and embedding.
Algorithm:
- Validate `sourceType ∈ ["context", "action", "thought"]`.
- `normalizeSourceId(sourceId)` → non-negative integer.
- `normalizeToText(body)` → string; skip if empty.
- `sha256(body)` → `contentHash`.
- `dbGetRagDocumentBySource()` → check existing; if unchanged and `!force`, return skipped.
- `dbUpsertRagDocument()` → upsert record.
- `chunkText(body, chunkSize, chunkOverlap)` → array of `{ chunkText, chunkIndex, tokenCount, charOffset }`.
- `dbReplaceRagChunks(documentId, sourceType, chunks)`.
- `createEmbeddings(chunkTexts, options)` → `{ vectors[], modelUsed, dimensions, providerUsed, fallbackUsed }`.
- For each chunk: `dbUpsertRagEmbedding(chunkId, model, dims, vector)`.

Returns: `{ indexed, skipped, reason?, indexedChunks, embedding{providerUsed, modelUsed, fallbackUsed, fallbackReason, dimensions} }`
cosineSimilarity(vectorA, vectorB): number
File: src/analysis/rag-search.js
Purpose: Dot-product cosine similarity.
Algorithm: dot(A,B) / (|A| × |B|). Returns 0 on length mismatch or zero-norm vectors. Uses element-wise Number() coercion for safety.
Notes for rewrite: Pure math, no dependencies. Operates on flat numeric arrays.
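The description above is complete enough to reproduce; this sketch follows the stated behavior (0 on length mismatch or zero-norm input, element-wise `Number()` coercion):

```javascript
// dot(A,B) / (|A| × |B|), per the description above.
// Returns 0 on length mismatch or when either vector has zero norm.
function cosineSimilarity(vectorA, vectorB) {
  if (!Array.isArray(vectorA) || !Array.isArray(vectorB)) return 0;
  if (vectorA.length !== vectorB.length || vectorA.length === 0) return 0;
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < vectorA.length; i++) {
    const a = Number(vectorA[i]); // element-wise coercion, as noted above
    const b = Number(vectorB[i]);
    dot += a * b;
    normA += a * a;
    normB += b * b;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```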
applyGraphRerank(results, graph, options): object
File: src/analysis/rag-search.js
Purpose: Boost result scores using graph edge signals.
Algorithm:
- For each graph edge: add `relationWeight(relation)` to the from-node signal and `weight × 0.8` to the to-node signal.
- Normalize signals by the max signal.
- Rerank score: `(1 − weight) × baseScore + weight × normalizedSignal`.
- Re-sort descending, count `changedRanks`.

Relation weights:
| Relation | Weight |
|---|---|
| action_context | 1.00 |
| thought_action | 0.85 |
| thought_context | 0.85 |
| context_member | 0.75 |
| thought_parent | 0.50 |
| thought_child | 0.50 |
| session_peer | 0.35 |
| default | 0.25 |
Notes for rewrite: DEFAULT_GRAPH_RERANK_WEIGHT = 0.2, max 0.75.
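The rerank steps above can be sketched as follows. The relation weights come from the table; the `nodeId` linkage between results and graph nodes, and the result shape, are assumptions for illustration:

```javascript
// Graph-signal rerank sketch. Weights follow the relation table above;
// result/edge field names (nodeId, from, to, relation) are assumptions.
const RELATION_WEIGHTS = {
  action_context: 1.0,
  thought_action: 0.85,
  thought_context: 0.85,
  context_member: 0.75,
  thought_parent: 0.5,
  thought_child: 0.5,
  session_peer: 0.35,
};

function relationWeight(relation) {
  return RELATION_WEIGHTS[relation] ?? 0.25; // "default" row in the table
}

function applyGraphRerank(results, graph, { weight = 0.2 } = {}) {
  // Accumulate per-node signals: full weight on from-node, 0.8× on to-node.
  const signals = new Map();
  for (const edge of graph.edges) {
    const w = relationWeight(edge.relation);
    signals.set(edge.from, (signals.get(edge.from) ?? 0) + w);
    signals.set(edge.to, (signals.get(edge.to) ?? 0) + w * 0.8);
  }
  const maxSignal = Math.max(0, ...signals.values());
  // Blend base score with the max-normalized graph signal, then re-sort.
  const reranked = results.map((r) => {
    const normalized = maxSignal > 0 ? (signals.get(r.nodeId) ?? 0) / maxSignal : 0;
    return { ...r, score: (1 - weight) * r.score + weight * normalized };
  });
  reranked.sort((a, b) => b.score - a.score);
  return reranked;
}
```

With the default weight of 0.2, a node carrying the strongest graph signal gains at most a 0.2 boost over its base score, which is enough to swap near-ties but not to override a clear FTS/semantic winner.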
buildGraphPayload(results, options): Promise<object>
File: src/analysis/rag-search.js
Purpose: Build knowledge graph of linked nodes (actions ↔ contexts ↔ thoughts).
Algorithm:
- Seed nodes from top results.
- Expand edges:
  - action → context (`action_context`)
  - action → thoughts linked by `action_id` (`thought_action`)
  - thought → parent action, parent context, parent/child thoughts
  - context → member actions/thoughts
  - session peers (if `include_session`)
- Deduplicate edges by composite key `from|to|relation`.
- Stop expanding at `nodeLimit` (default 60, max 500).

Returns: `{ nodeCount, edgeCount, truncated, nodes[], edges[] }`
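The edge deduplication step is simple enough to show directly, keyed on the composite `from|to|relation` string from the text above:

```javascript
// Deduplicate edges by composite key "from|to|relation", keeping first
// occurrence. Edge field names follow the description above.
function dedupeEdges(edges) {
  const seen = new Set();
  const unique = [];
  for (const edge of edges) {
    const key = `${edge.from}|${edge.to}|${edge.relation}`;
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(edge);
  }
  return unique;
}
```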
expandWithNeighborChunks(results, windowSize): Promise<object>
File: src/analysis/rag-search.js
Purpose: Add adjacent chunks to seed results for context continuity.
Algorithm:
- For each seed chunk, fetch neighbors within `window` positions via `dbGetNeighborRagChunks`.
- Neighbor score = `seed.score × 0.85^distance`.
- Dedup by `chunkId`; keep highest score per chunk.
- Re-sort descending.

Notes for rewrite: Max window = 5. Distance penalty 0.85 per step.
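The distance decay can be sketched as follows; the `chunkIndex`/`chunkId` field names are assumptions carried over from the chunk shape described under `indexRagSourceDocument`:

```javascript
// Neighbor scoring sketch: 0.85 decay per position of distance from the
// seed chunk. Field names (chunkId, chunkIndex) are illustrative assumptions.
function scoreNeighbors(seed, neighbors) {
  return neighbors.map((n) => ({
    chunkId: n.chunkId,
    score: seed.score * Math.pow(0.85, Math.abs(n.chunkIndex - seed.chunkIndex)),
  }));
}
```

At the maximum window of 5, a neighbor keeps 0.85⁵ ≈ 44% of the seed score, so far-flung context never outranks the chunk that actually matched.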
normalizeRankScore(rank, total): number
Returns 1 - rank/(total-1). First result (rank=0) → 1.0, last → 0.0.
getSearchMode(mode): string
Validates to “fts” | “semantic” | “hybrid”. Falls back to “hybrid”.
buildFilters(options): object
Returns { sourceType, sessionId, contextId } from options, normalizing types.
Embedding Configuration
| Config key | Default | Source |
|---|---|---|
| `DEFAULT_CHUNK_SIZE` | 1200 chars | rag-constants.js |
| `DEFAULT_CHUNK_OVERLAP` | 200 chars | rag-constants.js |
| `MAX_CHUNK_SIZE` | 6000 | rag-constants.js |
| `MAX_CHUNK_OVERLAP` | 1000 | rag-constants.js |
| `DEFAULT_SEARCH_MODE` | “hybrid” | rag-constants.js |
| `DEFAULT_SEMANTIC_WEIGHT` | 0.7 | rag-constants.js |
| `DEFAULT_CANDIDATE_LIMIT` | 60 | rag-constants.js |
| `MAX_SEARCH_CANDIDATES` | 500 | rag-constants.js |
| `MAX_NEIGHBOR_WINDOW` | 5 | rag-constants.js |
| `DEFAULT_GRAPH_RERANK_WEIGHT` | 0.2 | rag-constants.js |
| `MAX_GRAPH_RERANK_WEIGHT` | 0.75 | rag-constants.js |
| `LOCAL_EMBEDDING_MODEL` | “ams-hash-v2-mp4” | rag-constants.js |
| `LOCAL_EMBEDDING_PROJECTIONS` | 4 magic constants | rag-constants.js |
| `queryEmbeddingCache` | LRU 500 entries, 10 min TTL, 16 MB max | rag-constants.js |
| `AMS_EMBEDDING_PROVIDER` | env var | config.js |
| `AMS_EMBEDDING_MODEL` | env var | config.js |
| `AMS_OPENAI_API_KEY` | env var | config.js |
| `AMS_EMBEDDING_STRICT` | env var | config.js |
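The `queryEmbeddingCache` limits above (500 entries, 10-minute TTL) can be modeled with an insertion-order `Map`. The donor's exact eviction policy and the 16 MB byte cap are not documented here, so this sketch models only entry count and TTL:

```javascript
// TTL + LRU cache sketch matching the listed limits (500 entries, 10 min).
// Eviction details are assumptions; the 16 MB byte cap is not modeled.
class QueryEmbeddingCache {
  constructor(maxEntries = 500, ttlMs = 10 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.map = new Map(); // Map iterates in insertion order → oldest first
  }
  get(query, now = Date.now()) {
    const entry = this.map.get(query);
    if (!entry) return undefined;
    if (now - entry.storedAt > this.ttlMs) {
      this.map.delete(query); // expired: drop and miss
      return undefined;
    }
    this.map.delete(query); // re-insert to refresh recency
    this.map.set(query, entry);
    return entry.vector;
  }
  set(query, vector, now = Date.now()) {
    if (this.map.size >= this.maxEntries && !this.map.has(query)) {
      this.map.delete(this.map.keys().next().value); // evict least recent
    }
    this.map.set(query, { vector, storedAt: now });
  }
}
```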
Hybrid Scoring Formula
```
FTS mode:      finalScore = max(keywordScore, overlapScore)
Semantic mode: finalScore = 0.85 × semanticScore + 0.15 × overlapScore
Hybrid mode:   finalScore = semanticWeight × semanticScore
                          + (1 − semanticWeight) × max(keywordScore, overlapScore × 0.8)
```

Where:
- `keywordScore` = normalized BM25 rank (1.0 = first FTS hit, 0.0 = last)
- `semanticScore` = `normalizeCosineScore(cosine(queryVec, chunkVec))`
- `overlapScore` = token overlap count / query token count
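The three modes can be collapsed into one function; `keywordWeight` is `1 − semanticWeight` in hybrid mode, per the formula:

```javascript
// All three scoring modes from the formulas above in one function.
function finalScore(mode, { keywordScore = 0, semanticScore = 0, overlapScore = 0, semanticWeight = 0.7 }) {
  switch (mode) {
    case "fts":
      return Math.max(keywordScore, overlapScore);
    case "semantic":
      return 0.85 * semanticScore + 0.15 * overlapScore;
    case "hybrid":
    default:
      return (
        semanticWeight * semanticScore +
        (1 - semanticWeight) * Math.max(keywordScore, overlapScore * 0.8)
      );
  }
}
```

Note the asymmetry: in hybrid mode the overlap score is discounted by 0.8 before competing with the keyword score, so a pure token-overlap match needs to beat the BM25 signal by 25% to drive the keyword term.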