Claude API: RAG, Context, Messages — Function Reference
Note: There is no src/claude/rag/ directory. The RAG/retrieval functionality is distributed across: context/ (window management + compression), messages/ (indexing + search), tokens/ (budget + counting + optimization), and analytics/ (conversation reporting).
Exported Functions — context/window.js
class ContextWindowManager
constructor(config)
Parameters:
- defaultStrategy (WindowStrategy): Strategy for message pruning (default: HYBRID)
apply(messages, options): WindowResult
Purpose: Apply a context window strategy to a message array, fitting within token limits. Parameters:
- messages (object[]): Array of { role, content, tokens? } objects
- strategy (WindowStrategy): FIXED | SLIDING | TOKEN_AWARE | PRIORITY | COMPRESSION | HYBRID
- config (object): Strategy-specific configuration (or auto-created from strategy)
- maxTokens (number): Token budget (default from strategy config or 100000)
Returns: WindowResult { messages, originalCount, finalCount, originalTokens, finalTokens, tokensDropped, strategy, utilization, actions, warnings }
Constants: WindowStrategy (context/strategies.js)
| Value | Description |
|-------|-------------|
| FIXED | Keep last N messages |
| SLIDING | Sliding window from most recent |
| TOKEN_AWARE | Fit within token budget greedily |
| PRIORITY | Keep high-priority messages (system, recent) |
| COMPRESSION | Summarize old messages to save tokens |
| HYBRID | Token-aware + priority-based combination |
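To make the TOKEN_AWARE strategy concrete, here is a minimal sketch of the greedy newest-first selection it describes. This is illustrative only, not the actual `context/window.js` implementation; it reuses the ~4 chars/token heuristic documented under `tokens/counter.js` and returns a subset of the WindowResult fields.

```javascript
// Heuristic token counter, mirroring tokens/counter.js (~4 chars per token).
function countTokens(text) {
  return Math.ceil(text.length / 4);
}

// Sketch of TOKEN_AWARE: walk backwards from the most recent message and
// keep messages greedily until the token budget is exhausted.
function applyTokenAwareWindow(messages, maxTokens) {
  const kept = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const tokens = messages[i].tokens ?? countTokens(messages[i].content);
    if (used + tokens > maxTokens) break;
    kept.unshift(messages[i]); // preserve chronological order
    used += tokens;
  }
  return {
    messages: kept,
    originalCount: messages.length,
    finalCount: kept.length,
    finalTokens: used,
    utilization: used / maxTokens,
  };
}

const result = applyTokenAwareWindow(
  [
    { role: 'user', content: 'a'.repeat(400) },      // ~100 tokens
    { role: 'assistant', content: 'b'.repeat(400) }, // ~100 tokens
    { role: 'user', content: 'c'.repeat(40) },       // ~10 tokens
  ],
  150,
);
// Only the two most recent messages fit within the 150-token budget.
```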
Exported Functions — context/compressor.js
class ContextCompressor
compress(messages, options): Promise<CompressResult>
Purpose: Compress a message array to reduce token count by summarizing or truncating old messages. Parameters:
- level (CompressionLevel): NONE | LIGHT | MEDIUM | HEAVY | AGGRESSIVE
- targetTokens (number): Target token count after compression
- preserveSystemMessages (boolean): Always keep system messages (default: true)
Returns: { messages, originalTokens, compressedTokens, compressionRatio, actions }
Constants: CompressionLevel
NONE, LIGHT (summarize oldest 20%), MEDIUM (summarize oldest 40%), HEAVY (summarize oldest 60%), AGGRESSIVE (maximum compression)
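A sketch of how the compression levels above might map to a split of the message array. The fractions for LIGHT/MEDIUM/HEAVY come from the list above; the 0.8 for AGGRESSIVE is an assumption (the source only says "maximum compression"), and the real `ContextCompressor` summarizes the old slice via the Claude API rather than simply splitting.

```javascript
const CompressionLevel = Object.freeze({
  NONE: 0,
  LIGHT: 0.2,
  MEDIUM: 0.4,
  HEAVY: 0.6,
  AGGRESSIVE: 0.8, // assumed fraction; source only says "maximum compression"
});

// Split a conversation into the oldest slice (summarization candidates)
// and the recent slice (kept verbatim).
function splitForCompression(messages, level) {
  const cut = Math.floor(messages.length * level);
  return {
    toSummarize: messages.slice(0, cut),
    toKeep: messages.slice(cut),
  };
}
```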
Exported Functions — context/prioritizer.js
class MessagePrioritizer
prioritize(messages, options): PriorityResult
Purpose: Score messages by importance and sort for context window selection. Parameters:
- systemMessageWeight (number): Boost for system-role messages
- recentMessageBoost (number): Recency scoring weight
- keywordBoosts (object): { word: weight } map for domain keywords
Returns: { messages: [{...message, priority_score}], sorted: boolean }
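The three weights can combine into a per-message score roughly as follows. This is a hypothetical formula for illustration; the actual scoring in `context/prioritizer.js` may differ.

```javascript
// Hypothetical priority score: role boost + recency ramp + keyword boosts.
function scoreMessage(message, index, total, options = {}) {
  const {
    systemMessageWeight = 10,
    recentMessageBoost = 5,
    keywordBoosts = {},
  } = options;
  let score = 0;
  if (message.role === 'system') score += systemMessageWeight;
  score += recentMessageBoost * ((index + 1) / total); // later index = more recent
  for (const [word, weight] of Object.entries(keywordBoosts)) {
    if (message.content.toLowerCase().includes(word.toLowerCase())) score += weight;
  }
  return score;
}

const sysScore = scoreMessage({ role: 'system', content: 'rules' }, 0, 2, {});
const usrScore = scoreMessage(
  { role: 'user', content: 'deploy now' }, 1, 2,
  { keywordBoosts: { deploy: 3 } },
);
```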
Exported Functions — context/tokens.js
estimateConversationTokens(messages): TokenEstimate
Purpose: Estimate total token count for a message array.
Returns: { total, byRole: {user, assistant, system}, average, max }
Note: Uses a ~4 chars/token heuristic; not accurate for non-English text.
calculateUtilization(tokens, maxTokens): number
Purpose: Return 0–1 utilization ratio.
checkTokenFit(messages, maxTokens): boolean
Purpose: Quick check if messages fit within budget.
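A minimal sketch of the two helpers above, assuming the same character-based heuristic; the real `context/tokens.js` implementations may differ in detail.

```javascript
// Heuristic token counter (~4 chars per token), as documented above.
function countTokens(text) {
  return Math.ceil(text.length / 4);
}

// 0-1 utilization ratio; guards against a zero budget.
function calculateUtilization(tokens, maxTokens) {
  return maxTokens > 0 ? tokens / maxTokens : 0;
}

// Quick budget check with an early exit once the budget is exceeded.
function checkTokenFit(messages, maxTokens) {
  let total = 0;
  for (const m of messages) {
    total += m.tokens ?? countTokens(m.content);
    if (total > maxTokens) return false;
  }
  return true;
}
```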
Exported Functions — messages/indexer.js
buildInvertedIndex(): Promise<{success, indexed, error?}>
Purpose: Rebuild full-text inverted index from conversation_messages DB table. Processes in 1000-message batches. Stores word → Set<messageId> in memory.
Note: The index is in-memory only and is lost on restart; consider SQLite FTS5 for production.
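The word → Set&lt;messageId&gt; structure described above can be sketched as follows. This is a simplified stand-in for `messages/indexer.js` (no DB batching, simple alphanumeric tokenization assumed):

```javascript
// In-memory inverted index: lowercase word -> Set of message IDs.
const invertedIndex = new Map();

function indexMessage(messageId, content) {
  const words = content.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    if (!invertedIndex.has(word)) invertedIndex.set(word, new Set());
    invertedIndex.get(word).add(messageId);
  }
  return { success: true, words: new Set(words).size };
}

function removeFromIndex(messageId) {
  for (const [word, ids] of invertedIndex) {
    ids.delete(messageId);
    if (ids.size === 0) invertedIndex.delete(word); // drop empty postings
  }
  return { success: true };
}

indexMessage('m1', 'hello world');
indexMessage('m2', 'hello again');
```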
indexMessage(messageId, content): Promise<{success, words}>
Purpose: Index a single new message into the existing inverted index.
removeFromIndex(messageId): Promise<{success}>
Purpose: Remove all index entries for a message.
searchIndex(query, options): Promise<SearchResult[]>
Purpose: Search inverted index for messages matching query. Parameters:
- query (string): Space-separated search terms
- limit (number): Max results (default: 20)
- operator (string): "and" | "or" (whether all or any terms must match)
Returns: [{ messageId, score, termMatches }] sorted by score.
getIndexStats(): IndexStats
Returns: { total_messages, indexed_words, last_index_time, is_indexing }
scheduleReindex(delayMs): void
Purpose: Schedule a full index rebuild after a delay.
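The "and"/"or" matching semantics of searchIndex can be sketched against a plain word → Set&lt;messageId&gt; map. This sketch takes the index as an explicit argument and scores by fraction of matched terms; the real scoring in `messages/indexer.js` may differ.

```javascript
// Search an inverted index (Map<word, Set<messageId>>) for matching messages.
function searchIndex(index, query, { operator = 'or', limit = 20 } = {}) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const hits = new Map(); // messageId -> number of matched terms
  for (const term of terms) {
    for (const id of index.get(term) ?? []) {
      hits.set(id, (hits.get(id) ?? 0) + 1);
    }
  }
  return [...hits.entries()]
    .filter(([, n]) => (operator === 'and' ? n === terms.length : n > 0))
    .map(([messageId, termMatches]) => ({
      messageId,
      termMatches,
      score: termMatches / terms.length, // fraction of query terms matched
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const idx = new Map([
  ['deploy', new Set(['m1', 'm2'])],
  ['staging', new Set(['m2'])],
]);
const both = searchIndex(idx, 'deploy staging', { operator: 'and' });
const any = searchIndex(idx, 'deploy staging', { operator: 'or' });
```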
Exported Functions — messages/search.js
searchMessages(query, options): Promise<MessageSearchResult[]>
Purpose: Full-text search across conversation messages. Parameters:
- query (string): Search query
- conversationId (string|null): Scope to a single conversation
- role (string|null): Filter by role
- dateFrom (string|null): ISO date range start
- dateTo (string|null): ISO date range end
- limit (number): Max results (default: 20)
- offset (number): Pagination offset
Returns: [{ message_id, conversation_id, role, content_snippet, relevance_score, created_at }]
searchByEmbedding(embedding, options): Promise<MessageSearchResult[]>
Purpose: Semantic search using vector embeddings (if embeddings are stored). Note: Placeholder; actual embedding similarity is not implemented, and the function falls back to text search.
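The content_snippet field in the results above implies some excerpt logic. A hypothetical helper (not part of the source; name and radius parameter are invented for illustration) might window the text around the first query-term match:

```javascript
// Hypothetical snippet builder: return a short window of text around the
// first occurrence of a query term, with '...' markers for truncation.
function makeSnippet(content, term, radius = 30) {
  const i = content.toLowerCase().indexOf(term.toLowerCase());
  if (i === -1) return content.slice(0, 2 * radius);
  const start = Math.max(0, i - radius);
  const end = Math.min(content.length, i + term.length + radius);
  return (start > 0 ? '...' : '') + content.slice(start, end) +
         (end < content.length ? '...' : '');
}

const snippet = makeSnippet('x'.repeat(100) + 'deploy' + 'y'.repeat(100), 'deploy');
```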
Exported Functions — messages/store.js
storeMessage(conversationId, message): Promise<MessageRecord>
getMessage(messageId): Promise<MessageRecord|null>
listMessages(conversationId, options): Promise<MessageRecord[]>
Parameters: { limit, offset, role?, dateFrom?, dateTo?, branchId? }
updateMessage(messageId, updates): Promise<MessageRecord|null>
deleteMessage(messageId): Promise<boolean>
getConversationContext(conversationId, options): Promise<ContextResult>
Purpose: Retrieve recent messages within a token budget, optionally applying window strategy. Parameters:
- maxTokens (number): Token budget
- strategy (WindowStrategy): Context selection strategy
- includeSystem (boolean): Include system messages
Returns: { messages, tokenCount, truncated }
Exported Functions — messages/filters.js
filterByRole(messages, roles): MessageRecord[]
Purpose: Filter messages to specified roles.
filterByDateRange(messages, from, to): MessageRecord[]
filterByTokenCount(messages, maxTokens): MessageRecord[]
Purpose: Trim message list to fit within token budget (FIFO from oldest).
deduplicateMessages(messages): MessageRecord[]
Purpose: Remove duplicate messages by content hash.
Exported Functions — tokens/counter.js
countTokens(text): number
Purpose: Estimate token count for a string.
Note: Heuristic is Math.ceil(text.length / 4). Use tiktoken or the Anthropic tokenizer for accuracy.
countMessageTokens(message): number
Purpose: Count tokens for a { role, content } message object including role overhead.
countConversationTokens(messages): ConversationTokenCount
Returns: { total, messages: number[], byRole: {user, assistant, system} }
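The three counters can be sketched together as below. The per-message role overhead of 4 tokens is an assumed value, not taken from `tokens/counter.js`, and the byRole accumulator assumes only the standard user/assistant/system roles.

```javascript
// Heuristic string counter, as documented above.
function countTokens(text) {
  return Math.ceil(text.length / 4);
}

// Per-message count including role overhead (4 is an assumed overhead).
function countMessageTokens(message, roleOverhead = 4) {
  return countTokens(message.content) + roleOverhead;
}

// Aggregate counts for a conversation, grouped by role.
function countConversationTokens(messages) {
  const perMessage = messages.map((m) => countMessageTokens(m));
  const byRole = { user: 0, assistant: 0, system: 0 };
  messages.forEach((m, i) => { byRole[m.role] += perMessage[i]; });
  return {
    total: perMessage.reduce((a, b) => a + b, 0),
    messages: perMessage,
    byRole,
  };
}

const counts = countConversationTokens([
  { role: 'user', content: 'abcd' },          // 1 + 4 overhead = 5
  { role: 'assistant', content: 'abcdefgh' }, // 2 + 4 overhead = 6
]);
```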
Exported Functions — tokens/budget.js
class BudgetManager
constructor(config)
Parameters:
- dailyBudget (number|null): Max tokens per day
- hourlyBudget (number|null): Max tokens per hour
- perRequestBudget (number|null): Max tokens per request
recordUsage(usage): UsageResult
Parameters: { model, inputTokens, outputTokens, operation }
Returns: { allowed: boolean, budgetStatus: {daily, hourly, perRequest}, reason? }
getUsageReport(): UsageReport
Returns: { period, total, byModel, byOperation, budgetRemaining }
resetDailyBudget(): void
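A stripped-down sketch of the budget-enforcement flow, covering only the daily and per-request limits; hourly windows, per-model tracking, and persistence are omitted, and (as noted below under Notes for Rewrite) resets only happen on explicit calls.

```javascript
// Minimal budget enforcement: per-request cap plus a daily running total.
class BudgetManager {
  constructor({ dailyBudget = null, perRequestBudget = null } = {}) {
    this.dailyBudget = dailyBudget;
    this.perRequestBudget = perRequestBudget;
    this.dailyUsed = 0;
  }

  recordUsage({ inputTokens, outputTokens }) {
    const total = inputTokens + outputTokens;
    if (this.perRequestBudget !== null && total > this.perRequestBudget) {
      return { allowed: false, reason: 'per-request budget exceeded' };
    }
    if (this.dailyBudget !== null && this.dailyUsed + total > this.dailyBudget) {
      return { allowed: false, reason: 'daily budget exceeded' };
    }
    this.dailyUsed += total; // only count usage that was allowed
    return { allowed: true };
  }

  resetDailyBudget() {
    this.dailyUsed = 0; // no automatic cron reset; must be called explicitly
  }
}

const bm = new BudgetManager({ dailyBudget: 100, perRequestBudget: 80 });
```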
Exported Functions — tokens/optimizer.js
optimizePrompt(messages, options): Promise<OptimizedPrompt>
Purpose: Reduce token count of message array while preserving meaning. Parameters:
- targetTokens (number): Target token budget
- strategy (string): "truncate" | "summarize" | "filter"
Returns: { messages, originalTokens, optimizedTokens, reduction }
estimateCost(tokens, model): CostEstimate
Purpose: Calculate API cost for a token count.
Returns: { inputCost, outputCost, total, currency: "USD" }
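A sketch of the cost calculation shape. The per-million-token prices and the model name below are illustrative placeholders, not real Anthropic pricing, and treating the tokens argument as an { inputTokens, outputTokens } object is an assumption about the signature.

```javascript
// Placeholder price table: USD per million tokens. NOT real pricing.
const PRICES = {
  'example-model': { inputPerMTok: 3.0, outputPerMTok: 15.0 },
};

function estimateCost({ inputTokens, outputTokens }, model) {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  const inputCost = (inputTokens / 1_000_000) * p.inputPerMTok;
  const outputCost = (outputTokens / 1_000_000) * p.outputPerMTok;
  return { inputCost, outputCost, total: inputCost + outputCost, currency: 'USD' };
}

const cost = estimateCost(
  { inputTokens: 1_000_000, outputTokens: 0 },
  'example-model',
);
```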
Exported Functions — tokens/strategies.js
applyTokenStrategy(messages, strategy, budget): Promise<MessageRecord[]>
Purpose: Apply a named strategy to fit messages within budget.
Constants: TokenStrategy
AGGRESSIVE_TRUNCATE, SMART_SUMMARIZE, PRIORITY_FILTER, ROLLING_WINDOW
Exported Functions — analytics/reports.js
generateSummaryReport(conversationId): Promise<SummaryReport>
Purpose: Aggregated conversation report: engagement, sentiment, topics, quality, health, key insights, action items. Returns: Comprehensive report object with nested sections.
generateDetailedReport(conversationId): Promise<DetailedReport>
Purpose: Full analytics including turn-taking, response times, token distributions.
generateComparativeReport(conversationIds): Promise<ComparativeReport>
Purpose: Cross-conversation comparison.
exportReport(report, format): string|Buffer
Purpose: Export report as JSON or Markdown.
Parameters: format = "json" | "markdown"
Exported Functions — analytics/conversations.js
calculateEngagementMetrics(conversationId): Promise<EngagementMetrics>
Returns: { total_messages, user_messages, assistant_messages, total_tokens, conversation_duration_minutes, avg_response_time_seconds, messages_per_minute }
analyzeSentiment(conversationId): Promise<SentimentAnalysis>
Returns: { overall_sentiment: positive|negative|neutral, sentiment_progression: [], volatility }
extractTopics(conversationId): Promise<TopicAnalysis>
Returns: { primary_topic, topics: [{name, frequency, weight}] }
getQualityMetrics(conversationId): Promise<QualityMetrics>
Returns: { overall_quality, coherence, relevance, completeness }
analyzeTurnTaking(conversationId): Promise<TurnTakingAnalysis>
Exported Functions — analytics/insights.js
extractKeyInsights(conversationId): Promise<InsightsResult>
Returns: { decisions_made, follow_up_suggestions, key_concepts }
extractActionItems(conversationId): Promise<ActionItemsResult>
Returns: { action_items: [{description, priority, assignee?}] }
getConversationHealth(conversationId): Promise<HealthResult>
Returns: { overall_health: 0-1, recommendations: [] }
Key Data Structures
| Structure | Fields | Purpose |
|---|---|---|
| WindowResult | messages, originalCount, finalCount, originalTokens, finalTokens, strategy, utilization | Context window output |
| MessageRecord | message_id, conversation_id, role, content, tokens, created_at, metadata | DB message entity |
| IndexStats | total_messages, indexed_words, last_index_time, is_indexing | Search index state |
| BudgetManager | dailyBudget, hourlyBudget, perRequestBudget | Token cost management |
External Dependencies
- ../../db/index.js: getDb(), withTransaction()
- No Anthropic SDK calls; this module group is a local processing layer.
Notes for Rewrite
- Inverted index is fully in-memory; SQLite FTS5 tables already exist in the schema and should be used instead.
- Token counting uses a 4-chars/token heuristic; integrate @anthropic-ai/tokenizer for accuracy.
- searchByEmbedding is a placeholder; actual vector search (pgvector, SQLite-vss) is not implemented.
- ContextCompressor performs summarization via the Claude API internally; ensure an API key is available.
- BudgetManager resets daily/hourly counters only when explicitly called; there is no automatic cron reset.
- Analytics functions (calculateEngagementMetrics, etc.) each run their own DB queries; these could be unified.