Claude API: RAG, Context, Messages — Function Reference

Note: There is no src/claude/rag/ directory. The RAG/retrieval functionality is distributed across: context/ (window management + compression), messages/ (indexing + search), tokens/ (budget + counting + optimization), and analytics/ (conversation reporting).


Exported Functions — context/window.js

class ContextWindowManager

constructor(config)

Parameters:

  • defaultStrategy (WindowStrategy): Strategy for message pruning (default: HYBRID)

apply(messages, options): WindowResult

Purpose: Apply a context window strategy to a message array, fitting within token limits.

Parameters:

  • messages (object[]): Array of { role, content, tokens? } objects
  • strategy (WindowStrategy): FIXED | SLIDING | TOKEN_AWARE | PRIORITY | COMPRESSION | HYBRID
  • config (object): Strategy-specific configuration (or auto-created from strategy)
  • maxTokens (number): Token budget (default from strategy config or 100000)

Returns: WindowResult { messages, originalCount, finalCount, originalTokens, finalTokens, tokensDropped, strategy, utilization, actions, warnings }
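
The shape of a WindowResult is easier to see with a concrete pass. The following is a minimal sketch of a TOKEN_AWARE-style selection, not the module's actual implementation. The helper names (`estimateTokens`, `applyTokenAware`), the rule that system messages are kept first, and the strings placed in `actions` and `warnings` are all assumptions made for illustration.

```javascript
// Hypothetical helper: use the message's own count, else ~4 characters per token.
const estimateTokens = (msg) => msg.tokens ?? Math.ceil(String(msg.content).length / 4);

// Sketch of a TOKEN_AWARE pass: keep system messages, then add the newest
// non-system messages until the budget is exhausted.
function applyTokenAware(messages, { maxTokens = 100000 } = {}) {
  const originalTokens = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  const keep = new Set();
  let used = 0;

  // Assumption: system messages are always retained.
  messages.forEach((m, i) => {
    if (m.role === "system") {
      keep.add(i);
      used += estimateTokens(m);
    }
  });

  // Walk from newest to oldest and stop at the first message that does not fit.
  for (let i = messages.length - 1; i >= 0; i--) {
    if (keep.has(i)) continue;
    const cost = estimateTokens(messages[i]);
    if (used + cost > maxTokens) break;
    keep.add(i);
    used += cost;
  }

  const kept = messages.filter((_, i) => keep.has(i)); // preserves original order
  return {
    messages: kept,
    originalCount: messages.length,
    finalCount: kept.length,
    originalTokens,
    finalTokens: used,
    tokensDropped: originalTokens - used,
    strategy: "TOKEN_AWARE",
    utilization: used / maxTokens,
    actions: kept.length < messages.length ? ["dropped_oldest"] : [],
    warnings: used > maxTokens ? ["system messages alone exceed maxTokens"] : [],
  };
}

const result = applyTokenAware(
  [
    { role: "system", content: "You are helpful.", tokens: 5 },
    { role: "user", content: "First question", tokens: 40 },
    { role: "assistant", content: "First answer", tokens: 40 },
    { role: "user", content: "Second question", tokens: 30 },
  ],
  { maxTokens: 80 }
);
console.log(result.finalCount, result.finalTokens, result.utilization);
```

With an 80-token budget, the oldest user message is dropped and the system message plus the two newest messages remain.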


Constants: WindowStrategy (context/strategies.js)

| Value | Description |
|-------|-------------|
| FIXED | Keep last N messages |
| SLIDING | Sliding window from most recent |
| TOKEN_AWARE | Fit within token budget greedily |
| PRIORITY | Keep high-priority messages (system, recent) |
| COMPRESSION | Summarize old messages to save tokens |
| HYBRID | Token-aware + priority-based combination |


Exported Functions — context/compressor.js

class ContextCompressor

compress(messages, options): Promise<CompressResult>

Purpose: Compress a message array to reduce token count by summarizing or truncating old messages.

Parameters:

  • level (CompressionLevel): NONE | LIGHT | MEDIUM | HEAVY | AGGRESSIVE
  • targetTokens (number): Target token count after compression
  • preserveSystemMessages (boolean): Always keep system messages (default: true)

Returns: { messages, originalTokens, compressedTokens, compressionRatio, actions }

Constants: CompressionLevel

NONE, LIGHT (summarize oldest 20%), MEDIUM (summarize oldest 40%), HEAVY (summarize oldest 60%), AGGRESSIVE (maximum compression)
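
To illustrate how a level could translate into a selection of messages, here is a small sketch that only plans which messages would be summarized; the summarization step itself is omitted. `planCompression` and `SUMMARIZE_FRACTION` are hypothetical names, and the 1.0 fraction for AGGRESSIVE is an assumption, since the description above only says "maximum compression".

```javascript
// Fractions for LIGHT/MEDIUM/HEAVY follow the level descriptions.
// The AGGRESSIVE value is an assumption.
const SUMMARIZE_FRACTION = { NONE: 0, LIGHT: 0.2, MEDIUM: 0.4, HEAVY: 0.6, AGGRESSIVE: 1.0 };

// Hypothetical helper: split messages into the oldest slice to summarize and the rest to keep.
function planCompression(messages, { level = "MEDIUM", preserveSystemMessages = true } = {}) {
  const fraction = SUMMARIZE_FRACTION[level];
  if (fraction === undefined) throw new Error(`Unknown compression level: ${level}`);

  // System messages are excluded from the candidates when they must be preserved.
  const candidates = messages.filter((m) => !(preserveSystemMessages && m.role === "system"));
  const cutoff = Math.floor(candidates.length * fraction);
  const toSummarize = new Set(candidates.slice(0, cutoff));

  return {
    toSummarize: messages.filter((m) => toSummarize.has(m)),
    toKeep: messages.filter((m) => !toSummarize.has(m)),
  };
}

const history = [
  { role: "system", content: "Be concise." },
  ...Array.from({ length: 10 }, (_, i) => ({
    role: i % 2 ? "assistant" : "user",
    content: `turn ${i}`,
  })),
];
const plan = planCompression(history, { level: "MEDIUM" });
console.log(plan.toSummarize.length, plan.toKeep.length);
```

For ten non-system messages at MEDIUM, the four oldest are marked for summarization and the system message stays in the kept set.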


Exported Functions — context/prioritizer.js

class MessagePrioritizer

prioritize(messages, options): PriorityResult

Purpose: Score messages by importance and sort for context window selection.

Parameters:

  • systemMessageWeight (number): Boost for system role messages
  • recentMessageBoost (number): Recency scoring weight
  • keywordBoosts (object): {word: weight} for domain keywords

Returns: { messages: [{...message, priority_score}], sorted: boolean }
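
As a rough illustration of how the three options might combine into a single score, consider the sketch below. The additive formula, the linear recency ramp, the substring keyword match, and the default weights are assumptions. The reference specifies only the option names and the return shape.

```javascript
// Sketch only: the scoring formula and default weights are assumptions.
function prioritize(messages, options = {}) {
  const { systemMessageWeight = 10, recentMessageBoost = 5, keywordBoosts = {} } = options;
  const lastIndex = Math.max(messages.length - 1, 1);

  const scored = messages.map((message, index) => {
    let score = 0;
    if (message.role === "system") score += systemMessageWeight;
    // Recency rises linearly from 0 (oldest) to recentMessageBoost (newest).
    score += recentMessageBoost * (index / lastIndex);
    // Add each keyword's weight once if the word appears in the content.
    const text = String(message.content).toLowerCase();
    for (const [word, weight] of Object.entries(keywordBoosts)) {
      if (text.includes(word.toLowerCase())) score += weight;
    }
    return { ...message, priority_score: score };
  });

  // Highest score first.
  scored.sort((a, b) => b.priority_score - a.priority_score);
  return { messages: scored, sorted: true };
}

const ranked = prioritize(
  [
    { role: "system", content: "Follow the style guide." },
    { role: "user", content: "What is the refund policy?" },
    { role: "assistant", content: "Refunds take five days." },
    { role: "user", content: "Thanks!" },
  ],
  { keywordBoosts: { refund: 3 } }
);
console.log(ranked.messages.map((m) => [m.role, m.priority_score.toFixed(2)]));
```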


Exported Functions — context/tokens.js

estimateConversationTokens(messages): TokenEstimate

Purpose: Estimate total token count for a message array.

Returns: { total, byRole: {user, assistant, system}, average, max }

Notes for rewrite: Uses ~4 chars/token heuristic. Not accurate for non-English text.

calculateUtilization(tokens, maxTokens): number

Purpose: Return 0–1 utilization ratio.

checkTokenFit(messages, maxTokens): boolean

Purpose: Quick check if messages fit within budget.
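
The three helpers in this module fit together as sketched below, using the roughly 4 characters per token heuristic. This is an illustrative reimplementation rather than the module's source. The helper name `approxTokens` is hypothetical, and clamping utilization at 1 is an assumption drawn from the documented 0–1 range.

```javascript
// ~4 characters per token; inaccurate for non-English text.
const approxTokens = (text) => Math.ceil(String(text).length / 4);

function estimateConversationTokens(messages) {
  const byRole = { user: 0, assistant: 0, system: 0 };
  let total = 0;
  let max = 0;
  for (const { role, content } of messages) {
    const n = approxTokens(content);
    total += n;
    max = Math.max(max, n);
    if (Object.prototype.hasOwnProperty.call(byRole, role)) byRole[role] += n;
  }
  const average = messages.length ? total / messages.length : 0;
  return { total, byRole, average, max };
}

// Assumption: the ratio is clamped so it never exceeds 1.
const calculateUtilization = (tokens, maxTokens) =>
  maxTokens > 0 ? Math.min(tokens / maxTokens, 1) : 0;

const checkTokenFit = (messages, maxTokens) =>
  estimateConversationTokens(messages).total <= maxTokens;

const convo = [
  { role: "system", content: "Be brief." },          // 9 chars  -> 3 tokens
  { role: "user", content: "Summarize this file." }, // 20 chars -> 5 tokens
];
const estimate = estimateConversationTokens(convo);
console.log(estimate, calculateUtilization(estimate.total, 16), checkTokenFit(convo, 7));
```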


Exported Functions — messages/indexer.js

buildInvertedIndex(): Promise<{success, indexed, error?}>

Purpose: Rebuild full-text inverted index from conversation_messages DB table. Processes in 1000-message batches. Stores word → Set<messageId> in memory.

Notes for rewrite: Index is in-memory only; lost on restart. Consider SQLite FTS5 for production.

indexMessage(messageId, content): Promise<{success, words}>

Purpose: Index a single new message into the existing inverted index.

removeFromIndex(messageId): Promise<{success}>

Purpose: Remove all index entries for a message.

searchIndex(query, options): Promise<SearchResult[]>

Purpose: Search inverted index for messages matching query.

Parameters:

  • query (string): Space-separated search terms
  • limit (number): Max results (default: 20)
  • operator (string): "and" | "or" (whether all terms or any term must match)

Returns: [{ messageId, score, termMatches }] sorted by score.
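
The word → Set<messageId> structure and the "and"/"or" operators can be sketched in a few lines. This version is synchronous and self-contained, whereas the documented functions return Promises. The tokenizer and the scoring rule (fraction of query terms matched) are assumptions, since the reference does not say how scores are computed.

```javascript
// Minimal in-memory inverted index: word -> Set of message ids.
const index = new Map();

// Assumed tokenizer: lowercase, split on non-alphanumeric characters.
const tokenize = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

function indexMessage(messageId, content) {
  const words = new Set(tokenize(content));
  for (const word of words) {
    if (!index.has(word)) index.set(word, new Set());
    index.get(word).add(messageId);
  }
  return { success: true, words: words.size };
}

function searchIndex(query, { limit = 20, operator = "and" } = {}) {
  const terms = [...new Set(tokenize(query))];
  const matches = new Map(); // messageId -> number of query terms matched
  for (const term of terms) {
    for (const id of index.get(term) ?? []) {
      matches.set(id, (matches.get(id) ?? 0) + 1);
    }
  }
  return [...matches]
    .filter(([, count]) => operator === "or" || count === terms.length)
    .map(([messageId, termMatches]) => ({
      messageId,
      score: termMatches / terms.length, // assumed scoring: fraction of terms matched
      termMatches,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

indexMessage("m1", "Rotate the API key every month");
indexMessage("m2", "The API returned a 429 error");
indexMessage("m3", "Store the key in a vault");

const allTerms = searchIndex("api key");
const anyTerm = searchIndex("api key", { operator: "or" });
console.log(allTerms, anyTerm);
```

With "and", only m1 contains both terms. With "or", all three messages match and m1 ranks first.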

getIndexStats(): IndexStats

Returns: { total_messages, indexed_words, last_index_time, is_indexing }

scheduleReindex(delayMs): void

Purpose: Schedule a full index rebuild after a delay.


Exported Functions — messages/search.js

searchMessages(query, options): Promise<MessageSearchResult[]>

Purpose: Full-text search across conversation messages.

Parameters:

  • query (string): Search query
  • conversationId (string | null): Scope to conversation
  • role (string | null): Filter by role
  • dateFrom (string | null): ISO date range start
  • dateTo (string | null): ISO date range end
  • limit (number): Max results (default: 20)
  • offset (number): Pagination

Returns: [{ message_id, conversation_id, role, content_snippet, relevance_score, created_at }]

searchByEmbedding(embedding, options): Promise<MessageSearchResult[]>

Purpose: Semantic search using vector embeddings (if embeddings are stored).

Notes for rewrite: Placeholder. Actual embedding similarity is not implemented; the function falls back to text search.


Exported Functions — messages/store.js

storeMessage(conversationId, message): Promise<MessageRecord>

getMessage(messageId): Promise<MessageRecord|null>

listMessages(conversationId, options): Promise<MessageRecord[]>

Parameters: { limit, offset, role?, dateFrom?, dateTo?, branchId? }

updateMessage(messageId, updates): Promise<MessageRecord|null>

deleteMessage(messageId): Promise<boolean>

getConversationContext(conversationId, options): Promise<ContextResult>

Purpose: Retrieve recent messages within a token budget, optionally applying window strategy.

Parameters:

  • maxTokens (number): Token budget
  • strategy (WindowStrategy): Context selection strategy
  • includeSystem (boolean): Include system messages

Returns: { messages, tokenCount, truncated }


Exported Functions — messages/filters.js

filterByRole(messages, roles): MessageRecord[]

Purpose: Filter messages to specified roles.

filterByDateRange(messages, from, to): MessageRecord[]

filterByTokenCount(messages, maxTokens): MessageRecord[]

Purpose: Trim message list to fit within token budget (FIFO from oldest).

deduplicateMessages(messages): MessageRecord[]

Purpose: Remove duplicate messages by content hash.
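
The two less obvious filters, oldest-first trimming and hash-based deduplication, might look roughly like the sketch below. It uses Node's built-in `crypto` module. The choice of SHA-256 and the `tokensOf` fallback heuristic are assumptions, as the reference says only "content hash" and "FIFO from oldest".

```javascript
const { createHash } = require("node:crypto");

// Use the record's own token count, else ~4 characters per token.
const tokensOf = (m) => m.tokens ?? Math.ceil(String(m.content).length / 4);

// Drop messages from the front (oldest first) until the rest fit the budget.
function filterByTokenCount(messages, maxTokens) {
  const kept = [...messages];
  let total = kept.reduce((sum, m) => sum + tokensOf(m), 0);
  while (kept.length > 0 && total > maxTokens) {
    total -= tokensOf(kept.shift());
  }
  return kept;
}

// Keep the first occurrence of each distinct content hash.
function deduplicateMessages(messages) {
  const seen = new Set();
  return messages.filter((m) => {
    const digest = createHash("sha256").update(String(m.content)).digest("hex");
    if (seen.has(digest)) return false;
    seen.add(digest);
    return true;
  });
}

const log = [
  { role: "user", content: "hello", tokens: 10 },
  { role: "assistant", content: "hi there", tokens: 20 },
  { role: "user", content: "hello", tokens: 10 },
  { role: "assistant", content: "how can I help?", tokens: 30 },
];
const unique = deduplicateMessages(log);
const trimmed = filterByTokenCount(log, 45);
console.log(unique.length, trimmed.map((m) => m.content));
```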


Exported Functions — tokens/counter.js

countTokens(text): number

Purpose: Estimate token count for a string.

Notes for rewrite: Heuristic: Math.ceil(text.length / 4). Use tiktoken or Anthropic tokenizer for accuracy.

countMessageTokens(message): number

Purpose: Count tokens for a { role, content } message object including role overhead.

countConversationTokens(messages): ConversationTokenCount

Returns: { total, messages: number[], byRole: {user, assistant, system} }
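
A compact sketch of how the three counters could relate to one another follows. `countTokens` uses the documented Math.ceil(text.length / 4) heuristic. The per-message role overhead of 4 tokens (`ROLE_OVERHEAD_TOKENS`) is a hypothetical value chosen for illustration, since the reference does not state the actual overhead.

```javascript
// Documented heuristic; swap in a real tokenizer for accurate counts.
const countTokens = (text) => Math.ceil(text.length / 4);

// Hypothetical fixed overhead per message for the role and framing.
const ROLE_OVERHEAD_TOKENS = 4;

const countMessageTokens = (message) => countTokens(message.content) + ROLE_OVERHEAD_TOKENS;

function countConversationTokens(messages) {
  const perMessage = messages.map((m) => countMessageTokens(m));
  const byRole = { user: 0, assistant: 0, system: 0 };
  messages.forEach((m, i) => {
    if (Object.prototype.hasOwnProperty.call(byRole, m.role)) byRole[m.role] += perMessage[i];
  });
  const total = perMessage.reduce((sum, n) => sum + n, 0);
  return { total, messages: perMessage, byRole };
}

const counts = countConversationTokens([
  { role: "user", content: "12345678" }, // 2 tokens + 4 overhead = 6
  { role: "assistant", content: "abc" }, // 1 token  + 4 overhead = 5
]);
console.log(counts);
```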


Exported Functions — tokens/budget.js

class BudgetManager(config)

Parameters:

  • dailyBudget (number | null): Max tokens per day
  • hourlyBudget (number | null): Max tokens per hour
  • perRequestBudget (number | null): Max tokens per request

recordUsage(usage): UsageResult

Parameters: { model, inputTokens, outputTokens, operation }

Returns: { allowed: boolean, budgetStatus: {daily, hourly, perRequest}, reason? }

getUsageReport(): UsageReport

Returns: { period, total, byModel, byOperation, budgetRemaining }

resetDailyBudget(): void
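
The interplay between the three limits and the explicit reset can be sketched as below. This is a simplified stand-in, not the real class. Several details are assumptions: a null limit is treated as "no limit", rejected requests are not counted toward usage, `budgetStatus` reports remaining tokens, and the `reason` strings are invented. The model name "example-model" is a placeholder.

```javascript
class BudgetManager {
  constructor({ dailyBudget = null, hourlyBudget = null, perRequestBudget = null } = {}) {
    this.limits = { daily: dailyBudget, hourly: hourlyBudget, perRequest: perRequestBudget };
    this.used = { daily: 0, hourly: 0 };
  }

  // `model` and `operation` are accepted but unused in this sketch.
  recordUsage({ model, inputTokens, outputTokens, operation }) {
    const requestTokens = inputTokens + outputTokens;
    const projected = {
      daily: this.used.daily + requestTokens,
      hourly: this.used.hourly + requestTokens,
      perRequest: requestTokens,
    };
    // A null limit means "no limit" for that period.
    const exceeded = Object.keys(projected).find(
      (k) => this.limits[k] !== null && projected[k] > this.limits[k]
    );
    if (!exceeded) {
      this.used.daily = projected.daily;
      this.used.hourly = projected.hourly;
    }
    const remaining = (k) => (this.limits[k] === null ? null : this.limits[k] - this.used[k]);
    return {
      allowed: !exceeded,
      budgetStatus: {
        daily: remaining("daily"),
        hourly: remaining("hourly"),
        perRequest: this.limits.perRequest,
      },
      ...(exceeded ? { reason: `${exceeded} budget exceeded` } : {}),
    };
  }

  // Counters reset only when this is called explicitly.
  resetDailyBudget() {
    this.used.daily = 0;
  }
}

const budget = new BudgetManager({ dailyBudget: 1000, perRequestBudget: 600 });
const call = (inputTokens, outputTokens) =>
  budget.recordUsage({ model: "example-model", inputTokens, outputTokens, operation: "chat" });

const first = call(300, 200);  // 500 tokens, within all limits
const second = call(400, 200); // would bring the daily total to 1100
budget.resetDailyBudget();
const third = call(400, 200);  // allowed again after the explicit reset
console.log(first, second, third);
```

Because nothing resets the counters automatically, a caller that never invokes `resetDailyBudget()` would keep rejecting requests once the daily limit is reached.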


Exported Functions — tokens/optimizer.js

optimizePrompt(messages, options): Promise<OptimizedPrompt>

Purpose: Reduce token count of message array while preserving meaning.

Parameters:

  • targetTokens (number): Target token budget
  • strategy (string): "truncate" | "summarize" | "filter"

Returns: { messages, originalTokens, optimizedTokens, reduction }

estimateCost(tokens, model): CostEstimate

Purpose: Calculate API cost for a token count.

Returns: { inputCost, outputCost, total, currency: "USD" }
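
The cost arithmetic is simple enough to sketch, with two caveats. The reference does not specify the shape of `tokens`, so this example assumes `{ input, output }`. The price table is entirely hypothetical ("example-model" and its rates are placeholders, not published prices).

```javascript
// Hypothetical prices in USD per million tokens; replace with real rates.
const PRICE_PER_MILLION = {
  "example-model": { input: 3, output: 15 },
};

// Assumes `tokens` is shaped as { input, output }.
function estimateCost(tokens, model) {
  const price = PRICE_PER_MILLION[model];
  if (!price) throw new Error(`No pricing configured for model: ${model}`);
  const inputCost = (tokens.input / 1e6) * price.input;
  const outputCost = (tokens.output / 1e6) * price.output;
  return { inputCost, outputCost, total: inputCost + outputCost, currency: "USD" };
}

const cost = estimateCost({ input: 2000000, output: 1000000 }, "example-model");
console.log(cost);
```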


Exported Functions — tokens/strategies.js

applyTokenStrategy(messages, strategy, budget): Promise<MessageRecord[]>

Purpose: Apply a named strategy to fit messages within budget.

Constants: TokenStrategy

AGGRESSIVE_TRUNCATE, SMART_SUMMARIZE, PRIORITY_FILTER, ROLLING_WINDOW


Exported Functions — analytics/reports.js

generateSummaryReport(conversationId): Promise<SummaryReport>

Purpose: Aggregated conversation report: engagement, sentiment, topics, quality, health, key insights, action items.

Returns: Comprehensive report object with nested sections.

generateDetailedReport(conversationId): Promise<DetailedReport>

Purpose: Full analytics including turn-taking, response times, token distributions.

generateComparativeReport(conversationIds): Promise<ComparativeReport>

Purpose: Cross-conversation comparison.

exportReport(report, format): string|Buffer

Purpose: Export report as JSON or Markdown.

Parameters: format = "json" | "markdown"
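
One way such an export might work is sketched below. The Markdown layout (one heading per top-level section, one bullet per field) and the report title are assumptions. This version returns only strings, whereas the documented signature also allows a Buffer.

```javascript
// Sketch: renders top-level sections only; nested values are emitted as JSON text.
function exportReport(report, format = "json") {
  if (format === "json") return JSON.stringify(report, null, 2);

  if (format === "markdown") {
    const lines = ["# Conversation Report", ""];
    for (const [section, value] of Object.entries(report)) {
      lines.push(`## ${section}`, "");
      if (value !== null && typeof value === "object") {
        for (const [key, v] of Object.entries(value)) {
          const text = v !== null && typeof v === "object" ? JSON.stringify(v) : String(v);
          lines.push(`- ${key}: ${text}`);
        }
      } else {
        lines.push(String(value));
      }
      lines.push("");
    }
    return lines.join("\n");
  }

  throw new Error(`Unsupported format: ${format}`);
}

const sampleReport = {
  engagement: { total_messages: 12, total_tokens: 3400 },
  health: 0.8,
};
const asMarkdown = exportReport(sampleReport, "markdown");
const asJson = exportReport(sampleReport, "json");
console.log(asMarkdown);
```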


Exported Functions — analytics/conversations.js

calculateEngagementMetrics(conversationId): Promise<EngagementMetrics>

Returns: { total_messages, user_messages, assistant_messages, total_tokens, conversation_duration_minutes, avg_response_time_seconds, messages_per_minute }
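
To make the returned fields concrete, the sketch below derives them from an in-memory list of message records. The documented function takes a conversationId and queries the database, so this version skips that step. The name `engagementFromMessages` is hypothetical, and defining response time as the gap between a user message and the assistant message immediately after it is an assumption.

```javascript
// Computes the documented fields from message records with role, tokens, created_at.
function engagementFromMessages(messages) {
  const sorted = [...messages].sort((a, b) => new Date(a.created_at) - new Date(b.created_at));
  const count = (role) => sorted.filter((m) => m.role === role).length;

  const durationMs =
    sorted.length > 1
      ? new Date(sorted[sorted.length - 1].created_at) - new Date(sorted[0].created_at)
      : 0;
  const minutes = durationMs / 60000;

  // Assumed definition: seconds between a user message and the assistant reply that follows it.
  const gaps = [];
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i - 1].role === "user" && sorted[i].role === "assistant") {
      gaps.push((new Date(sorted[i].created_at) - new Date(sorted[i - 1].created_at)) / 1000);
    }
  }

  return {
    total_messages: sorted.length,
    user_messages: count("user"),
    assistant_messages: count("assistant"),
    total_tokens: sorted.reduce((sum, m) => sum + (m.tokens ?? 0), 0),
    conversation_duration_minutes: minutes,
    avg_response_time_seconds: gaps.length ? gaps.reduce((a, b) => a + b, 0) / gaps.length : 0,
    messages_per_minute: minutes > 0 ? sorted.length / minutes : 0,
  };
}

const metrics = engagementFromMessages([
  { role: "user", tokens: 20, created_at: "2024-01-01T10:00:00Z" },
  { role: "assistant", tokens: 50, created_at: "2024-01-01T10:00:10Z" },
  { role: "user", tokens: 15, created_at: "2024-01-01T10:01:30Z" },
  { role: "assistant", tokens: 60, created_at: "2024-01-01T10:02:00Z" },
]);
console.log(metrics);
```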

analyzeSentiment(conversationId): Promise<SentimentAnalysis>

Returns: { overall_sentiment: positive|negative|neutral, sentiment_progression: [], volatility }

extractTopics(conversationId): Promise<TopicAnalysis>

Returns: { primary_topic, topics: [{name, frequency, weight}] }

getQualityMetrics(conversationId): Promise<QualityMetrics>

Returns: { overall_quality, coherence, relevance, completeness }

analyzeTurnTaking(conversationId): Promise<TurnTakingAnalysis>


Exported Functions — analytics/insights.js

extractKeyInsights(conversationId): Promise<InsightsResult>

Returns: { decisions_made, follow_up_suggestions, key_concepts }

extractActionItems(conversationId): Promise<ActionItemsResult>

Returns: { action_items: [{description, priority, assignee?}] }

getConversationHealth(conversationId): Promise<HealthResult>

Returns: { overall_health: 0-1, recommendations: [] }


Key Data Structures

| Structure | Fields | Purpose |
|-----------|--------|---------|
| WindowResult | messages, originalCount, finalCount, originalTokens, finalTokens, strategy, utilization | Context window output |
| MessageRecord | message_id, conversation_id, role, content, tokens, created_at, metadata | DB message entity |
| IndexStats | total_messages, indexed_words, last_index_time, is_indexing | Search index state |
| BudgetManager | dailyBudget, hourlyBudget, perRequestBudget | Token cost management |

External Dependencies

  • ../../db/index.js: getDb(), withTransaction()
  • No Anthropic SDK calls — local processing layer

Notes for Rewrite

  • Inverted index is fully in-memory — SQLite FTS5 tables already exist in schema and should be used instead.
  • Token counting uses 4-chars/token heuristic — integrate @anthropic-ai/tokenizer for accuracy.
  • searchByEmbedding is a placeholder — actual vector search (pgvector, SQLite-vss) not implemented.
  • ContextCompressor uses summarization via Claude API internally — ensure API key is available.
  • BudgetManager resets daily/hourly counters only when explicitly called — no automatic cron reset.
  • Analytics functions (calculateEngagementMetrics, etc.) do their own DB queries — could be unified.

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.