mcp-context-server
An MCP server that provides persistent multimodal context storage for LLM agents.
Package Details
mcp-context-server
Environment Variables
Log level
Storage backend type: sqlite (default) or postgresql
Maximum individual image size in megabytes
Maximum total request size in megabytes
Custom database file location path
Maximum number of concurrent read connections in the pool
Maximum number of concurrent write connections in the pool
Connection timeout in seconds
Idle connection timeout in seconds
Connection health check interval in seconds
Maximum number of retry attempts for failed operations
Base delay in seconds between retry attempts
Maximum delay in seconds between retry attempts
Enable random jitter in retry delays
Exponential backoff multiplication factor for retries
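The retry settings above (maximum attempts, base delay, maximum delay, jitter, backoff factor) describe a standard exponential-backoff schedule. A minimal sketch of how those knobs interact; the function and parameter names are illustrative, not the server's actual configuration keys:

```python
import random

def retry_delays(max_attempts: int, base_delay: float, max_delay: float,
                 backoff_factor: float, jitter: bool) -> list[float]:
    """Compute the delay before each retry attempt."""
    delays = []
    for attempt in range(max_attempts):
        # Grow geometrically by the backoff factor, capped at max_delay
        delay = min(base_delay * (backoff_factor ** attempt), max_delay)
        if jitter:
            # Full jitter: pick a random delay in [0, delay] to avoid
            # synchronized retry storms across clients
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays

# Without jitter the delays grow geometrically and cap at the maximum
print(retry_delays(5, 1.0, 10.0, 2.0, jitter=False))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```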
Enable SQLite foreign key constraints
SQLite journal mode (e.g., WAL, DELETE)
SQLite synchronous mode (e.g., NORMAL, FULL, OFF)
SQLite temporary storage location (e.g., MEMORY, FILE)
SQLite memory-mapped I/O size in bytes
SQLite cache size (negative value for KB, positive for pages)
SQLite page size in bytes
SQLite WAL autocheckpoint threshold in pages
SQLite busy timeout in milliseconds
SQLite WAL checkpoint mode (e.g., PASSIVE, FULL, RESTART)
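The SQLite settings above correspond to standard SQLite PRAGMA statements. A sketch using Python's built-in sqlite3 module; the PRAGMA names are SQLite's own, but the specific values shown are illustrative rather than the server's defaults:

```python
import sqlite3

conn = sqlite3.connect("context.db")
conn.execute("PRAGMA foreign_keys = ON")          # enforce FK constraints
conn.execute("PRAGMA journal_mode = WAL")         # write-ahead logging
conn.execute("PRAGMA synchronous = NORMAL")
conn.execute("PRAGMA temp_store = MEMORY")        # keep temp tables in RAM
conn.execute("PRAGMA mmap_size = 268435456")      # 256 MB memory-mapped I/O
conn.execute("PRAGMA cache_size = -64000")        # negative => KB (64 MB cache)
conn.execute("PRAGMA wal_autocheckpoint = 1000")  # checkpoint every 1000 pages
conn.execute("PRAGMA busy_timeout = 5000")        # wait up to 5 s on locks

mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(mode)  # wal
```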
Server shutdown timeout in seconds
Test mode shutdown timeout in seconds
Queue operation timeout in seconds
Test mode queue timeout in seconds
Circuit breaker failure threshold before opening
Circuit breaker recovery timeout in seconds
Maximum calls allowed in circuit breaker half-open state
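The three circuit-breaker settings map onto the classic pattern: the circuit opens after a threshold of consecutive failures, stays open for the recovery timeout, then admits a limited number of trial calls in a half-open state. A minimal sketch under those assumptions (names are illustrative, not the server's implementation):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold: int, recovery_timeout: float,
                 half_open_max_calls: int):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.half_open_max_calls = half_open_max_calls
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0
        self.half_open_calls = 0

    def allow(self) -> bool:
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.recovery_timeout:
                self.state = "half-open"      # recovery window elapsed
                self.half_open_calls = 0
            else:
                return False                  # still open: reject the call
        if self.state == "half-open":
            if self.half_open_calls >= self.half_open_max_calls:
                return False                  # trial-call budget exhausted
            self.half_open_calls += 1
        return True

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
            self.state = "closed"
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()

breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=30.0,
                         half_open_max_calls=2)
for _ in range(3):            # three consecutive failures open the circuit
    breaker.record(success=False)
print(breaker.state)          # open
print(breaker.allow())        # False -- rejected until the recovery timeout
```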
Complete PostgreSQL connection string (overrides individual settings if provided)
PostgreSQL server host address
PostgreSQL server port number
PostgreSQL database username
PostgreSQL database password
PostgreSQL database name
PostgreSQL connection pool minimum size
PostgreSQL connection pool maximum size
PostgreSQL connection pool timeout in seconds
PostgreSQL command execution timeout in seconds
Timeout in seconds for PostgreSQL migration operations (default: 300)
Close idle PostgreSQL connections after this many seconds (0 to disable, default: 300)
Recycle PostgreSQL connections after this many queries (0 to disable, default: 10000)
Seconds of idle time before sending first TCP keepalive probe (0 to disable, default: 15)
Seconds between subsequent TCP keepalive probes (0 to disable, default: 5)
Number of failed TCP keepalive probes before connection is considered dead (0 to disable, default: 3)
asyncpg prepared statement cache size. Set to 0 for external pooler compatibility (PgBouncer transaction mode, Pgpool-II, etc.). Default: 100
Maximum lifetime of cached prepared statements in seconds (default: 300). Has no effect when statement_cache_size=0
Maximum size of statement to cache in bytes (default: 15360). Has no effect when statement_cache_size=0
PostgreSQL SSL mode (disable, allow, prefer, require, verify-ca, verify-full)
PostgreSQL schema name for table and index operations (default: public)
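The connection-string setting accepts a standard PostgreSQL DSN, which (per the description above) takes precedence over the individual host/port/user/password/database settings. The URI shape below is the standard libpq/asyncpg format; the credential values are placeholders. Special characters in the password must be percent-encoded:

```python
from urllib.parse import quote

user = "context"
password = quote("s3cret/pass", safe="")   # URL-encode special characters
host, port, database = "db.internal", 5432, "mcp_context"

dsn = f"postgresql://{user}:{password}@{host}:{port}/{database}?sslmode=require"
print(dsn)
# postgresql://context:s3cret%2Fpass@db.internal:5432/mcp_context?sslmode=require
```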
Enable semantic search functionality
Enable embedding generation for stored context (default: true). When enabled, the server fails at startup if embedding dependencies are not met; set to false to disable embeddings.
Ollama API host URL for embedding generation
Automatically pull missing Ollama models on startup (default: true)
Timeout in seconds for pulling Ollama models (default: 900, range: 30-3600)
Ollama embedding truncation mode: false (default) returns error when context exceeded, true enables silent truncation
Ollama embedding context window size in tokens (default: 4096, range: 512-2097152)
Embedding model name for semantic search
Embedding vector dimensions
Timeout in seconds for embedding generation API calls
Maximum number of retry attempts for embedding generation
Base delay in seconds between retry attempts (with exponential backoff)
Maximum concurrent embedding generation operations (default: 3, range: 1-20)
Enable summary generation for stored context (default: true). When enabled, the server fails at startup if summary dependencies are not met; set to false to disable summaries.
Summary provider: ollama (default), openai, or anthropic
Summary generation model name (default: qwen3:0.6b)
Maximum output tokens for summary generation (default: 4000, range: 50-16384). Increase if summaries are truncated by reasoning models
Timeout in seconds for summary generation API calls
Maximum number of retry attempts for summary generation
Base delay in seconds between retry attempts (with exponential backoff)
Maximum concurrent summary generation operations (default: 3, range: 1-20)
Custom summarization prompt. Overrides the built-in default. Used as system message for the LLM.
Minimum text content length in characters to trigger summary generation (default: 500, range: 0-10000). Set to 0 to always generate.
Ollama summary context window size in tokens (default: 32768, range: 512-2097152)
Ollama summary truncation mode: false (default) returns error when context exceeded, true enables silent truncation
Reasoning effort level for OpenAI reasoning models (default: low). Valid values vary by generation: gpt-5 accepts low, medium, high; gpt-5.1+ accepts none, low, medium, high, xhigh. The default (low) is valid across all generations
Effort level for Anthropic Claude models (default: none). Valid values: max, high, medium, low. Controls inference effort (adaptive thinking)
Anthropic API key for summary generation
Enable full-text search functionality
Language for FTS stemming (e.g., english, german, french)
Characters of context around each FTS match for reranking passage extraction (default: 750)
Merge FTS match regions within this character distance (default: 100)
Enable hybrid search combining FTS and semantic search with RRF fusion
RRF smoothing constant for hybrid search (default: 60)
Multiplier for over-fetching results before RRF fusion (default: 2)
Minimum significant query terms to switch hybrid FTS from AND to OR logic (default: 4)
Default sort order for search results: relevance (only 'relevance' supported in current version)
Maximum character length for truncated text_content in search results (default: 300, range: 50-1000)
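Hybrid search merges the FTS and semantic result lists with Reciprocal Rank Fusion: each document scores 1/(k + rank) in each list it appears in, the scores are summed, and the smoothing constant k (default 60) damps the advantage of top ranks. A sketch of the fusion step (the function name is illustrative; per the over-fetch setting above, both input lists would be fetched at multiplier × limit before fusing):

```python
def rrf_fuse(fts_ids: list[str], semantic_ids: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in (fts_ids, semantic_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks well in both lists, so it fuses ahead of either list's own #1
print(rrf_fuse(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
```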
Enable text chunking for embedding generation (default: true)
Target chunk size in characters (default: 1500)
Overlap between chunks in characters (default: 150)
Chunk score aggregation method: max (only 'max' supported in current version)
Multiplier for over-fetching chunks before deduplication (default: 5)
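The chunking settings describe sliding-window splitting before embedding: windows of the target size, each starting chunk_size − overlap characters after the previous one. A simplified character-based sketch (the server may additionally respect word or sentence boundaries; names are illustrative):

```python
def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 150) -> list[str]:
    """Split text into overlapping windows; each window starts
    chunk_size - overlap characters after the previous one."""
    if len(text) <= chunk_size:
        return [text]                       # short texts stay whole
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text) - overlap, step)]

chunks = chunk_text("x" * 4000, chunk_size=1500, overlap=150)
print([len(c) for c in chunks])  # [1500, 1500, 1300]
```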
Enable cross-encoder reranking of search results (default: true)
Reranking provider (default: flashrank)
Reranking model name (default: ms-marco-MiniLM-L-12-v2)
Maximum input length for reranking in tokens (default: 512)
Multiplier for over-fetching results before reranking (default: 4)
Directory for caching reranking models
Estimated characters per token for passage size validation (default: 4.0, range: 2.0-8.0)
ONNX Runtime intra-operation parallelism threads for reranking (default: 0 = auto-detect)
Enable ONNX Runtime CPU memory arena for reranking (default: false)
Maximum passages per ONNX Runtime inference batch during reranking (default: 32)
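The reranking settings describe a two-stage flow: over-fetch candidates at multiplier × the requested limit, score query–passage pairs with a cross-encoder in fixed-size batches, and keep the top results. A provider-agnostic sketch in which score_batch is a stand-in for the real cross-encoder inference call:

```python
def rerank(query: str, passages: list[str], limit: int,
           multiplier: int = 4, batch_size: int = 32) -> list[str]:
    """Score over-fetched candidates in batches, return the top `limit`."""
    candidates = passages[:limit * multiplier]    # over-fetch pool
    scores: list[float] = []
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        scores.extend(score_batch(query, batch))  # cross-encoder inference
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [passage for passage, _ in ranked[:limit]]

# Stand-in scorer: term overlap with the query
# (a real cross-encoder scores the query and passage jointly)
def score_batch(query: str, batch: list[str]) -> list[float]:
    terms = set(query.lower().split())
    return [len(terms & set(p.lower().split())) / (len(terms) or 1) for p in batch]

print(rerank("vector database",
             ["a cache", "a vector database", "a database"], limit=2))
```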
Embedding provider: ollama (default), openai, azure, huggingface, or voyage
OpenAI API key for OpenAI embedding provider
Custom base URL for OpenAI-compatible APIs
OpenAI organization ID
Azure OpenAI API key
Azure OpenAI endpoint URL
Azure OpenAI embedding deployment name
Azure OpenAI API version (default: 2024-02-01)
HuggingFace Hub API token for HuggingFace embedding provider
Voyage AI API key for Voyage embedding provider
Voyage AI truncation mode: false (default) returns error when context exceeded, true enables silent truncation
Voyage AI batch size for embedding requests
Enable LangSmith tracing
LangSmith API key
LangSmith project name
LangSmith API endpoint URL
Comma-separated list of metadata fields to index (field:type format)
Index sync mode: strict (fail on mismatch), auto (sync schema), warn (log mismatches), additive (default: add missing indexes only)
Transport mode: stdio for local, http for Docker/remote
HTTP bind address (use 0.0.0.0 for Docker)
HTTP port number
Enable stateless HTTP mode for horizontal scaling. Enabled by default as the server has no stateful MCP features. Set to false only if you need server-side MCP session tracking.
Comma-separated list of tools to disable (e.g., delete_context,update_context)
Bearer token for HTTP authentication (required when using SimpleTokenVerifier)
Client ID to assign to authenticated requests
Authentication provider: none (default), simple_token
Custom server instructions text. Overrides built-in default. Set to empty string to disable.
ghcr.io/alex-feel/mcp-context-server:2.2.2
Environment Variables
Same as the environment variables listed for the Python package above.