Distributed Deployment
A distributed deployment runs multiple pgEdge Postgres MCP Server instances behind a load balancer. This architecture provides high availability, horizontal scaling, and fault tolerance for production environments.
The MCP server uses file-based configuration for authentication and a SQLite file for the knowledge base. Distributed deployments must ensure that all instances access consistent copies of these files. This guide covers architecture patterns, load balancer configuration, and orchestration strategies for multi-instance deployments.
Architecture Patterns
Two primary patterns support distributed MCP server deployments. The shared filesystem pattern offers simplicity and strong consistency. The object storage pattern provides flexibility for environments without shared volumes.
Shared Filesystem Pattern
The shared filesystem pattern mounts authentication files and the knowledge base database on a shared network volume. All instances read from the same files on the shared volume.
The following diagram illustrates this architecture:
+--------------+     +--------------+     +--------------+
|  MCP Server  |     |  MCP Server  |     |  MCP Server  |
|  Instance 1  |     |  Instance 2  |     |  Instance 3  |
+------+-------+     +------+-------+     +------+-------+
       |                    |                    |
       +--------------------+--------------------+
                            |
                   +--------+--------+
                   |  Shared Volume  |
                   |  (NFS/EFS/GCS)  |
                   |  - tokens.yaml  |
                   |  - users.yaml   |
                   |  - kb.db        |
                   +-----------------+
This pattern works well with managed cloud storage services.
Amazon EFS, Google Cloud Filestore, and Azure Files all provide
compatible shared volumes. The server uses fsnotify to detect
file changes; the auto-reload feature picks up updates on shared
volumes without a restart.
The shared filesystem pattern offers the following advantages:
- All instances read the same authentication files.
- Configuration changes propagate to every instance.
- The setup requires minimal additional infrastructure.
- Cloud-managed storage handles replication and durability.
Object Storage Pattern
The object storage pattern stores authentication and knowledge base files in a cloud object store. This approach suits deployments that lack shared filesystem support.
Use Amazon S3, Google Cloud Storage, or Azure Blob Storage to host the configuration files. Init containers or scheduled cron jobs synchronize files from the object store to local disk before the server starts.
This pattern introduces eventual consistency as a trade-off. Updates to authentication files take effect only after the next synchronization cycle. The synchronization interval determines the maximum delay between an update and its propagation.
The following steps describe the object storage workflow:
- Upload the updated tokens.yaml file to the object store.
- Each instance pulls the file on its synchronization schedule.
- The server detects the local file change and reloads.
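A minimal synchronization sketch, assuming an S3 bucket named mcp-config and a local configuration directory of /app/config (both illustrative), can run from cron or a sidecar container:
#!/bin/sh
# Pull the authentication files and the knowledge base from object
# storage to the local configuration directory. The bucket name and
# paths are assumptions; adjust them for your deployment.
aws s3 cp s3://mcp-config/tokens.yaml /app/config/tokens.yaml
aws s3 cp s3://mcp-config/users.yaml /app/config/users.yaml
aws s3 cp s3://mcp-config/kb.db /app/config/kb.db
The server's file watcher then detects the updated local copies and reloads them, as described in the final step above.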
Load Balancer Configuration
Bearer token authentication in the MCP server is stateless. Any instance can validate a token independently; sticky sessions are not required. A standard round-robin or least-connections load balancer distributes traffic across all instances.
nginx Configuration
The following nginx configuration distributes requests across
three MCP server instances:
upstream mcp_backend {
    server mcp-1:8080;
    server mcp-2:8080;
    server mcp-3:8080;
}

server {
    listen 443 ssl;
    server_name mcp.example.com;

    ssl_certificate /etc/ssl/certs/mcp.crt;
    ssl_certificate_key /etc/ssl/private/mcp.key;

    location / {
        proxy_pass http://mcp_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /health {
        proxy_pass http://mcp_backend;
    }
}
The upstream block defines the pool of backend servers. The
/health endpoint enables load balancer health checks without
authentication.
AWS Application Load Balancer
The following JSON configures health checks for an AWS Application Load Balancer target group:
{
  "HealthCheckPath": "/health",
  "HealthCheckIntervalSeconds": 30,
  "HealthyThresholdCount": 2,
  "UnhealthyThresholdCount": 3,
  "HealthCheckTimeoutSeconds": 5
}
The load balancer marks an instance as unhealthy after three consecutive failed checks. Two successful checks restore the instance to the healthy state.
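If you manage the target group with the AWS CLI, a command along the following lines applies the same settings; the target group ARN is a placeholder:
# Apply the health check settings to an existing target group.
aws elbv2 modify-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:region:account:targetgroup/mcp/id \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3 \
  --health-check-timeout-seconds 5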
Docker Compose Multi-Instance Deployment
Docker Compose provides a straightforward way to run multiple
MCP server instances on a single host. The following example
deploys three instances behind an nginx reverse proxy.
In the following docker-compose.yml file, three MCP server
containers share a configuration volume:
services:
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/ssl:ro
    depends_on:
      - mcp-1
      - mcp-2
      - mcp-3

  mcp-1:
    image: pgedge/postgres-mcp:latest
    env_file: .env
    volumes:
      - shared-config:/app/config:ro

  mcp-2:
    image: pgedge/postgres-mcp:latest
    env_file: .env
    volumes:
      - shared-config:/app/config:ro

  mcp-3:
    image: pgedge/postgres-mcp:latest
    env_file: .env
    volumes:
      - shared-config:/app/config:ro

volumes:
  shared-config:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nfs-server,rw
      device: ":/exports/mcp-config"
The shared-config volume mounts an NFS share that contains
the authentication YAML files and the knowledge base database.
All three instances read from the same volume. Replace
nfs-server and the device path with your NFS server address
and export path.
Start the deployment with the following command:
docker-compose up -d
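To confirm that the deployment is healthy, check the container status and query the health endpoint through the proxy; the -k flag skips certificate verification and is only appropriate for self-signed test certificates:
# All three mcp-* services and nginx should report as running.
docker-compose ps
# Query the health endpoint exposed by the reverse proxy.
curl -k https://localhost/health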
Kubernetes Deployment
Kubernetes provides robust orchestration for distributed MCP server deployments. The project includes Helm charts for streamlined installation. This section covers key configuration patterns for Kubernetes environments.
Authentication Files with ConfigMap
Store authentication configuration in a Kubernetes ConfigMap or Secret. The following ConfigMap example defines an API token configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-auth
data:
  tokens.yaml: |
    tokens:
      - token: "hashed-token-here"
        name: "api-client"
Mount the ConfigMap as a volume in the MCP server pod. Use a Kubernetes Secret instead of a ConfigMap for sensitive token data.
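A pod spec fragment that mounts this ConfigMap might look like the following sketch; the /app/config mount path is an assumption and must match the path the server reads its configuration from. With a Secret, replace the configMap stanza with a secret reference:
containers:
  - name: mcp-server
    image: pgedge/postgres-mcp:latest
    volumeMounts:
      # Mount the mcp-auth ConfigMap read-only into the config directory.
      - name: auth-config
        mountPath: /app/config
        readOnly: true
volumes:
  - name: auth-config
    configMap:
      name: mcp-auth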
Knowledge Base with Init Containers
The knowledge base is a SQLite database file that the server reads at startup. Use an init container to download the file before the main container starts.
In the following example, the init container downloads the knowledge base from a remote URL:
initContainers:
  - name: kb-loader
    image: busybox
    command:
      - wget
      - "-O"
      - /data/kb.db
      - "https://storage.example.com/kb.db"
    volumeMounts:
      - name: kb-data
        mountPath: /data
A PersistentVolumeClaim offers an alternative for environments that require durable storage. The init container writes the knowledge base file to the persistent volume on first deployment.
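A minimal PersistentVolumeClaim for the knowledge base could look like the following; the storage class and size are illustrative, and the access mode must support concurrent readers when several pods mount the same claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kb-data
spec:
  accessModes:
    - ReadWriteMany        # multiple pods read the same knowledge base file
  storageClassName: efs-sc # illustrative; use a class that supports shared access
  resources:
    requests:
      storage: 1Gi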
Horizontal Pod Autoscaler
A Horizontal Pod Autoscaler scales the number of MCP server pods based on resource utilization. The following configuration scales between two and ten replicas:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mcp-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mcp-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
The autoscaler adds pods when average CPU utilization exceeds
70 percent. Set minReplicas to at least two for high
availability.
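To verify that the autoscaler is tracking the deployment, inspect its current state with kubectl:
# Shows current and desired replica counts and the observed CPU utilization.
kubectl get hpa mcp-server-hpa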
Knowledge Base Synchronization
The knowledge base is a SQLite file that the MCP server reads in read-only mode. Multiple instances can safely read the same file concurrently. Build the knowledge base on a single node or in a CI/CD pipeline; then distribute the file to all instances.
The following strategies distribute the knowledge base:
- Store the file on a shared volume that all instances mount.
- Upload the file to S3 and download with init containers.
- Publish the file as a CI/CD build artifact.
- Use a sidecar container to periodically fetch updates.
Rebuild the knowledge base periodically to incorporate new documentation. After rebuilding, replace the file on shared storage or trigger a new deployment. The server detects the updated file and reloads the knowledge base automatically.
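As a sketch, assuming the shared volume is mounted at /mnt/mcp-config (an illustrative path), replace the file with a copy-then-rename so readers never see a partially written database:
# Copy the rebuilt knowledge base next to the live file, then rename it
# into place; the rename is atomic within a single filesystem.
cp kb.db /mnt/mcp-config/kb.db.new
mv /mnt/mcp-config/kb.db.new /mnt/mcp-config/kb.db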
Authentication in a Distributed Context
All instances must read the same token and user YAML files. Consistent authentication data ensures that a token validated on one instance works on every other instance.
The MCP server uses fsnotify to monitor authentication files.
When files change on shared storage, all instances detect the
modification and reload credentials automatically. This reload
happens within 100 milliseconds of the file change.
The following considerations apply to distributed authentication:
- User session tokens validate against the shared user store.
- API tokens work identically across all instances.
- The auto-reload feature propagates credential changes without a restart.
- Active sessions remain valid during a credential reload.
Rate Limiting Considerations
The built-in rate limiter operates per instance. Each instance tracks failed authentication attempts independently. A distributed deployment does not share rate-limiting state across instances.
For centralized rate limiting, configure the reverse proxy to
enforce request limits. An nginx rate-limiting directive or a
cloud load balancer throttling policy provides consistent
enforcement across all instances.
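For example, nginx can enforce a per-client limit in front of every instance; the rate, zone size, and burst values below are illustrative starting points:
# In the http context: a shared-memory zone keyed by client IP.
limit_req_zone $binary_remote_addr zone=mcp_limit:10m rate=10r/s;

server {
    location / {
        # Allow short bursts and reject excess requests with 429.
        limit_req zone=mcp_limit burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://mcp_backend;
    }
}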
Monitoring and Health Checks
Each MCP server instance exposes a /health endpoint for
health monitoring. Configure the load balancer to poll this
endpoint and remove unhealthy instances from the pool.
The following practices support effective monitoring:
- Use the /health endpoint for load balancer health checks.
- Aggregate logs from all instances with a centralized system.
- Monitor per-instance CPU, memory, and request metrics.
- Set up alerts for instance failures and high error rates.
Centralized logging platforms such as the ELK stack, AWS CloudWatch, or Google Cloud Logging collect and index logs from all instances. Tag each log entry with the instance identifier to trace requests across the deployment.
Best Practices
The following recommendations apply to distributed MCP server deployments:
- Start with two instances and scale based on observed load.
- Use a shared filesystem for simplicity in cloud deployments.
- Build and test knowledge base updates in a staging environment before production.
- Use centralized logging with ELK, CloudWatch, or Google Cloud Logging.
- Implement graceful shutdown for zero-downtime deployments.
- Pin container image versions for reproducible deployments.
- Store authentication files in Kubernetes Secrets rather than ConfigMaps.
- Configure health check intervals based on your availability requirements.
- Rotate API tokens on a regular schedule across all instances.
- Test failover scenarios to verify high availability.
See Also
The following resources provide additional context for distributed deployments:
- Deploying on Docker covers single-instance Docker deployment.
- Authentication Guide explains token and user authentication in detail.
- Knowledgebase Configuration describes how to configure and use the knowledge base.
- Security Checklist provides security best practices for production deployments.