Health Check API Reference
The 1MCP Agent provides comprehensive health check endpoints for monitoring system status, server connectivity, and operational metrics.
Overview
Health check endpoints are designed for:
- Load Balancers: HAProxy, nginx, AWS ALB
- Container Orchestration: Kubernetes, Docker Swarm
- CI/CD Pipelines: Deployment validation
- Manual Debugging: System status inspection
Security Configuration
The health endpoints include configurable security features to control information exposure:
Detail Levels:
- full: Complete system information with error sanitization
- basic: Limited system details, sanitized errors, server status
- minimal (default): Only essential status information, no sensitive details
Configuration:
# CLI option (default is minimal)
npx -y @1mcp/agent --config mcp.json --health-info-level basic
# Environment variable
export ONE_MCP_HEALTH_INFO_LEVEL=basic
npx -y @1mcp/agent --config mcp.json
Sanitization Features:
- Automatic error message sanitization
- Credential pattern removal (user:password@, tokens, keys)
- URL and file path redaction
- IP address anonymization
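As a rough illustration of what this kind of sanitization involves, the sketch below redacts three of the listed patterns from an error message. It is not the agent's actual sanitizer: the function name `sanitize_error`, the regular expressions, and the placeholder strings are all assumptions made for this example, and URL and file path redaction are left out.

```python
import re

# Hypothetical patterns for illustration; the real sanitizer may differ.
_USERINFO = re.compile(r"(?<=://)[^/\s:@]+:[^/\s@]+@")          # user:password@
_SECRET_PARAM = re.compile(r"(token|key|secret|password)=([^\s&]+)", re.IGNORECASE)
_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize_error(message: str) -> str:
    """Redact credentials, secret-looking parameters, and IPv4 addresses."""
    message = _USERINFO.sub("[REDACTED]@", message)
    message = _SECRET_PARAM.sub(r"\1=[REDACTED]", message)
    message = _IPV4.sub("[IP]", message)
    return message


print(sanitize_error("GET https://user:pass@example.com/x?api_key=abc123 from 10.0.0.5 failed"))
```

Applying the credential patterns before the IP pattern keeps a password that happens to look like an address from being only partially masked.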
Endpoints
GET /health
Description: Complete health status with detailed system metrics and server information.
Authentication: None required
Response Codes:
- 200: Healthy or degraded but functional
- 503: Unhealthy, critical issues
- 500: Internal server error
Response Headers:
Content-Type: application/json
Cache-Control: no-cache, no-store, must-revalidate
X-Health-Status: healthy|degraded|unhealthy
X-Service-Version: 0.15.0
X-Uptime-Seconds: 3600
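The headers above let a monitor read the status without parsing the JSON body. A minimal client-side sketch, assuming the headers arrive as a plain name-to-value mapping; `parse_health_headers` and `restarted` are hypothetical helper names, not part of the agent:

```python
from typing import Mapping, Optional, Tuple


def parse_health_headers(headers: Mapping[str, str]) -> Tuple[Optional[str], Optional[int]]:
    """Return (status, uptime_seconds) taken from the /health response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    status = lowered.get("x-health-status")
    uptime_raw = lowered.get("x-uptime-seconds")
    uptime = int(uptime_raw) if uptime_raw is not None and uptime_raw.isdigit() else None
    return status, uptime


def restarted(previous_uptime: Optional[int], current_uptime: Optional[int]) -> bool:
    """A drop in uptime between two polls suggests the process restarted."""
    if previous_uptime is None or current_uptime is None:
        return False
    return current_uptime < previous_uptime
```

Comparing `X-Uptime-Seconds` across polls is one inexpensive way to notice restarts that happen between checks.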
Response Schema:
{
  "status": "healthy|degraded|unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "version": "0.15.0",
  "system": {
    "uptime": 3600,
    "memory": {
      "used": 50.5,
      "total": 100.0,
      "percentage": 50.5
    },
    "process": {
      "pid": 12345,
      "nodeVersion": "v20.0.0",
      "platform": "linux",
      "arch": "x64"
    }
  },
  "servers": {
    "total": 3,
    "healthy": 2,
    "unhealthy": 1,
    "details": [
      {
        "name": "filesystem-server",
        "status": "connected|error|disconnected|awaiting_oauth",
        "healthy": true,
        "lastConnected": "2025-01-30T11:30:00.000Z",
        "lastError": "Connection timeout",
        "tags": ["filesystem", "local"]
      }
    ]
  },
  "configuration": {
    "loaded": true,
    "serverCount": 3,
    "enabledCount": 2,
    "disabledCount": 1,
    "authEnabled": true,
    "transport": "http"
  }
}
Field Descriptions:
- status: Overall health status
  - healthy: All systems operational
  - degraded: Some issues but still functional
  - unhealthy: Critical issues affecting functionality
- system.uptime: Server uptime in seconds
- system.memory.used: Heap memory used in MB
- system.memory.total: Total heap memory in MB
- system.memory.percentage: Memory usage percentage
- servers.details[].status: Individual server status
  - connected: Server is connected and operational
  - error: Server has connection or runtime errors
  - disconnected: Server is not connected
  - awaiting_oauth: Server requires OAuth authentication
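To show how these fields can be consumed, here is a small sketch that lists the servers a `/health` payload reports as not healthy. The function name `failing_servers` is hypothetical, and the code assumes `servers.details` may be missing (for example at a reduced detail level), in which case it returns an empty list.

```python
from typing import Any, Dict, List


def failing_servers(payload: Dict[str, Any]) -> List[str]:
    """Describe each entry in servers.details whose `healthy` flag is not true."""
    details = payload.get("servers", {}).get("details", [])
    lines = []
    for server in details:
        if server.get("healthy"):
            continue
        reason = server.get("lastError") or "no error recorded"
        name = server.get("name", "<unnamed>")
        status = server.get("status", "unknown")
        lines.append(f"{name}: {status} ({reason})")
    return lines


sample = {
    "servers": {
        "details": [
            {"name": "filesystem-server", "status": "connected", "healthy": True},
            {"name": "search-server", "status": "error", "healthy": False,
             "lastError": "Connection timeout"},
        ]
    }
}
print(failing_servers(sample))
```

The server name `search-server` in the sample is made up for the example.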
GET /health/live
Description: Simple liveness probe for basic availability checking.
Authentication: None required
Response Codes:
- 200: Server is running (always returns 200 if reachable)
Response Schema:
{
  "status": "alive",
  "timestamp": "2025-01-30T12:00:00.000Z"
}
Use Cases:
- Kubernetes liveness probes
- Load balancer basic health checks
- Service discovery health checks
GET /health/ready
Description: Readiness probe to determine if the service is ready to accept requests.
Authentication: None required
Response Codes:
- 200: Service is ready (configuration loaded)
- 503: Service is not ready
Response Schema:
{
  "status": "ready|not_ready",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "configuration": {
    "loaded": true,
    "serverCount": 3,
    "enabledCount": 2,
    "disabledCount": 1,
    "authEnabled": true,
    "transport": "http"
  }
}
Use Cases:
- Kubernetes readiness probes
- Load balancer ready checks
- Deployment validation
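For the deployment validation use case, a pipeline step might compare the `/health/ready` payload against what the deployment expects. The sketch below is one way to do that; `validate_readiness` and its `expected_enabled` parameter are hypothetical, and it assumes the `configuration` block is present in the response.

```python
from typing import Any, Dict, List


def validate_readiness(payload: Dict[str, Any], expected_enabled: int) -> List[str]:
    """Return a list of problems; an empty list means the deployment looks ready."""
    problems = []
    if payload.get("status") != "ready":
        problems.append(f"status is {payload.get('status')!r}, expected 'ready'")
    config = payload.get("configuration", {})
    if not config.get("loaded", False):
        problems.append("configuration not loaded")
    enabled = config.get("enabledCount")
    if enabled != expected_enabled:
        problems.append(f"expected {expected_enabled} enabled servers, got {enabled}")
    return problems
```

Returning every problem at once, rather than stopping at the first, tends to make failed pipeline runs easier to diagnose.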
Health Status Logic
Overall Status Determination
The overall health status is determined by:
- Configuration Status: Configuration must be loaded for the status to be healthy or degraded
- Server Health Ratio:
  - All servers healthy → healthy
  - No servers configured → degraded
  - >50% of servers healthy (but not all) → degraded
  - ≤50% of servers healthy → unhealthy
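The rules above can be sketched as a single function. This is an illustrative reading of the documented logic rather than the agent's source: the name `overall_status` is hypothetical, and it assumes that an unloaded configuration yields `unhealthy` regardless of server counts.

```python
def overall_status(config_loaded: bool, total: int, healthy: int) -> str:
    """Map configuration state and server counts to an overall status."""
    if not config_loaded:
        # Assumption: without a loaded configuration the service is unhealthy.
        return "unhealthy"
    if total == 0:
        return "degraded"      # no servers configured
    if healthy == total:
        return "healthy"       # all servers healthy
    if healthy / total > 0.5:
        return "degraded"      # more than half, but not all, healthy
    return "unhealthy"         # half or fewer healthy


print(overall_status(True, total=3, healthy=2))
```

With the counts from the response schema example (3 servers, 2 healthy) the ratio is about 67%, which falls in the `degraded` band.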
Server Health Classification
Individual servers are considered healthy if:
- Status is connected
- No recent critical errors
- Responding to health checks
Rate Limiting
Health endpoints have relaxed rate limiting:
- Window: 5 minutes
- Limit: 200 requests per IP
- Headers: Standard rate limit headers included
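A quick way to check whether a polling plan fits this budget is to count the requests it generates per window. The sketch below does that arithmetic; the helper names are hypothetical, and it assumes all probes reach the agent from the same IP address.

```python
import math
from typing import Iterable

WINDOW_SECONDS = 5 * 60   # 5-minute window
LIMIT_PER_IP = 200        # requests allowed per IP in one window


def requests_per_window(poll_intervals: Iterable[float]) -> int:
    """Total requests in one window for probes polling at the given intervals (seconds)."""
    return sum(math.ceil(WINDOW_SECONDS / interval) for interval in poll_intervals)


def within_limit(poll_intervals: Iterable[float]) -> bool:
    return requests_per_window(poll_intervals) <= LIMIT_PER_IP


# A liveness probe every 10 s plus a readiness probe every 5 s:
print(requests_per_window([10, 5]))  # 30 + 60 = 90
```

A sustained rate of one request every 1.5 seconds (300 / 200) is the most a single IP can keep up without hitting the limit.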
Monitoring Integration Examples
Kubernetes Health Probes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: 1mcp-agent
spec:
  # apps/v1 requires a selector that matches the pod template labels;
  # the label value used here is illustrative.
  selector:
    matchLabels:
      app: 1mcp-agent
  template:
    metadata:
      labels:
        app: 1mcp-agent
    spec:
      containers:
        - name: 1mcp
          image: ghcr.io/1mcp-app/agent:latest
          ports:
            - containerPort: 3050
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3050
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3050
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
Docker Compose Health Check
version: '3.8'
services:
  1mcp:
    image: ghcr.io/1mcp-app/agent:latest
    ports:
      - '3050:3050'
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost:3050/health']
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
HAProxy Backend Health Check
backend 1mcp_servers
    balance roundrobin
    option httpchk GET /health/ready
    http-check expect status 200
    server 1mcp-1 1mcp-1:3050 check inter 30s
    server 1mcp-2 1mcp-2:3050 check inter 30s
Script-Based Monitoring
#!/bin/bash
# Simple health check script
HEALTH_URL="http://localhost:3050/health"

# The body goes to /tmp/health.json; only the HTTP status code is captured.
RESPONSE=$(curl -s -w "%{http_code}" -o /tmp/health.json "$HEALTH_URL")
HTTP_CODE="${RESPONSE: -3}"

if [ "$HTTP_CODE" -eq 200 ]; then
  STATUS=$(jq -r '.status' /tmp/health.json)
  UNHEALTHY=$(jq -r '.servers.unhealthy' /tmp/health.json)
  echo "1MCP Health: $STATUS (Unhealthy servers: $UNHEALTHY)"
  if [ "$STATUS" = "unhealthy" ]; then
    exit 1
  fi
else
  echo "1MCP Health Check Failed: HTTP $HTTP_CODE"
  exit 1
fi
Error Responses
Service Unavailable (503)
{
  "status": "unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "version": "0.15.0",
  "system": {
    /* system info */
  },
  "servers": {
    "total": 3,
    "healthy": 0,
    "unhealthy": 3,
    "details": [
      /* server details */
    ]
  },
  "configuration": {
    "loaded": false,
    "serverCount": 0,
    "enabledCount": 0,
    "disabledCount": 0,
    "authEnabled": false,
    "transport": "http"
  }
}
Internal Server Error (500)
{
  "status": "unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "error": "Health check failed",
  "message": "Configuration service unavailable"
}
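A monitoring client can fold the documented response codes and bodies into a single pass/fail result. The sketch below is one possible mapping; `interpret_health_response` is a hypothetical helper, and the message formats are choices made for this example.

```python
from typing import Any, Dict, Tuple


def interpret_health_response(http_code: int, body: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (ok, summary) for a /health response."""
    if http_code == 200:
        # 200 covers both healthy and degraded; report which one it was.
        return True, body.get("status", "unknown")
    if http_code == 503:
        servers = body.get("servers", {})
        down = servers.get("unhealthy", "?")
        total = servers.get("total", "?")
        return False, f"unhealthy: {down}/{total} servers down"
    if http_code == 500:
        return False, f"{body.get('error', 'error')}: {body.get('message', '')}"
    return False, f"unexpected HTTP {http_code}"
```

Treating any code other than 200 as a failure mirrors the script-based check earlier in this document.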
Security Considerations
Production Deployment
Network-Level Protection (Recommended):
- Restrict health endpoints to monitoring networks only
- Use firewall rules to limit access to trusted IPs
- Consider VPN or private network access for detailed endpoints
Information Exposure Control:
- Use basic or minimal detail levels for public-facing deployments
- full detail level recommended only for private networks
- Error sanitization is always enabled to prevent credential leakage
Example Security Configuration:
# Development environment
npx -y @1mcp/agent --config mcp.json --health-info-level full
# Staging environment
npx -y @1mcp/agent --config mcp.json --health-info-level basic
# Production environment (default minimal level is secure)
npx -y @1mcp/agent --config mcp.json
Detail Level Impact
| Level | System Info | Server Details | Error Messages | Use Case |
|---|---|---|---|---|
| full | Complete | Complete + sanitized | Sanitized | Internal monitoring |
| basic | Limited | Status + sanitized | Sanitized | Restricted monitoring |
| minimal | Basic only | Counts only | None | Public/Load balancer |
Best Practices
Monitoring Setup
- Use Multiple Endpoints: Combine /health, /health/live, and /health/ready for comprehensive monitoring
- Set Appropriate Timeouts: Health checks should complete within 5-10 seconds
- Configure Retry Logic: Allow 2-3 retries with exponential backoff
- Monitor Trends: Track health status changes over time
- Security-First: Choose appropriate detail level for your network exposure
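The retry advice above can be sketched as a small wrapper around any health check. The names `backoff_delays` and `check_with_retries` are hypothetical, the check itself is passed in as a callable, and the one-second base delay with doubling is an assumption rather than a documented default.

```python
import time
from typing import Callable, List


def backoff_delays(retries: int = 3, base: float = 1.0, factor: float = 2.0) -> List[float]:
    """Delays before each retry: base, base*factor, base*factor^2, ..."""
    return [base * factor ** attempt for attempt in range(retries)]


def check_with_retries(
    check: Callable[[], bool],
    retries: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `check` once, then retry with exponential backoff until it passes."""
    if check():
        return True
    for delay in backoff_delays(retries, base):
        sleep(delay)
        if check():
            return True
    return False
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in production the default `time.sleep` applies.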
Development Testing
# Test all health endpoints
curl -v http://localhost:3050/health
curl -v http://localhost:3050/health/live
curl -v http://localhost:3050/health/ready
# Check response headers
curl -I http://localhost:3050/health
# Monitor health status
watch -n 5 'curl -s http://localhost:3050/health | jq ".status, .servers.healthy, .servers.unhealthy"'