Health Check API Reference

The 1MCP Agent provides comprehensive health check endpoints for monitoring system status, server connectivity, and operational metrics.

Overview

Health check endpoints are designed for:

Load Balancers: HAProxy, nginx, AWS ALB
Container Orchestration: Kubernetes, Docker Swarm
CI/CD Pipelines: Deployment validation
Manual Debugging: System status inspection

Security Configuration

The health endpoints include configurable security features to control information exposure:

Detail Levels:

full: Complete system information with error sanitization
basic: Limited system details, sanitized errors, server status
minimal (default): Only essential status information, no sensitive details

Configuration:

bash

# CLI option (default is minimal)
npx -y @1mcp/agent --config mcp.json --health-info-level basic

# Environment variable
export ONE_MCP_HEALTH_INFO_LEVEL=basic
npx -y @1mcp/agent --config mcp.json

Sanitization Features:

Automatic error message sanitization
Credential pattern removal (user:password@, tokens, keys)
URL and file path redaction
IP address anonymization

Endpoints

`GET /health`

Description: Complete health status with detailed system metrics and server information.

Authentication: None required

Response Codes:

200 - Healthy or degraded but functional
503 - Unhealthy, critical issues
500 - Internal server error

Response Headers:

http

Content-Type: application/json
Cache-Control: no-cache, no-store, must-revalidate
X-Health-Status: healthy|degraded|unhealthy
X-Service-Version: 0.15.0
X-Uptime-Seconds: 3600

Response Schema:

json

{
  "status": "healthy|degraded|unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "version": "0.15.0",
  "system": {
    "uptime": 3600,
    "memory": {
      "used": 50.5,
      "total": 100.0,
      "percentage": 50.5
    },
    "process": {
      "pid": 12345,
      "nodeVersion": "v20.0.0",
      "platform": "linux",
      "arch": "x64"
    }
  },
  "servers": {
    "total": 3,
    "healthy": 2,
    "unhealthy": 1,
    "details": [
      {
        "name": "filesystem-server",
        "status": "connected|error|disconnected|awaiting_oauth",
        "healthy": true,
        "lastConnected": "2025-01-30T11:30:00.000Z",
        "lastError": "Connection timeout",
        "tags": ["filesystem", "local"]
      }
    ]
  },
  "configuration": {
    "loaded": true,
    "serverCount": 3,
    "enabledCount": 2,
    "disabledCount": 1,
    "authEnabled": true,
    "transport": "http"
  }
}

Field Descriptions:

status: Overall health status
- healthy - All systems operational
- degraded - Some issues but still functional
- unhealthy - Critical issues affecting functionality
system.uptime: Server uptime in seconds
system.memory.used: Heap memory used in MB
system.memory.total: Total heap memory in MB
system.memory.percentage: Memory usage percentage
servers.details[].status: Individual server status
- connected - Server is connected and operational
- error - Server has connection or runtime errors
- disconnected - Server is not connected
- awaiting_oauth - Server requires OAuth authentication

`GET /health/live`

Description: Simple liveness probe for basic availability checking.

Authentication: None required

Response Codes:

200 - Server is running (always returns 200 if reachable)

Response Schema:

json

{
  "status": "alive",
  "timestamp": "2025-01-30T12:00:00.000Z"
}

Use Cases:

Kubernetes liveness probes
Load balancer basic health checks
Service discovery health checks

`GET /health/ready`

Description: Readiness probe to determine if the service is ready to accept requests.

Authentication: None required

Response Codes:

200 - Service is ready (configuration loaded)
503 - Service is not ready

Response Schema:

json

{
  "status": "ready|not_ready",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "configuration": {
    "loaded": true,
    "serverCount": 3,
    "enabledCount": 2,
    "disabledCount": 1,
    "authEnabled": true,
    "transport": "http"
  }
}

Use Cases:

Kubernetes readiness probes
Load balancer ready checks
Deployment validation

Health Status Logic

Overall Status Determination

The overall health status is determined by:

Configuration Status: Must be loaded for healthy or degraded
Server Health Ratio:
- All servers healthy → healthy
- No servers configured → degraded
- 50% servers healthy → degraded
- ≤50% servers healthy → unhealthy

Server Health Classification

Individual servers are considered healthy if:

Status is connected
No recent critical errors
Responding to health checks

Rate Limiting

Health endpoints have relaxed rate limiting:

Window: 5 minutes
Limit: 200 requests per IP
Headers: Standard rate limit headers included

Monitoring Integration Examples

Kubernetes Health Probes

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 1mcp-agent
spec:
  template:
    spec:
      containers:
        - name: 1mcp
          image: ghcr.io/1mcp-app/agent:latest
          ports:
            - containerPort: 3050
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3050
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3050
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3

Docker Compose Health Check

yaml

version: '3.8'
services:
  1mcp:
    image: ghcr.io/1mcp-app/agent:latest
    ports:
      - '3050:3050'
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost:3050/health']
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

HAProxy Backend Health Check

backend 1mcp_servers
    balance roundrobin
    option httpchk GET /health/ready
    http-check expect status 200
    server 1mcp-1 1mcp-1:3050 check inter 30s
    server 1mcp-2 1mcp-2:3050 check inter 30s

Script-Based Monitoring

bash

#!/bin/bash
# Simple health check script

HEALTH_URL="http://localhost:3050/health"
RESPONSE=$(curl -s -w "%{http_code}" -o /tmp/health.json "$HEALTH_URL")
HTTP_CODE="${RESPONSE: -3}"

if [ "$HTTP_CODE" -eq 200 ]; then
    STATUS=$(jq -r '.status' /tmp/health.json)
    UNHEALTHY=$(jq -r '.servers.unhealthy' /tmp/health.json)

    echo "1MCP Health: $STATUS (Unhealthy servers: $UNHEALTHY)"

    if [ "$STATUS" = "unhealthy" ]; then
        exit 1
    fi
else
    echo "1MCP Health Check Failed: HTTP $HTTP_CODE"
    exit 1
fi

Error Responses

Service Unavailable (503)

json

{
  "status": "unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "version": "0.15.0",
  "system": {
    /* system info */
  },
  "servers": {
    "total": 3,
    "healthy": 0,
    "unhealthy": 3,
    "details": [
      /* server details */
    ]
  },
  "configuration": {
    "loaded": false,
    "serverCount": 0,
    "enabledCount": 0,
    "disabledCount": 0,
    "authEnabled": false,
    "transport": "http"
  }
}

Internal Server Error (500)

json

{
  "status": "unhealthy",
  "timestamp": "2025-01-30T12:00:00.000Z",
  "error": "Health check failed",
  "message": "Configuration service unavailable"
}

Security Considerations

Production Deployment

Network-Level Protection (Recommended):

Restrict health endpoints to monitoring networks only
Use firewall rules to limit access to trusted IPs
Consider VPN or private network access for detailed endpoints

Information Exposure Control:

Use basic or minimal detail levels for public-facing deployments
full detail level recommended only for private networks
Error sanitization is always enabled to prevent credential leakage

Example Security Configuration:

bash

# Development environment
npx -y @1mcp/agent --config mcp.json --health-info-level full

# Staging environment
npx -y @1mcp/agent --config mcp.json --health-info-level basic

# Production environment (default minimal level is secure)
npx -y @1mcp/agent --config mcp.json

Detail Level Impact

Level	System Info	Server Details	Error Messages	Use Case
`full`	Complete	Complete + sanitized	Sanitized	Internal monitoring
`basic`	Limited	Status + sanitized	Sanitized	Restricted monitoring
`minimal`	Basic only	Counts only	None	Public/Load balancer

Best Practices

Monitoring Setup

Use Multiple Endpoints: Combine /health, /health/live, and /health/ready for comprehensive monitoring
Set Appropriate Timeouts: Health checks should complete within 5-10 seconds
Configure Retry Logic: Allow 2-3 retries with exponential backoff
Monitor Trends: Track health status changes over time
Security-First: Choose appropriate detail level for your network exposure

Development Testing

bash

# Test all health endpoints
curl -v http://localhost:3050/health
curl -v http://localhost:3050/health/live
curl -v http://localhost:3050/health/ready

# Check response headers
curl -I http://localhost:3050/health

# Monitor health status
watch -n 5 'curl -s http://localhost:3050/health | jq ".status, .servers.healthy, .servers.unhealthy"'

Health Check API Reference ​

Overview ​

Security Configuration ​

Endpoints ​

GET /health ​

GET /health/live ​

GET /health/ready ​

Health Status Logic ​

Overall Status Determination ​

Server Health Classification ​

Rate Limiting ​

Monitoring Integration Examples ​

Kubernetes Health Probes ​

Docker Compose Health Check ​

HAProxy Backend Health Check ​

Script-Based Monitoring ​

Error Responses ​

Service Unavailable (503) ​

Internal Server Error (500) ​

Security Considerations ​

Production Deployment ​

Detail Level Impact ​

Best Practices ​

Monitoring Setup ​

Development Testing ​

Health Check API Reference

Overview

Security Configuration

Endpoints

`GET /health`

`GET /health/live`

`GET /health/ready`

Health Status Logic

Overall Status Determination

Server Health Classification

Rate Limiting

Monitoring Integration Examples

Kubernetes Health Probes

Docker Compose Health Check

HAProxy Backend Health Check

Script-Based Monitoring

Error Responses

Service Unavailable (503)

Internal Server Error (500)

Security Considerations

Production Deployment

Detail Level Impact

Best Practices

Monitoring Setup

Development Testing