Overview
The health check endpoint reports the operational status of the WuKongIM server and cluster, so that monitoring systems can confirm the system is running normally.
curl -X GET "http://localhost:5001/health"
Success Response (200)
Error Response (500)
Response Fields
status: health status; ok indicates normal, error indicates abnormal
Error message (only appears when status is error)
Status Codes
Status Code    Description
200            Server health status is normal
500            Server or cluster status is abnormal
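The two documented status codes lend themselves to a small classification helper on the client side. The following is a minimal sketch, assuming a caller only needs to tell healthy, unhealthy, and unexpected responses apart; the function name classifyHealthStatus and the returned labels are illustrative and not part of the WuKongIM API.

```javascript
// Hypothetical helper: maps the HTTP status code returned by GET /health
// to a coarse label. The label names are assumptions made for illustration.
function classifyHealthStatus(statusCode) {
  if (statusCode === 200) {
    return 'healthy'; // server health status is normal
  }
  if (statusCode === 500) {
    return 'unhealthy'; // server or cluster status is abnormal
  }
  // Any other code (for example a proxy error) is not documented for this endpoint.
  return 'unknown';
}
```

Treating undocumented codes as a separate category keeps a misconfigured proxy from being reported as a WuKongIM failure.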
Use Cases
Load Balancer Health Checks
Nginx Configuration:
upstream wukongim_backend {
    # Shared memory zone; required by the health_check directive below
    zone wukongim_backend 64k;
    server 192.168.1.10:5001;
    server 192.168.1.11:5001;
    server 192.168.1.12:5001;
}
server {
    location /health {
        proxy_pass http://wukongim_backend/health;
        proxy_connect_timeout 5s;
        proxy_read_timeout 5s;
    }
    location / {
        proxy_pass http://wukongim_backend;
        # Active health checks (the health_check directive requires NGINX Plus)
        health_check uri=/health interval=30s fails=3 passes=2;
    }
}
HAProxy Configuration:
backend wukongim_servers
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server wk1 192.168.1.10:5001 check inter 30s
    server wk2 192.168.1.11:5001 check inter 30s
    server wk3 192.168.1.12:5001 check inter 30s
Container Orchestration
Docker Compose:
version: '3.7'
services:
  wukongim:
    image: registry.cn-shanghai.aliyuncs.com/wukongim/wukongim:v2
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:5001/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    ports:
      - "5001:5001"
Kubernetes Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wukongim
spec:
  replicas: 3
  selector:
    matchLabels:
      app: wukongim
  template:
    metadata:
      labels:
        app: wukongim
    spec:
      containers:
        - name: wukongim
          image: registry.cn-shanghai.aliyuncs.com/wukongim/wukongim:v2
          ports:
            - containerPort: 5001
          livenessProbe:
            httpGet:
              path: /health
              port: 5001
            initialDelaySeconds: 30
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 5001
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
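The probe settings in the Kubernetes example determine how quickly a failing pod is noticed. As a rough estimate that ignores probe timeouts and scheduling jitter, a probe has to fail failureThreshold times in a row, with one attempt per periodSeconds. The sketch below works through that arithmetic for the values shown; the function name estimateDetectionSeconds is hypothetical.

```javascript
// Rough estimate of how long a continuously failing container goes unnoticed:
// the probe must fail `failureThreshold` times in a row, one attempt per period.
// Probe timeouts and scheduling jitter are deliberately ignored here.
function estimateDetectionSeconds({ periodSeconds, failureThreshold }) {
  return periodSeconds * failureThreshold;
}

// Values copied from the liveness and readiness probes in the Deployment above.
const liveness = { periodSeconds: 30, failureThreshold: 3 };
const readiness = { periodSeconds: 10, failureThreshold: 3 };

console.log(estimateDetectionSeconds(liveness));  // 90
console.log(estimateDetectionSeconds(readiness)); // 30
```

With these values a pod stops receiving traffic after about 30 seconds of failed readiness checks, and is restarted after about 90 seconds of failed liveness checks.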
Monitoring and Alerting
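Prometheus Monitoring:
# prometheus.yml
# Note: Prometheus expects the text exposition format at metrics_path.
# If /health returns a plain status body instead, the scrape is recorded as
# failed (up == 0); in that case probe the endpoint through an HTTP prober
# such as the Blackbox Exporter.
scrape_configs:
  - job_name: 'wukongim-health'
    metrics_path: '/health'
    static_configs:
      - targets: ['192.168.1.10:5001', '192.168.1.11:5001', '192.168.1.12:5001']
    scrape_interval: 30s
    scrape_timeout: 10s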
Custom Health Check Script:
#!/bin/bash
# Servers to probe and the webhook that receives alerts
SERVERS=("192.168.1.10:5001" "192.168.1.11:5001" "192.168.1.12:5001")
WEBHOOK_URL="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
for server in "${SERVERS[@]}"; do
    # Capture only the HTTP status code; curl prints 000 when the request fails
    response=$(curl -s -o /dev/null -w "%{http_code}" "http://$server/health" --max-time 10)
    if [ "$response" != "200" ]; then
        # Send alert
        curl -X POST -H 'Content-type: application/json' \
            --data "{\"text\":\"🚨 WuKongIM Health Check Failed: $server returned $response\"}" \
            "$WEBHOOK_URL"
    fi
done
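The script above alerts on every failed check, so one slow response can produce a notification. One way to reduce such noise is to alert only once a server has failed several checks in a row. The following is a minimal sketch of that idea in JavaScript; the class name FailureTracker and its default threshold are assumptions for illustration, not part of WuKongIM.

```javascript
// Hypothetical tracker: signals an alert only when a server reaches
// `threshold` consecutive failed checks, so a single blip is ignored.
class FailureTracker {
  constructor(threshold = 3) {
    this.threshold = threshold;
    this.failures = new Map(); // server -> consecutive failure count
  }

  // Record one check result; returns true exactly when an alert should fire.
  record(server, isHealthy) {
    if (isHealthy) {
      // A successful check resets the streak for this server.
      this.failures.delete(server);
      return false;
    }
    const count = (this.failures.get(server) || 0) + 1;
    this.failures.set(server, count);
    // Fire once, at the moment the threshold is reached, not on every later failure.
    return count === this.threshold;
  }
}
```

A monitoring loop would call record(server, isHealthy) after each check and send the webhook only when it returns true.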
Application Integration
Service Discovery:
class WuKongIMServiceDiscovery {
  constructor(servers) {
    this.servers = servers;
    this.healthyServers = [];
    this.checkInterval = 30000; // 30 seconds
    this.startHealthChecks();
  }
  async checkServerHealth(server) {
    try {
      // fetch has no timeout option; abort the request after 5 seconds instead
      const response = await fetch(`http://${server}/health`, {
        signal: AbortSignal.timeout(5000)
      });
      return response.status === 200;
    } catch (error) {
      console.error(`Health check failed for ${server}:`, error);
      return false;
    }
  }
  async updateHealthyServers() {
    const healthChecks = this.servers.map(async (server) => {
      const isHealthy = await this.checkServerHealth(server);
      return { server, isHealthy };
    });
    const results = await Promise.all(healthChecks);
    this.healthyServers = results
      .filter(result => result.isHealthy)
      .map(result => result.server);
    console.log('Healthy servers:', this.healthyServers);
  }
  startHealthChecks() {
    this.updateHealthyServers();
    setInterval(() => {
      this.updateHealthyServers();
    }, this.checkInterval);
  }
  getHealthyServer() {
    if (this.healthyServers.length === 0) {
      throw new Error('No healthy WuKongIM servers available');
    }
    // Random selection among the healthy servers
    const server = this.healthyServers[Math.floor(Math.random() * this.healthyServers.length)];
    return server;
  }
}
// Usage
const discovery = new WuKongIMServiceDiscovery([
  '192.168.1.10:5001',
  '192.168.1.11:5001',
  '192.168.1.12:5001'
]);
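The getHealthyServer method in the class above picks a server with Math.random(), which spreads load evenly only on average. If strict rotation is preferred, a round-robin selector could be sketched as follows. The class name RoundRobinSelector is hypothetical, and the sketch assumes the caller passes in the current list of healthy servers.

```javascript
// Hypothetical round-robin selector over a list of healthy servers.
class RoundRobinSelector {
  constructor() {
    this.nextIndex = 0; // position of the server to hand out next
  }

  // Returns the next server in rotation; throws if the list is empty.
  next(healthyServers) {
    if (healthyServers.length === 0) {
      throw new Error('No healthy WuKongIM servers available');
    }
    // The modulo keeps the index valid when the healthy list shrinks between calls.
    const server = healthyServers[this.nextIndex % healthyServers.length];
    this.nextIndex = (this.nextIndex + 1) % healthyServers.length;
    return server;
  }
}
```

For example, a caller could hold one selector instance and invoke selector.next(discovery.healthyServers) for each outgoing request.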
Best Practices
Monitoring Frequency: Check health status every 30-60 seconds.
Timeout Settings: Set timeouts long enough to tolerate brief latency spikes, so that slow responses do not trigger false alarms.
Load Balancing: Point load balancer health checks at this endpoint so that unhealthy nodes are taken out of rotation.
Container Orchestration: Use the endpoint for Docker health checks and Kubernetes liveness and readiness probes.
Alerting Mechanism: Integrate with monitoring systems for automated alerting.
Graceful Degradation: Implement fallback mechanisms for when health checks fail.
Circuit Breaker: Use the circuit breaker pattern to stop sending requests to unhealthy services.
Logging: Log health check results for troubleshooting and analysis.
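The circuit breaker practice listed above can be illustrated with a small state holder. The following is a minimal sketch under stated assumptions: the class name CircuitBreaker, its option names, and the default values are all illustrative, and the clock is injectable so the behaviour can be tested without waiting.

```javascript
// Minimal circuit breaker sketch. All names and defaults are illustrative.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;        // injectable clock, handy for tests
    this.failures = 0;     // consecutive failures seen so far
    this.openedAt = null;  // time the circuit opened, or null while closed
  }

  // True when requests may be sent to the server.
  canRequest() {
    if (this.openedAt === null) return true;
    // After the cooldown, let a trial request through (half-open state).
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  // A successful request closes the circuit and clears the failure streak.
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failed request extends the streak and opens (or re-opens) the circuit
  // once the threshold is reached.
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}
```

A client would keep one breaker per server, skip servers whose canRequest() returns false, and feed each health check or request outcome back through recordSuccess() or recordFailure().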