Troubleshooting
Common Elasticsearch errors and fixes — vm.max_map_count, shard allocation, license issues, analyzer errors, disk watermarks, and cluster recovery.
25 min total: 15 min reading, 10 min lab
vm.max_map_count Is Too Low
Error
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
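Elasticsearch uses mmap to access index files efficiently, which requires a large number of memory-mapped areas. The default Linux limit (65530 in the error above) is below the 262144 minimum that the bootstrap check enforces.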
Fix
# Temporary (resets on reboot)
sudo sysctl -w vm.max_map_count=262144
# Permanent
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
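To confirm the change took effect, the current value can be compared against the minimum from the error message. The following is a minimal sketch; the function name check_max_map_count is a hypothetical helper, not part of Elasticsearch or sysctl.

```shell
# Hypothetical helper: compare a vm.max_map_count value against the
# minimum named in the bootstrap error (262144 by default).
check_max_map_count() {
  local current required
  current=$1
  required=${2:-262144}
  if [ "$current" -ge "$required" ]; then
    echo "ok: vm.max_map_count=$current"
  else
    echo "too low: vm.max_map_count=$current (need >= $required)"
    return 1
  fi
}

# Usage on a Linux host (reads the live kernel value):
#   check_max_map_count "$(sysctl -n vm.max_map_count)"
```

The function takes the value as an argument rather than reading it directly, so it can also be pointed at a value collected from a remote host or a container.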
For Docker Desktop on macOS/Windows, set it in the Docker VM:
# Docker Desktop
docker run --privileged --pid=host -it alpine nsenter -t 1 -m -u -n -i \
sysctl -w vm.max_map_count=262144
Unsupported License Type
Error
{
"error": "unrecognised license type [trail]"
}
Cause
Typo in the license type. The correct value is trial, not trail.
Supported License Types
| License | Features |
|---|---|
| basic | Free tier, core features |
| standard | Basic security features |
| gold | Monitoring, alerting |
| platinum | ML, advanced security |
| enterprise | All features |
| trial | 30-day trial of all features |
Fix
curl -X POST "localhost:9200/_license/start_trial?acknowledge=true&pretty"
Or set in elasticsearch.yml:
xpack.license.self_generated.type: basic
Analyzer Not Found
Error
{
"type": "mapper_parsing_exception",
"reason": "analyzer [autocomplete_analyzer] has not been defined in the mapping"
}
Cause
A custom analyzer is referenced in mappings but was never defined in the index settings.
Fix
Custom analyzers must be defined in settings.analysis at index creation time. You cannot add them to an existing index without closing it first.
curl -X PUT "localhost:9200/my-index" \
-H 'Content-Type: application/json' -d'
{
"settings": {
"analysis": {
"analyzer": {
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "edge_ngram_tokenizer",
"filter": ["lowercase"]
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 10,
"token_chars": ["letter", "digit"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "autocomplete_analyzer"
}
}
}
}'
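To see roughly what the analyzer above produces, the sketch below emulates edge n-gram generation for a single word in plain shell. It is an approximation for illustration only: edge_ngrams is a hypothetical function, it does not call Elasticsearch, and it ignores the token_chars splitting that the real tokenizer performs.

```shell
# Hypothetical emulation of an edge_ngram tokenizer plus lowercase filter
# for ONE word. Defaults mirror the index settings above
# (min_gram=2, max_gram=10).
edge_ngrams() {
  local word min max len n
  # Emulate the "lowercase" token filter
  word=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  min=${2:-2}
  max=${3:-10}
  len=${#word}
  # A prefix cannot be longer than the word itself
  if [ "$max" -gt "$len" ]; then max=$len; fi
  n=$min
  while [ "$n" -le "$max" ]; do
    printf '%s\n' "$word" | cut -c "1-$n"
    n=$((n + 1))
  done
}

# Example: edge_ngrams Elastic
# prints el, ela, elas, elast, elasti, elastic (one per line)
```

Each prefix becomes a separate indexed term, which is why a query for "ela" can match a document containing "Elastic". Words shorter than min_gram produce no terms at all.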
Shard Allocation Failures
Error
{
"index": "my-index",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned.reason": "CLUSTER_RECOVERED"
}
Diagnose
# View unassigned shards
curl -s "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED
# Get detailed allocation explanation
curl -s "localhost:9200/_cluster/allocation/explain?pretty"
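When many shards are unassigned, a per-reason count is often easier to read than the raw list. The sketch below assumes the column order from the _cat/shards command above (index, shard, prirep, state, unassigned.reason); count_unassigned_reasons is a hypothetical helper name.

```shell
# Hypothetical helper: read _cat/shards output on stdin and print
# "<count> <reason>" for unassigned shards, most frequent reason first.
# Assumes columns: index shard prirep state unassigned.reason
count_unassigned_reasons() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print count[r], r }' |
    sort -k1,1nr -k2,2
}

# Usage against a live cluster:
#   curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
#     | count_unassigned_reasons
```

The header row and assigned shards are skipped because their fourth column is not UNASSIGNED.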
Common Causes and Fixes
| Reason | Cause | Fix |
|---|---|---|
| NODE_LEFT | Node went offline | Restart the node or wait for recovery |
| ALLOCATION_FAILED | Corrupted shard data | Delete the index and restore from snapshot |
| INDEX_CREATED | Not enough nodes for replicas | Add nodes or reduce number_of_replicas |
| CLUSTER_RECOVERED | Post-restart rebalancing | Wait; Elasticsearch will recover automatically |
| DISK_THRESHOLD | Disk watermark exceeded | Free disk space or adjust watermarks |
Force Reroute (Last Resort)
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true&pretty"
Cluster Red Status
Diagnose
# Check which indices are red
curl -s "localhost:9200/_cat/indices?v&health=red"
# Check cluster health details
curl -s "localhost:9200/_cluster/health?level=indices&pretty"
# View unassigned shards
curl -s "localhost:9200/_cat/shards?v" | grep UNASSIGNED
Common Fixes
1. Missing nodes: Start the offline nodes
2. Disk full: Free space, then clear read-only blocks
3. Corrupted index: Delete and restore from snapshot
4. Not enough nodes for shards: Reduce replica count
# Reduce replicas for a red index (replace red-index with the index name; repeat per index)
curl -X PUT "localhost:9200/red-index/_settings" \
-H 'Content-Type: application/json' -d'
{
"index.number_of_replicas": 0
}'
Disk Watermark Exceeded (Read-Only Index)
Error
{
"type": "cluster_block_exception",
"reason": "index [my-index] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)]"
}
Fix
# 1. Free disk space (delete old indices, logs, etc.)
# 2. Remove the read-only block
curl -X PUT "localhost:9200/my-index/_settings" \
-H 'Content-Type: application/json' -d'
{
"index.blocks.read_only_allow_delete": null
}'
# Alternatively, remove the block from ALL indices at once
curl -X PUT "localhost:9200/_all/_settings" \
-H 'Content-Type: application/json' -d'
{
"index.blocks.read_only_allow_delete": null
}'
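Before removing the block, it helps to know how close the disk still is to the watermarks. The sketch below classifies a disk usage percentage against the Elasticsearch default watermarks (low 85%, high 90%, flood stage 95%); the read-only block shown above is applied at the flood stage. watermark_state is a hypothetical helper, the comparison uses whole percentages, and boundary handling is approximate.

```shell
# Hypothetical helper: classify disk usage (integer percent) against
# disk watermarks. Defaults are the Elasticsearch default thresholds;
# pass other values if your cluster overrides them.
watermark_state() {
  local used low high flood
  used=$1
  low=${2:-85}
  high=${3:-90}
  flood=${4:-95}
  if [ "$used" -ge "$flood" ]; then
    echo "flood_stage"   # indices get the read_only_allow_delete block
  elif [ "$used" -ge "$high" ]; then
    echo "high"          # shards are relocated away from the node
  elif [ "$used" -ge "$low" ]; then
    echo "low"           # no new shards are allocated to the node
  else
    echo "ok"
  fi
}

# Usage (assumes GNU df and a data path of /var/lib/elasticsearch):
#   watermark_state "$(df --output=pcent /var/lib/elasticsearch | tail -1 | tr -dc '0-9')"
```

If the result is still flood_stage after cleanup, removing the block will not help for long, because Elasticsearch reapplies it while usage stays above that threshold.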
Connection Refused
Error
curl: (7) Failed to connect to localhost port 9200: Connection refused
Checklist
# 1. Is Elasticsearch running?
docker ps | grep elasticsearch
# or
systemctl status elasticsearch
# 2. Check logs for startup errors
docker logs elasticsearch
# or
tail -100 /var/log/elasticsearch/elasticsearch.log
# 3. Check if the port is bound
ss -tlnp | grep 9200
# 4. Check network.host setting
# If set to 0.0.0.0, ensure bootstrap checks pass
# If set to specific IP, curl that IP instead of localhost
Out of Memory (OOM)
Symptoms
- Elasticsearch process killed
- java.lang.OutOfMemoryError: Java heap space in logs
- Node disappears from cluster
Fix
# Check current heap settings
curl -s "localhost:9200/_nodes/stats/jvm?pretty" | grep heap
# Set heap size (50% of available RAM, max 31GB)
# In jvm.options:
-Xms4g
-Xmx4g
# For Docker:
environment:
ES_JAVA_OPTS: "-Xms4g -Xmx4g"
Rules:
- Always set Xms equal to Xmx
- Never exceed 50% of available RAM
- Never exceed 31GB (compressed oops limit)
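The rules above reduce to a small calculation: take half of the machine's RAM and cap it at 31GB. The sketch below works in megabytes; recommended_heap_mb and heap_flags are hypothetical helper names.

```shell
# Hypothetical helper: heap size in MB for a machine with the given
# total RAM in MB. Applies the two limits above: 50% of RAM, max 31GB.
recommended_heap_mb() {
  local total_mb half cap
  total_mb=$1
  half=$((total_mb / 2))
  cap=$((31 * 1024))
  if [ "$half" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$half"
  fi
}

# Emit matching Xms/Xmx flags so the two values are always equal.
heap_flags() {
  local heap
  heap=$(recommended_heap_mb "$1")
  echo "-Xms${heap}m -Xmx${heap}m"
}

# Example: heap_flags 8192   (an 8GB machine)
```

The output can be pasted into ES_JAVA_OPTS or split across two lines in jvm.options.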
Circuit Breaker Triggered
Error
{
"type": "circuit_breaking_exception",
"reason": "[parent] Data too large"
}
Fix
# Check circuit breaker stats
curl -s "localhost:9200/_nodes/stats/breaker?pretty"
# Temporarily increase (not recommended long-term)
curl -X PUT "localhost:9200/_cluster/settings" \
-H 'Content-Type: application/json' -d'
{
"persistent": {
"indices.breaker.total.limit": "95%"
}
}'
Better solutions:
- Increase JVM heap size
- Reduce the number of shards
- Optimize queries (avoid deep aggregations)
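To judge how close a breaker is to tripping, compare its estimated size with its limit. The breaker stats response reports both as byte counts (estimated_size_in_bytes and limit_size_in_bytes); the sketch below turns a pair of such values into a whole-number percentage. breaker_usage_pct is a hypothetical helper, and extracting the two numbers from the JSON is left to a tool such as jq.

```shell
# Hypothetical helper: percentage of a circuit breaker limit in use.
# Arguments: estimated size in bytes, limit size in bytes.
breaker_usage_pct() {
  local estimated limit
  estimated=$1
  limit=$2
  if [ "$limit" -le 0 ]; then
    echo "limit must be positive" >&2
    return 1
  fi
  # Integer division: the result is rounded down
  echo $((estimated * 100 / limit))
}

# Example: breaker_usage_pct 900 1000
```

A parent breaker that sits near 100% under normal load points to an undersized heap rather than a single bad query.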
Slow Startup After Crash
Cause
Elasticsearch is replaying the transaction log (translog) to recover uncommitted writes.
Monitor Recovery
# Check recovery progress
curl -s "localhost:9200/_cat/recovery?v&active_only=true"
# Check pending tasks
curl -s "localhost:9200/_cat/pending_tasks?v"
Speed Up Recovery
# Increase recovery speed (temporarily)
curl -X PUT "localhost:9200/_cluster/settings" \
-H 'Content-Type: application/json' -d'
{
"transient": {
"cluster.routing.allocation.node_concurrent_recoveries": 4,
"indices.recovery.max_bytes_per_sec": "200mb"
}
}'
Quick Reference
| Error | Quick Fix |
|---|---|
| vm.max_map_count too low | sysctl -w vm.max_map_count=262144 |
| unrecognised license type | Use trial not trail |
| analyzer not defined | Add to settings.analysis at creation |
| Unassigned shards | Check allocation explain API |
| Cluster red | Check _cat/indices?health=red |
| Read-only index | Set read_only_allow_delete to null |
| Connection refused | Check if ES is running, check logs |
| OOM killed | Set heap to 50% RAM, max 31GB |
| Circuit breaker | Increase heap or optimize queries |
| Slow recovery | Increase max_bytes_per_sec |
Lab: Diagnose and Fix Issues
1. Start a cluster and deliberately fill disk to trigger watermarks
2. Remove the read-only block after freeing space
3. Stop a node and observe shard reallocation
4. Use _cluster/allocation/explain to debug unassigned shards
5. Check circuit breaker stats under load
Next Steps
- Cluster Monitoring — proactive health monitoring
- Backup & Restore — disaster recovery