Mohammed.Rayan
AppDynamics Team

If an Events Service cluster is stuck in a RED state and it's due to unassigned shards, you will need to identify the shards, determine the cause, and re-allocate them.

 

"events-service-api-store / elasticsearch-singlenode-module" : {
	"healthy" : false,
	"message" : "Current [appdynamics-events-service-cluster] cluster state: [RED], data nodes: [1], nodes: [1], active shards: [3], relocating shards: [0], initializing shards: [0], unassigned shards: [8], timed out: [false]"
	}
}

 

 

How do I troubleshoot the issue's cause?

  1. Use the following command to list the unassigned shards and the reason each one is unassigned:
curl -XGET 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

 

Elasticsearch gives the following reasons for a shard to be in an unassigned state:

INDEX_CREATED

Unassigned as a result of an API creation of an index.

CLUSTER_RECOVERED

Unassigned as a result of a full cluster recovery.

INDEX_REOPENED

Unassigned as a result of opening a closed index.

DANGLING_INDEX_IMPORTED

Unassigned as a result of importing a dangling index.

NEW_INDEX_RESTORED

Unassigned as a result of restoring into a new index.

EXISTING_INDEX_RESTORED

Unassigned as a result of restoring into a closed index.

REPLICA_ADDED

Unassigned as a result of explicit addition of a replica.

ALLOCATION_FAILED

Unassigned as a result of a failed allocation of the shard.

NODE_LEFT

Unassigned as a result of the node hosting it leaving the cluster.

REROUTE_CANCELLED

Unassigned as a result of explicit cancel reroute command.

REINITIALIZED

When a shard moves from started back to initializing, for example, with shadow replicas.

REALLOCATED_REPLICA

A better replica location is identified and causes the existing replica allocation to be canceled.
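When many shards are unassigned, it can help to see how often each reason occurs. The following is a minimal sketch, not part of the original procedure: summarize_unassigned is a hypothetical helper name, and it assumes the five columns requested in the command above (index, shard, prirep, state, unassigned.reason).

```shell
# Summarize unassigned shards by reason.
# Reads _cat/shards rows (index shard prirep state unassigned.reason) on stdin
# and prints one "REASON COUNT" line per reason, sorted by reason name.
summarize_unassigned() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print r, count[r] }' | sort
}

# Example against a live node (assumes HTTP is reachable on localhost:9200):
# curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | summarize_unassigned
```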

 

  2. Use the following health check, shards, and indices commands to determine the number of shards and which of them are unassigned:
curl 'http://<events_service_machine>:9081/healthcheck?pretty=true'
curl 'http://<events_service_machine>:9200/_cat/shards?v'
curl 'http://<events_service_machine>:9200/_cat/indices?v'

 

Output from the above health check commands:

 "events-service-api-store / elasticsearch-singlenode-module" : {
    "healthy" : false,
    "message" : "Current [appdynamics-events-service-cluster] cluster state: [RED], data nodes: [1], nodes: [1], active shards: [3], relocating shards: [0], initializing shards: [0], unassigned shards: [12], timed out: [false]"
  }
}

appdynamics_meters_v2        0 p UNASSIGNED CLUSTER_RECOVERED
appdynamics_meters_v2        1 p UNASSIGNED CLUSTER_RECOVERED
appdynamics_api_keys_v1      0 p UNASSIGNED CLUSTER_RECOVERED
appdynamics_api_keys_v1      1 p UNASSIGNED CLUSTER_RECOVERED
appdynamics_accounts         0 p UNASSIGNED CLUSTER_RECOVERED
appdynamics_accounts         1 p UNASSIGNED CLUSTER_RECOVERED
event_type_metadata          0 p UNASSIGNED CLUSTER_RECOVERED
event_type_metadata          1 p UNASSIGNED CLUSTER_RECOVERED
event_type_extracted_fields  0 p UNASSIGNED CLUSTER_RECOVERED
event_type_extracted_fields  1 p UNASSIGNED CLUSTER_RECOVERED
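The health check message shown above can be reduced to the two values that matter here, the cluster state and the unassigned shard count. This is a rough sketch under the assumption that the message appears on a single line in the format shown; parse_health is a hypothetical helper name.

```shell
# Extract "STATE UNASSIGNED_COUNT" from healthcheck text on stdin.
# Assumes the message format shown in the sample output above.
parse_health() {
  sed -n 's/.*cluster state: \[\([A-Z]*\)\].*unassigned shards: \[\([0-9]*\)\].*/\1 \2/p'
}

# Example against a live Events Service (placeholder host):
# curl -s 'http://<events_service_machine>:9081/healthcheck?pretty=true' | parse_health
```

Running this before and after reallocation gives a quick way to confirm that the unassigned count is going down.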

 

How do I reallocate the shards?

  1. The solution is to manually allocate the shards using the reroute API provided by Elasticsearch.

    In the above output example, there are 10 unassigned shards: each shard must be separately allocated.
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "appdynamics_accounts",
      "shard" : 0,
      "node" : "node-12345798-7867-4ea1-b546-557a6bc546afg",
      "allow_primary" : true
    }
  } ]
}'
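Because the same request body is repeated for every shard, it may be convenient to generate it. The sketch below is an illustration only: reroute_body is a hypothetical helper, and my-node in the usage comment is a placeholder for your own node name or IP address.

```shell
# Print a reroute request body for one shard.
# Arguments: index name, shard number, node name or IP, allow_primary (true|false).
reroute_body() {
  printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":%s}}]}\n' \
    "$1" "$2" "$3" "$4"
}

# Example (my-node is a placeholder):
# curl -XPOST 'localhost:9200/_cluster/reroute' -d "$(reroute_body appdynamics_accounts 1 my-node true)"
```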

 

To allocate all unassigned shards in one pass, replace the node value with the IP address or name of your node, and then run the appropriate script below.

 

For replica shards (allow_primary set to false):

curl 'http://localhost:9200/_cat/shards?v' 2>/dev/null | grep UNASSIGNED | sed -e 's/\([^ ]*\) *\([^ ]*\) *.*/\1 \2/' | while read line
do
indexname=`echo $line | cut -d" " -f1`
shardid=`echo $line | cut -d" " -f2`
echo $indexname $shardid
curl -XPOST 'http://localhost:9200/_cluster/reroute' -d "{
\"commands\": [{
\"allocate\": {
\"index\": \"$indexname\",
\"shard\": $shardid,
\"node\": \"10.90.3.125\",
\"allow_primary\": false
}
}]
}"
done

 

For Primary shards:

curl 'http://localhost:9200/_cat/shards?v' 2>/dev/null | grep UNASSIGNED | sed -e 's/\([^ ]*\) *\([^ ]*\) *.*/\1 \2/' | while read line
do
indexname=`echo $line | cut -d" " -f1`
shardid=`echo $line | cut -d" " -f2`
echo $indexname $shardid
curl -XPOST 'http://localhost:9200/_cluster/reroute' -d "{
\"commands\": [{
\"allocate\": {
\"index\": \"$indexname\",
\"shard\": $shardid,
\"node\": \"10.90.3.125\",
\"allow_primary\": true
}
}]
}"
done
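
Both loops above act on every UNASSIGNED row, whether the shard is a primary or a replica. If you want to handle one type at a time, a filter on the prirep column is one option. This is a sketch under the assumption that the input has the columns index, shard, prirep, state in that order (for example from _cat/shards?h=index,shard,prirep,state); unassigned_of_type is a hypothetical helper name.

```shell
# Print "index shard" for unassigned shards of one type.
# $1 is "p" (primary) or "r" (replica); rows (index shard prirep state) come from stdin.
unassigned_of_type() {
  awk -v want="$1" '$3 == want && $4 == "UNASSIGNED" { print $1, $2 }'
}

# Example against a live node (assumes HTTP is reachable on localhost:9200):
# curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state' | unassigned_of_type p
```

The output has the same "index shard" shape that the while loops read, so it could stand in for the grep and sed stages.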
Comments
Jesse.Deibele
Voyager

Here is a PowerShell equivalent if you are running on Windows and don't want to install curl:

 

 

 

$postParams=@'
{
	"commands": [{
		"allocate": {
			"index": "appdynamics_accounts",
			"shard": 0,
			"node": "node-fa635229-8114-4ff8-a3b6-eee4449fkdsaaaa",
			"allow_primary": true
		}
	}]
}
'@

Invoke-WebRequest -Uri http://localhost:9200/_cluster/reroute -ContentType "application/json" -Method POST -Body $postParams

Don't forget: you may need to enable the HTTP property (ad.es.node.http.enabled=true) in the "events-service-api-store.properties" file under events-service/processor/conf to allow access on port 9200.
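
On Linux, a quick way to check whether that property is already set is to search the properties file for it. This is only a sketch: http_enabled is a hypothetical helper name, and the path in the usage comment assumes you run it from the directory that contains events-service.

```shell
# Succeed (exit status 0) if the properties file enables the Elasticsearch HTTP port.
# $1 is the path to events-service-api-store.properties.
http_enabled() {
  grep -Eq '^[[:space:]]*ad\.es\.node\.http\.enabled[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Example (relative path is an assumption about your install location):
# http_enabled events-service/processor/conf/events-service-api-store.properties && echo "HTTP enabled"
```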

Version history
Last update:
‎05-14-2020 09:26 PM