
How to troubleshoot an Events Service process hung during startup

Symptoms

 

After the Events Service is started, events data for databases is not reported. The following error message appears in the Controller UI under the Databases tab.

 

database_screenshot.png

 

 

Additional troubleshooting in the Controller's server.log file indicates that the Events Service process has stopped.

 

0500|SEVERE|glassfish3.1.2|com.singularity.ee.controller.beans.ExceptionHandlingInterceptor|_ThreadID=120;_ThreadName=Thread-5;|Encountered runtime exception com.appdynamics.analytics.shared.rest.exceptions.ClientException: Could not execute request to http://localhost:9080/v2/events/dbmon-wait-time
at com.appdynamics.analytics.shared.rest.client.utils.GenericHttpRequestBuilder.getResponse(GenericHttpRequestBuilder.java:224)
at com.appdynamics.analytics.shared.rest.client.utils.GenericHttpRequestBuilder.executeAndReturnRawResponseString(GenericHttpRequestBuilder.java:238)
at com.appdynamics.analytics.shared.rest.client.eventservice.DefaultEventServiceClient.registerEventType(DefaultEventServiceClient.java:132)

 

 

Diagnosis and Troubleshooting

 

1. Find the process ID (the second column) of the Events Service in the output of the ps -ef | grep -i events-service command.

52676 11537 1 0 03:58 pts/3 00:00:11 /opt/AppDynamics/Controller/jre/bin/java -Xmx6144m -Xms6144m -Xss256k -Djava.net.preferIPv4Stack=true -Dfile.encoding=UTF-8 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -verbose:gc -XX:GCLogFileSize=256m -XX:NumberOfGCLogFiles=4 -XX:+UseGCLogFileRotation -Xloggc:/opt/AppDynamics/Controller/events_service/bin/../logs/events-service-api-store-gc.log -DAPPLICATION_HOME=/opt/AppDynamics/Controller/events_service/bin/.. -classpath /opt/AppDynamics/Controller/events_service/bin/../lib/* com.appdynamics.analytics.processor.AnalyticsService -p /opt/AppDynamics/Controller/events_service/conf/events-service-api-store.properties -y /opt/AppDynamics/Controller/events_service/bin/../conf/events-service-api-store.yml
52676 19976 9765 0 04:38 pts/3 00:00:00 grep events-service
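
As a sketch of step 1, the PID can be extracted from the ps -ef output with a small filter. The columns of ps -ef are UID, PID, PPID, C, STIME, TTY, TIME, and CMD, so the PID is the second field. The function name events_service_pid is hypothetical, and matching on the AnalyticsService main class is an assumption based on the sample output above.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every line whose command mentions the Events Service
# main class. The [A] bracket keeps the filter's own awk process, whose
# command line contains the literal pattern text, from matching itself.
events_service_pid() {
  awk '/[A]nalyticsService/ { print $2 }'
}
```

Usage: ps -ef | events_service_pid. For the sample output above, this prints 11537.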

 

2. Check the health state of the Events Service. In this case, the Events Service did not respond to the request.

 

curl "http://<event-service-host>:9081/healthcheck?pretty=true"
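
Because a hung service may never answer, it can help to bound the request time. The following is a minimal sketch, assuming curl is installed; the function names and the 10-second default are hypothetical, and port 9081 is taken from the command above.

```shell
# Hypothetical helper: build the health check URL for a host.
# The default port 9081 is the one used in the curl command above.
healthcheck_url() {
  host=$1
  port=${2:-9081}
  printf 'http://%s:%s/healthcheck?pretty=true\n' "$host" "$port"
}

# Hypothetical helper: returns 0 if the endpoint answers with an HTTP
# status below 400 within the time limit (third argument, default 10 s),
# and non-zero if the request errors or times out.
check_events_service_health() {
  curl --silent --show-error --fail --max-time "${3:-10}" \
    "$(healthcheck_url "$1" "${2:-9081}")"
}
```

Usage: check_events_service_health <event-service-host> || echo "no healthy response". The URL is quoted so that the shell does not treat the ? character as a wildcard.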

 

3. Check whether the host and port of the Events Service are bound correctly using the netstat command. In this case, the command returned no output, which means that nothing was bound to port 9080.

 

netstat -anp | grep 9080
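
A plain grep 9080 also matches other ports that contain those digits, such as 19080. The sketch below checks for a socket in LISTEN state on exactly the given port. It assumes the Linux net-tools column layout (local address in column 4, state in column 6); the function name is hypothetical.

```shell
# Hypothetical helper: read `netstat -an` style output on stdin and succeed
# only if some socket is in LISTEN state on exactly the given local port.
port_is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}
```

Usage: netstat -anp | port_is_listening 9080 && echo "bound" || echo "not bound".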

 

4. If the process hung during startup, use the kill -3 <PID> command to capture thread dumps, for example kill -3 11537 for the ps output above (52676 in the first column is the user ID, not the PID).

 

Note: Java thread dump output goes to stdout, so it is written to the nohup.out file that captures the stdout of the process.
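
Taking several dumps a few seconds apart makes it easier to see whether a thread stays in the same place. The following is a minimal sketch; the function name, the default of three dumps, and the 10-second interval are assumptions, not values from the product documentation.

```shell
# Hypothetical helper: send SIGQUIT (kill -3) to a JVM several times.
# Each signal makes the JVM print one thread dump to its stdout; the JVM
# keeps running. Only use this on a Java process: most other programs
# terminate on SIGQUIT.
# Usage: capture_thread_dumps PID [COUNT] [INTERVAL_SECONDS]
capture_thread_dumps() {
  pid=$1
  count=${2:-3}
  interval=${3:-10}
  i=1
  while [ "$i" -le "$count" ]; do
    kill -3 "$pid" || return 1    # stop if the process no longer exists
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}
```

Usage: capture_thread_dumps <PID>, then read the dumps from nohup.out.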

 

5. Within the thread dumps, find the hung thread. In this case, the "main" thread is stuck in a native file system call.

 

Example:

 

"main" #1 prio=5 os_prio=0 tid=0x00007fdb84011000 nid=0xe280 runnable [0x00007fdb88425000]
java.lang.Thread.State: RUNNABLE
at sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
at sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:286)
at sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
at sun.nio.fs.UnixFileStore.devFor(UnixFileStore.java:55)
at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:70)
at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:48)
at sun.nio.fs.LinuxFileSystem.getFileStore(LinuxFileSystem.java:112)
at sun.nio.fs.UnixFileSystem$FileStoreIterator.readNext(UnixFileSystem.java:213)
at sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext(UnixFileSystem.java:224)
- locked <0x000000065610cb50> (a sun.nio.fs.UnixFileSystem$FileStoreIterator)
at org.elasticsearch.env.NodeEnvironment.getFileStore(NodeEnvironment.java:267)
at org.elasticsearch.env.NodeEnvironment.access$000(NodeEnvironment.java:62)
at org.elasticsearch.env.NodeEnvironment$NodePath.<init>(NodeEnvironment.java:75)
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:140)
at org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:165)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.node.NodeBuilder.node(NodeBuilder.java:166)
at com.appdynamics.analytics.processor.elasticsearch.node.single.ElasticSearchSingleNode.<init>(ElasticSearchSingleNode.java:49)
at com.appdynamics.analytics.processor.elasticsearch.node.single.ElasticSearchSingleNode$$FastClassByGuice$$7b182632.newInstance()
at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:60)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1031)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.Scopes$1$1.get(Scopes.java:65)
- locked <0x00000006534a90f0> (a java.lang.Class for com.google.inject.internal.InternalInjectorCreator)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:40)
at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:53)
at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:110)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:75)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:73)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
at com.google.inject.internal.MembersInjectorImpl.injectAndNotify(MembersInjectorImpl.java:73)
at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:60)
at com.google.inject.internal.InjectorImpl.injectMembers(InjectorImpl.java:944)
at com.appdynamics.common.framework.Loaders.internalPrepareAndPreStart(Loaders.java:181)
at com.appdynamics.common.framework.Loaders.loadAndInitializeModules(Loaders.java:127)
at com.appdynamics.common.framework.AbstractApp.run(AbstractApp.java:311)
at com.appdynamics.common.framework.AbstractApp.run(AbstractApp.java:59)
at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:42)
at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:76)
at io.dropwizard.cli.Cli.run(Cli.java:70)
at io.dropwizard.Application.run(Application.java:72)
at com.appdynamics.common.framework.AbstractApp.callRunServer(AbstractApp.java:267)
at com.appdynamics.common.framework.AbstractApp.runUsingFile(AbstractApp.java:261)
at com.appdynamics.common.framework.AbstractApp.runUsingTemplate(AbstractApp.java:248)
at com.appdynamics.common.framework.AbstractApp.runUsingTemplate(AbstractApp.java:167)
at com.appdynamics.analytics.processor.AnalyticsService.main(AnalyticsService.java:71)
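
In a long nohup.out file, it can be convenient to print only the stack of one thread. The sketch below assumes the usual HotSpot thread dump layout, in which each thread block starts with the quoted thread name and ends at a blank line; the function name is hypothetical.

```shell
# Hypothetical helper: read a thread dump on stdin and print the first
# block whose header line starts with the quoted thread name.
# Usage: print_thread_stack NAME < nohup.out
print_thread_stack() {
  awk -v name="\"$1\"" '
    index($0, name) == 1 { printing = 1 }   # header line: "NAME" ...
    printing && NF == 0  { exit }           # blank line ends the block
    printing             { print }
  '
}
```

Usage: print_thread_stack main < nohup.out. A "main" thread that shows the same frames in consecutive dumps, as in the example above, has not made progress between them.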

 

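6. Analyze the stack trace. Here, the "main" thread is stuck in a native stat call (UnixNativeDispatcher.stat0) while Elasticsearch iterates over the file stores mounted on the host, which points to an unresponsive mounted file system. Working with the operating system administrator, look for an issue with a Network File System (NFS) mount.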

 

7. Confirm that the NFS mount is hung. In this case, the mount hung because the NFS server had been migrated to a new host.
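
One way to confirm a hung mount is to probe each NFS mount point with a time limit. The following is a minimal sketch, assuming a Linux host with /proc/mounts and the coreutils timeout and stat commands; the function names and the 5-second default are hypothetical. On some hard mounts the probe itself may block longer than the limit.

```shell
# Hypothetical helper: read /proc/mounts style lines on stdin and print the
# mount point (second field) of every file system of type nfs or nfs4.
nfs_mount_points() {
  awk '$3 ~ /^nfs4?$/ { print $2 }'
}

# Hypothetical helper: succeed if stat on the path returns within the time
# limit (second argument, default 5 s). timeout sends SIGTERM at the limit
# and SIGKILL one second later.
mount_responds() {
  timeout -k 1 "${2:-5}" stat -- "$1" >/dev/null 2>&1
}

# Report every NFS mount on this host that fails the probe.
report_hung_nfs_mounts() {
  nfs_mount_points < /proc/mounts | while IFS= read -r mnt; do
    mount_responds "$mnt" || echo "no response: $mnt"
  done
}
```

Usage: report_hung_nfs_mounts. No output means that every NFS mount answered within the limit.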

 

 

Solution

Unmounting the hung NFS file system and remounting it from the NFS server at the correct mount point resolves the issue.
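
The exact commands depend on the site. As an illustration only, the sketch below prints, without running them, a forced lazy unmount followed by a mount from the new server. The function name and all three arguments (server, export path, mount point) are hypothetical; -f and -l are the force and lazy options of the Linux umount command.

```shell
# Hypothetical helper: print (not run) the commands for replacing a hung
# NFS mount. Arguments: NFS_SERVER EXPORT_PATH MOUNT_POINT, all site-specific.
print_nfs_remount_commands() {
  printf 'umount -f -l %s\n' "$3"
  printf 'mount -t nfs %s:%s %s\n' "$1" "$2" "$3"
}
```

Usage: print_nfs_remount_commands <new-nfs-host> <export-path> <mount-point>. Review the printed commands with the operating system administrator before running them as root.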

 

Version history
Revision #: 1 of 1
Last update: 06-28-2017 11:26 AM