Sayantan.Mitra
AppDynamics Team

Resolving Issues with Missing Hardware and Custom Metrics in Server Visibility Agents


SIM Machine Agents, also known as Server Visibility agents, publish hardware metrics for the nodes or servers that host applications. One SIM Machine Agent may correlate to multiple APM agents. SIM Machine Agents may also host custom extensions that use the Machine Agent to piggyback custom metrics to the controller. In both scenarios, these metrics are pivotal for application-to-infrastructure correlation or for custom application monitoring via extensions. Losing metrics from the Machine Agent means losing monitoring, which can lead to revenue loss if it is not detected and remediated in time.

Metrics can go missing for various reasons, such as exceeding the metric limits of the agent or the controller, loss of connectivity, the Machine Agent process being starved of CPU cycles, or memory optimization issues in the Machine Agent. This article, however, covers one specific case: the Machine Agent is running on the server and the controller shows the non-metric data (total vCPU count, total memory, and other details such as server tags, which are static properties rather than values that vary over time), but not the hardware or custom metrics that the agent should be reporting.

 

 Metric limits being hit

Metric limits exist on both the controller side and the agent side. By default, the Machine Agent (MA) can publish a maximum of 450 metrics, which may not be sufficient if many volumes, networks, or process classes are configured for it. Fortunately, the agent-side limit can be overridden quickly, as described in the docs at https://docs.appdynamics.com/appd/24.x/24.8/en/application-monitoring/administer-app-server-agents/m.... The controller also has limits at the account level, at the application level, and on the total number of custom metrics that can be registered. If you notice that metrics are not being registered, increase the corresponding limit on the controller.

 

The SIM extension/module failing to initialize

When the MA is started with the SIM flag set to true, it first checks that a license exists. If one does, the MA registers and then makes API calls to the controller: first for the controller server time, and second to check whether the MA is enabled for monitoring in the controller (that is, not disabled). Once these calls succeed, the MA enables the ServerMonitoring extension, which is present by default in every Machine Agent binary. If the ServerMonitoring files are corrupt or have an indentation issue, you may see a warning in the MA logs:

 

 

WARN UriConfigProvider - Could not deserialize configuration at file:<MA_HOME>/extensions/ServerMonitoring/conf/ServerMonitoring.yml
com.fasterxml.jackson.dataformat.yaml.snakeyaml.error.MarkedYAMLException: while scanning for the next token
found character '\t(TAB)' that cannot start any token. (Do not use \t(TAB) for indentation)
.
.
.

 at [Source: (byte[])"# WARNING: Before making any changes to this file read the following section carefully
#
# After editing the file, make sure the file follows the yml syntax. Common issues include
# - Using tabs instead of spaces
# - File encoding should be UTF-8
#
# The safest way to edit this file is to copy paste the examples provided and make the
# necessary changes using a plain text editor instead of a WYSIWYG editor.

 

 

The warning above was captured after the ServerMonitoring.yml file had been edited with tabs instead of spaces, which YAML does not allow for indentation. The general point is that the ServerMonitoring extension must initialize before it can send any data.
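Tab indentation of the kind reported in that warning is easy to miss in an editor. As a minimal sketch, not an official AppDynamics tool, the following Python script lists the lines of a YAML file whose leading whitespace contains a tab. The function name `find_tab_indented_lines` is made up for this illustration, and the script only checks indentation; it does not validate the rest of the YAML syntax.

```python
from __future__ import annotations

import sys
from pathlib import Path


def find_tab_indented_lines(text: str) -> list[int]:
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything stripped from the left of the line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the file to check as the first argument, for example
    # <MA_HOME>/extensions/ServerMonitoring/conf/ServerMonitoring.yml
    path = Path(sys.argv[1])
    # The file header asks for UTF-8, so a decode error here is itself a finding.
    text = path.read_text(encoding="utf-8")
    for number in find_tab_indented_lines(text):
        print(f"{path}:{number}: tab character in indentation")
```

Run it with the path to ServerMonitoring.yml as the only argument. No output means no tab-indented lines were found; the file may still have other syntax problems.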

Checking these two areas should help narrow down why an incomplete set of metrics was sent to the controller by a Machine Agent.
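To run the second check quickly, you can search the Machine Agent log for the deserialization warning quoted earlier. The following is a minimal Python sketch under two assumptions: the marker text is simply the phrase taken from the warning shown in this article, and the log file path is supplied by you, since its location depends on your installation. The name `find_config_warnings` is hypothetical.

```python
import sys

# Phrase taken from the WARN line quoted in this article.
WARNING_MARKER = "Could not deserialize configuration"


def find_config_warnings(log_lines):
    """Yield (line_number, text) for each log line reporting an unreadable config file."""
    for number, line in enumerate(log_lines, start=1):
        if WARNING_MARKER in line:
            yield number, line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of your Machine Agent log file as the first argument.
    # errors="replace" keeps the scan going even if the log contains odd bytes.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for number, text in find_config_warnings(log_file):
            print(f"{number}: {text}")
```

Each hit names the configuration file that could not be read, which tells you which extension failed to initialize.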

Last update: 09-10-2024 10:40 AM