Cisco AppDynamics Community

Anonymous · 02-02-2015

Sometimes errors may appear in an appserver log, but not in AppDynamics. Or you may run your own specific test suites and see that known errors are not being detected by AppDynamics. Here are some things to check.

1. Confirm Logging Framework is Supported

2. Confirm Error Limits Were Not Hit

3. Confirm Configuration for Ignored Exceptions, Errors, and Loggers

4. Missing HTTP Error Codes

1. Confirm Logging Framework is Supported

Your appserver may be using an unsupported logging framework. AppDynamics App Agent for Java supports the following logging frameworks:

Log4j 2
java.util.logging
Simple Logging Facade for Java (SLF4J) (new in 4.0)
Logback (new in 4.0)
Also see Java Supported Environments for the latest support.

AppDynamics App Agent for .NET supports the following logging frameworks:

Log4Net
NLog
Also see .NET Supported Environments

Scope of Support

Version 4.0 added support for SLF4J, for Logback (which implements SLF4J under the covers), and for Log4j 2.

The support extends to the following features of these logging libraries:

SLF4J, Logback

We support instrumenting classes that implement the SLF4J Logger interface. Logback implements SLF4J natively, so Logback is supported as well.

Supported Methods:

Logger.error(String)
Logger.error(Marker, String)
Logger.error(String, Throwable)
Logger.error(Marker, String, Throwable)

We do not support SLF4J error calls that pass argument objects, for example error(java.lang.String, java.lang.Object...).
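One possible way to stay within the supported overloads, sketched here under assumptions, is to format the message yourself and pass a plain String (plus the Throwable, if any) to the logger. ErrorLog below is a hypothetical stand-in that mirrors only the error(String) and error(String, Throwable) shapes listed above; it is not the SLF4J API, and the "{}" substitution is a simplified imitation of SLF4J-style placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class SupportedErrorCalls {

    /** Hypothetical stand-in mirroring two of the supported overloads listed above. */
    interface ErrorLog {
        void error(String msg);
        void error(String msg, Throwable t);
    }

    /** Replaces each "{}" in the template with the next argument, left to right. */
    static String format(String template, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders: ignore the extras
            }
            out.append(template, from, at).append(arg);
            from = at + 2;
        }
        return out.append(template.substring(from)).toString();
    }

    /** In-memory ErrorLog used only to make the sketch runnable. */
    static class RecordingLog implements ErrorLog {
        final List<String> lines = new ArrayList<>();

        @Override
        public void error(String msg) {
            lines.add(msg);
        }

        @Override
        public void error(String msg, Throwable t) {
            lines.add(msg + " [" + t.getClass().getSimpleName() + "]");
        }
    }

    public static void main(String[] args) {
        RecordingLog log = new RecordingLog();
        // Instead of error("Order {} failed for {}", id, user), which uses Object... params:
        log.error(format("Order {} failed for {}", 42, "alice"));
        log.error(format("Order {} failed", 42), new IllegalStateException("boom"));
        log.lines.forEach(System.out::println);
    }
}
```

With a real logger, the same idea means building the final message first and then calling one of the supported error overloads with it. If that is not practical, a custom logger configuration is the route the article describes.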

Log4j 2.0

We instrument out of the box anything that implements the log4j2 Logger interface. Specifically, we support:

error(Marker marker, Message msg)
error(Marker marker, Object message)
error(Marker marker, String message)
error(Message msg)
error(Object message)
error(String message)
error(Marker marker, Message msg, Throwable t)
error(Marker marker, Object message, Throwable t)
error(Marker marker, String message, Throwable t)
error(Message msg, Throwable t)
error(Object message, Throwable t)
error(String message, Throwable t)

Also, fatal variants of all of the above are supported.

Note that we don't support logger.logMessage(), log(), or any calls with Object... params (that is, an Object[] argument). The log() and logMessage() methods from ExtendedLogger are likewise not supported.

For additional logging support

The solution is to configure a custom logger. See Configure a Custom Logger.

2. Confirm Error Limits Were Not Hit

Agent Error Limit

The agent can register at most 5000 metrics per node and at most 500 ADDs (Application Diagnostic Data, which includes async threads, errors and exception registration, snapshots, and so on).

If this limit is reached and the agent attempts to create metrics beyond it, you see an AGENT_METRIC_REG_LIMIT_REACHED alert in the event list. You can increase this default limit, but that might increase overhead. Hitting this limit can indicate a misconfiguration in your application. Hitting this limit together with a similar limit in the Controller can indicate that you have hit the business transaction or backend limits, and you may need to change the default discovery rules.

What is a metric?
A metric is an identifier used to uniquely identify a particular statistic.
For example:

Application Infrastructure Performance|Author|JVM|Memory|Heap|Committed (MB)
Application Infrastructure Performance|Author|JVM|Memory|Heap|Used %
Application Infrastructure Performance|Author|JVM|Process CPU Usage %

All of the above are individual metrics registered from the node, against which the corresponding statistics data is collected and reported to the controller. At any particular point in time, the metric name remains the same but the value changes and that value is captured and reported.
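As a rough illustration of the name-versus-value distinction, the sketch below keys a map by the full metric path and overwrites the value on each report. It is a toy model, not the agent's internal data structure: the class and method names are hypothetical, and the reported values are made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetricPathDemo {

    /** Splits a pipe-delimited metric path into its segments. */
    static List<String> segments(String metricPath) {
        return Arrays.asList(metricPath.split("\\|"));
    }

    public static void main(String[] args) {
        String heapUsed =
            "Application Infrastructure Performance|Author|JVM|Memory|Heap|Used %";

        // The path is the identity; each report only replaces the value.
        Map<String, Double> latest = new LinkedHashMap<>();
        latest.put(heapUsed, 41.0); // illustrative value
        latest.put(heapUsed, 57.5); // a later report for the same metric

        System.out.println(segments(heapUsed));
        System.out.println(latest.size() + " metric, latest value " + latest.get(heapUsed));
    }
}
```

Two reports against the same path still count as one registered metric, which is why the limit discussed below depends on the number of distinct names rather than on the volume of data.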

This concept of a metric is internal to AppDynamics. However, it is helpful to understand how it works because of the self-imposed limits on the number of metrics that can be discovered. The limits help to minimize the AppDynamics footprint and overhead impact on an application. One limit is the maximum number of metrics that the agent creates. Once the limit is reached, the agent does not create new metrics.

Q: What is the impact of exceeding this 5000 limit?
A: This limit is per agent. Once the limit is reached, no new metrics are created, so no new activity is tracked. Any additional endpoints that are discovered are not tracked.

Q: If that is true, does restarting the agent from the console reset this limit, so that new endpoints get monitored and old defunct ones that counted toward the 5000 limit are dropped?
A: Once a metric is registered, it remains registered for that agent whether or not there is load on that metric.

For example, once metrics corresponding to an HTTP backend are registered, those metrics always count against the limit, whether or not there are calls to that backend. In a case such as this, you can increase the maximum metric limit, or you can delete the backends that are not being used to free up those metrics. You may also need to revise your backend configuration to avoid registering so many backends.

Once you increase the limit (or free up metrics by deleting unneeded backends or components), the new endpoints are not guaranteed to become visible, because other statistics may be detected first and use up the added metric capacity.

For example, if there are async calls in the application, but the agent was not able to create metrics for them due to the limit being reached, once there is metric capacity, those async-call-related metrics might be created first before the new endpoints are detected.
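The behavior described above can be modeled with a small capped registry. This is a conceptual sketch only: CappedRegistry, its methods, the metric names, and the tiny limit of 3 are hypothetical and chosen for illustration. The real agent limit discussed here is 5000, and the agent's internals are not shown in this article.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CappedRegistry {
    private final int limit;
    private final Set<String> registered = new LinkedHashSet<>();

    CappedRegistry(int limit) {
        this.limit = limit;
    }

    /** Registers a metric name if there is capacity; returns false once the limit is hit. */
    boolean register(String name) {
        if (registered.contains(name)) {
            return true; // already counted, no new slot used
        }
        if (registered.size() >= limit) {
            return false; // limit reached: nothing new is tracked
        }
        return registered.add(name);
    }

    /** Frees a slot, for example after deleting an unused backend. */
    boolean remove(String name) {
        return registered.remove(name);
    }

    int size() {
        return registered.size();
    }

    public static void main(String[] args) {
        CappedRegistry r = new CappedRegistry(3);
        r.register("backend-A|Calls per Minute");
        r.register("backend-B|Calls per Minute");
        r.register("backend-C|Calls per Minute");
        System.out.println(r.register("new-endpoint|Calls per Minute")); // false: limit reached

        // Idle metrics still occupy their slots until they are removed.
        r.remove("backend-C|Calls per Minute");

        // Whichever metric shows up first takes the freed slot.
        System.out.println(r.register("async-call|Calls per Minute"));   // true
        System.out.println(r.register("new-endpoint|Calls per Minute")); // false again
    }
}
```

In this model the freed slot goes to the async-call metric because it arrives first, so the new endpoint is still not tracked even though capacity was released.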

Solution: Revise Configuration

Verify that you are not exceeding other limits such as backend limits and BT limits. Hitting the metric limit can be a warning sign of a configuration problem specific to your environment.

For example, HTTP backends are discovered automatically using the Port and Host properties. If the configuration were changed to use the entire URL, you might rapidly exceed the backend limits and cause the metric limit to be hit, when the real problem is the HTTP backend configuration.

Solution: Increase Limit

The limit of 5000 suffices much of the time, but it can be increased if you believe that calls are missing or that some functionality is not being captured by the agent. On average, agents register 800 metrics across applications. The lower end is 300, and some applications produce 1500 metrics per agent. If agents need more than 5000 metrics, something else is often wrong, and raising the limit only obscures the underlying problem.

To increase the limit, see the Metrics Limits documentation.

NOTE: Before increasing the metric limit, be sure you have verified no other limits are being hit.

Solution: Modify Default Error Detection Rules

If there are errors or exceptions that are well known and don't need to be monitored, you can exclude them from detection and free up metric capacity. Review the documentation here: Configure Exceptions and Log Messages to Ignore.

Controller Metric Limit

There is a similar metric limit at the controller level. When this limit is reached, the controller issues the CONTROLLER_ERROR_ADD_REG_LIMIT_REACHED event.

Solutions

The recommended solution is to fine-tune the default error detection rules, for example exclude the ones you're not interested in. Review the documentation here: Configure Exceptions and Log Messages to Ignore.
Increase the default limit to 4000, for example:

a. Log in to the admin page at http://<controller ip>:<port>/controller/admin.html
b. Enter the root password (the default value is changeme)
c. Change the value of the property 'error.registration.limit' accordingly

Note: Increasing the limit incurs additional overhead so be sure to verify that you need to monitor all the discovered errors and exceptions.

3. Confirm Configuration for Ignored Exceptions, Errors, and Loggers

If exclude rules are misconfigured, exceptions might be missed. Review the error detection configuration in your application: in the Controller UI, select Configure -> Instrumentation, then select the Error Detection tab. For more details, see Monitor Errors and Exceptions.

4. Missing HTTP Error Codes

AppDynamics reports error codes when the sendError method is used to report the error code. However, some implementations of HttpServletResponse send some HTTP errors using setStatus. In this case, set the capture-set-status node property (Java agent only) to true to capture these HTTP errors. For more details, see the node property reference documentation: App Agent Node Properties Reference.
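The difference between the two code paths can be pictured with a toy model. Everything in the sketch below is hypothetical: ErrorCodeRecorder is not an AppDynamics or servlet class, its two methods only borrow the names sendError and setStatus, the boolean stands in for the capture-set-status node property, and treating codes of 400 and above as errors is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class StatusCaptureDemo {

    /** Hypothetical observer that records HTTP error codes, loosely modeling the text above. */
    static class ErrorCodeRecorder {
        final boolean captureSetStatus; // stands in for the capture-set-status node property
        final List<Integer> seen = new ArrayList<>();

        ErrorCodeRecorder(boolean captureSetStatus) {
            this.captureSetStatus = captureSetStatus;
        }

        /** Codes reported through sendError are always observed in this model. */
        void sendError(int code) {
            seen.add(code);
        }

        /** Codes set through setStatus are observed only when the flag is on. */
        void setStatus(int code) {
            if (captureSetStatus && code >= 400) { // assumption: 4xx and 5xx count as errors
                seen.add(code);
            }
        }
    }

    public static void main(String[] args) {
        ErrorCodeRecorder defaults = new ErrorCodeRecorder(false);
        defaults.sendError(500);
        defaults.setStatus(404); // missed with the flag off
        System.out.println(defaults.seen);

        ErrorCodeRecorder capturing = new ErrorCodeRecorder(true);
        capturing.sendError(500);
        capturing.setStatus(404); // now recorded
        System.out.println(capturing.seen);
    }
}
```

If an error code you expect is missing, checking whether the application or framework reports it through setStatus rather than sendError is a reasonable first step before changing the node property.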

What should I do if expected errors or exceptions are not showing up?
