Why are snapshots missing in the Controller?

Updated on 7/30/18

 

This article describes some common issues that can explain why expected snapshots might be missing in the Controller. The issues are usually specific to the monitored application, its business transaction characteristics, and the load on the application.

 

UI Issues

 

UI Limit

The UI displays at most 10,000 snapshots. You cannot fetch additional snapshots once doing so would exceed that limit.

 

Time Range Selection

Be sure the time range selected for display matches the time range of the expected snapshots.

 

Configuration Issues

 

Snapshot Thresholds

By default, AppDynamics collects a snapshot every ten minutes. You can modify the default values for slow, very slow, and stalled requests to generate snapshots more frequently. However, changing any threshold or limit from its default value can add overhead to both the application and the Controller. The predefined limits help ensure minimal impact on the monitored application, so we don't normally recommend adjusting them. See the documentation: Configure Snapshot Periodic Collection Frequency

 

Stall Detection

Disabling Stall Transaction Detection can prevent incomplete transactions from being reported. If it is disabled, re-enable it and select the checkbox to apply the setting to existing business transactions.

 

Long-Running Transactions

For long-running transactions, the threshold values for slow and very slow might need to be set lower than the baseline response times. We have seen cases where snapshots are abandoned when execution time reaches six times the default stall threshold. If snapshots are not captured for requests that run longer than a certain duration, increase the stall threshold.

 

Tuning Threshold Settings

To maximize the value of snapshots or to avoid hitting agent or Controller limits, tune the threshold settings for specific business transactions (BTs) on the Configure -> "Slow Transaction Thresholds" screen. The default snapshot thresholds might not be appropriate for BTs in some scenarios where:

  • There is high load (e.g., in the context of periodic snapshot settings)
  • Average response time is, by default, high in a test environment
  • Slow/stall thresholds and periodic settings are configured at the application level (which applies to all BTs) and are not appropriate for all BTs. For example, it might be valuable to collect one snapshot every 100 executions for only specific BTs. In this case, configure thresholds and sampling at the BT level and optimize snapshot settings for specific BTs.
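
As a rough illustration of the "one snapshot every 100 executions" example, the following sketch models periodic sampling for a single BT. It is not the agent's implementation; the function name and its parameters are hypothetical and exist only to show how a per-BT sampling interval translates into snapshot volume.

```python
def should_take_periodic_snapshot(execution_count: int, every_nth: int = 100) -> bool:
    """Return True on every `every_nth` execution of a business transaction.

    Hypothetical model of per-BT periodic sampling, not agent code.
    """
    if every_nth <= 0:
        raise ValueError("every_nth must be positive")
    return execution_count > 0 and execution_count % every_nth == 0


# Count how many periodic snapshots 1,000 executions of one BT would produce.
snapshots = sum(should_take_periodic_snapshot(n) for n in range(1, 1001))
print(snapshots)
```

With the interval set to 100, a BT executed 1,000 times yields 10 periodic snapshots under this model, which helps estimate whether a BT-level setting will stay within agent and Controller limits.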

 

Data Retention and Limits

 

Data Retention

By default, AppDynamics stores snapshots for up to two weeks (336 hours). This is configurable using the snapshots.retention.period setting. See the documentation: Database Size and Data Retention.
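
Before investigating further, it can help to check whether the snapshot you expect is simply older than the retention window. The sketch below does that arithmetic, assuming the default of 336 hours; the function name is hypothetical, and the Controller's actual purge timing may not line up exactly with this calculation.

```python
from datetime import datetime, timedelta

RETENTION_HOURS = 14 * 24  # two weeks = 336 hours (default stated above)


def is_within_retention(snapshot_time: datetime, now: datetime,
                        retention_hours: int = RETENTION_HOURS) -> bool:
    """True if a snapshot taken at snapshot_time is still inside the retention window."""
    return now - snapshot_time <= timedelta(hours=retention_hours)


now = datetime(2018, 7, 30, 12, 0)
print(is_within_retention(datetime(2018, 7, 20, 12, 0), now))  # snapshot is 10 days old
print(is_within_retention(datetime(2018, 7, 10, 12, 0), now))  # snapshot is 20 days old
```

The first check prints True and the second prints False, since only the 10-day-old snapshot falls within two weeks.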

 

Agent Limits

There are node properties that limit the number of snapshots per minute and the number of elements (calls) in the snapshot call graph:

  • max-snapshots-per-minute - default value is 20
  • max-call-elements-per-snapshot - default value is 5000

The max-snapshots-per-minute limit is shared across the BTs as needed rather than divided evenly among them. It is a per-minute limit, not a per-BT limit.
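
To make the shared per-minute limit concrete, here is a minimal sketch that models it as a single first-come, first-served budget for all BTs. This is an assumption made for illustration: the source only says the limit is distributed across BTs as needed, and the class and method names are hypothetical, not agent APIs.

```python
from collections import Counter


class PerMinuteSnapshotBudget:
    """Hypothetical model: one snapshot budget per minute, shared by all BTs."""

    def __init__(self, max_per_minute: int = 20):
        self.max_per_minute = max_per_minute
        self.current_minute = None
        self.used = 0
        self.taken_by_bt = Counter()

    def try_take(self, bt_name: str, epoch_seconds: int) -> bool:
        minute = epoch_seconds // 60
        if minute != self.current_minute:
            # A new minute has started: reset the shared budget.
            self.current_minute = minute
            self.used = 0
        if self.used >= self.max_per_minute:
            return False  # budget exhausted, the snapshot is skipped
        self.used += 1
        self.taken_by_bt[bt_name] += 1
        return True


budget = PerMinuteSnapshotBudget()
# A busy BT requests 25 snapshots in the first 25 seconds, then a quiet BT requests one.
results = [budget.try_take("checkout", t) for t in range(25)]
late = budget.try_take("login", 30)
print(sum(results), late)
```

Under this model the busy BT takes all 20 slots and the quiet BT's request in the same minute is refused, which is one way a low-volume BT can end up with no snapshots on a heavily loaded node.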

 

You will see the log message below when max-call-elements-per-snapshot is exceeded. If you see it, try increasing the limit by setting the max-call-elements-per-snapshot property on the node.

 

----------
[AD Thread Pool-CGG0] 11 Jul 2014 10:14:06,198 WARN Snapshot - Aborting snapshot started from org.jboss.web.tomcat.filters.ReplyHeaderFilter:doFilter [line -1] because the number of calls [5019] exceeded threshold [5000]
----------
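
If the agent logs are large, a short script can pull out every aborted-snapshot warning along with the call count and the threshold it exceeded. The sketch below is based only on the sample message above; the regular expression and function name are assumptions, and other agent versions may word the message differently.

```python
import re

# Pattern derived from the sample warning; whitespace is normalized first
# so that messages wrapped across several lines still match.
ABORT_PATTERN = re.compile(
    r"WARN Snapshot - Aborting snapshot started from (?P<entry>\S+)"
    r".*?number of calls \[(?P<calls>\d+)\] exceeded threshold \[(?P<limit>\d+)\]"
)


def find_aborted_snapshots(log_text: str):
    """Yield (entry_point, calls, limit) for each aborted-snapshot warning."""
    normalized = " ".join(log_text.split())
    for match in ABORT_PATTERN.finditer(normalized):
        yield match.group("entry"), int(match.group("calls")), int(match.group("limit"))


sample = """[AD Thread Pool-CGG0] 11 Jul 2014 10:14:06,198 WARN Snapshot - Aborting
snapshot started from
org.jboss.web.tomcat.filters.ReplyHeaderFilter:doFilter [line -1] because
the number of calls [5019] exceeded threshold [5000]"""

for entry, calls, limit in find_aborted_snapshots(sample):
    print(entry, calls, limit)
```

Comparing the reported call counts with the threshold shows how far the limit would need to be raised, or whether the call graph is so large that reducing instrumentation is the better option.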

 

Controller Limits

The CONTROLLER_RSD_UPLOAD_LIMIT_REACHED event type indicates that your Controller has reached the limit for the number of RSDs (Request Segment Data) that can be uploaded per minute from this account. Once the limit is reached, no more RSDs (other than certain key ones) are uploaded for that minute. This is a safety feature that avoids overhead on the agent nodes and overloading of the Controller. The request counts (normal, slow, and so on) are still incremented regardless of this warning.

 

You can tune the Controller settings using the rsds.upload.limit.per.min property. This value can be adjusted on a running system (without restarting the Controller), and the new limit takes effect within the next minute. However, we do not recommend doing this without first reviewing your threshold settings.

 

The Controller also has limits on its data buffers. If snapshots are missing and the agent trace logs show no indication that snapshots are being dropped on the agent side, enable the loggers below on the Controller (On-Prem) in the logging.properties file located at <CONTROLLER_HOME>\appserver\glassfish\domains\domain1\config

 

  1. com.appdynamics.SNAPSHOTS_UPLOAD.level = Fine
  2. com.appdynamics.SNAPSHOTS.WRITE.level = Fine

If you see the message below in the Controller server logs, the data buffer limits have been reached in the Controller.

 

[#|2017-11-29T13:54:07.531+0100|FINE|glassfish 4.1|com.appdynamics.SNAPSHOTS.WRITE|_ThreadID=40;_ThreadName=http-listener-1(13);_TimeMillis=1511960047531;_LevelValue=500;ClassName=com.singularity.ee.controller.beans.logging.RepetitiveMessageLogger;MethodName=log;|Snapshot ExitCall Data Buffer full. Dropping data. Buffer Size: 2097152 bytes|#]

 

You can increase the values of the following Controller properties from the admin console. The default value is 8 MB for each property. Note that higher values will affect Controller performance.

 

  1. process.snapshots.buffer.size
  2. snapshots.buffer.size
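
When deciding how far to raise these values, it helps to know the buffer size that was in effect when data was dropped. The following sketch extracts the size from the "Data Buffer full" message shown earlier and converts it to MiB. The pattern and function name are assumptions based on that single sample message.

```python
import re
from typing import List

# Matches the Controller message quoted earlier once line wrapping is removed.
BUFFER_FULL = re.compile(
    r"Data Buffer full\. Dropping data\. Buffer Size: (?P<size>\d+) bytes"
)


def dropped_buffer_sizes_mib(log_text: str) -> List[float]:
    """Return the buffer size, in MiB, reported by each 'Data Buffer full' message."""
    normalized = " ".join(log_text.split())
    return [int(m.group("size")) / (1024 * 1024) for m in BUFFER_FULL.finditer(normalized)]


sample = """controller.beans.logging.RepetitiveMessageLogger;MethodName=log;|Snapshot
ExitCall
Data Buffer full. Dropping data. Buffer Size: 2097152 bytes|#]"""

print(dropped_buffer_sizes_mib(sample))
```

For the sample message this reports a buffer of 2.0 MiB, which gives a starting point for comparing the size in effect against the values configured in the admin console.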

The above changes should allow the snapshots to appear in the UI. However, if a snapshot is very large, it may fail to open and the UI will show a query TIMEOUT exception. This query execution time cannot be changed; it is a Controller-side limit intended to protect Controller performance.

 

To avoid the situation above, enable agent-side tracing to determine how much data, and which data, a BT snapshot is pushing to the Controller. Based on the results, you can fine-tune your application or agent instrumentation to exclude some of the less important data segments from what is sent to the Controller.

 

Node Property Settings

Several node properties relate to transaction snapshots. Confirm that the settings are not affecting snapshot generation in your specific scenario. You can view a list of node properties here: App Agent Node Properties Reference 

 

Business Transaction Limit

If you have reached the business transaction limit, confirm that you have excluded the business transactions that you are not interested in. This ensures that snapshot and call graph limits are not reached because of load on overflow BTs on the affected node.

 

Warnings or Errors in Logs

Search the log files for strings such as "WARN Snapshot - Aborting snapshot" or error messages to see if there are related messages that can provide insight on the issue.

 

File Permissions

If you find the following error in the logs, there could be an issue with agent directory file permissions.

httpSSLWorkerThread-8080-12 20 Dec 2012 03:02:24,082 ERROR TransactionSnapshotService - Error in starting snapshot for Sampled error request
java.lang.UnsupportedOperationException
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.uf.c(uf.java:260)
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.ye.w(ye.java:362)
at com.singularity.ee.agent.appagent.services.snapshot.ac.<init>(ac.java:99)
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.ye.a(ye.java:195)
at com.singularity.ee.agent.appagent.services.snapshot.u.a(u.java:423)
at com.singularity.ee.agent.appagent.services.snapshot.jc.a(jc.java:437)
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.cg.a(cg.java:523)
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.c.a(c.java:391)
at com.singularity.ee.agent.appagent.services.transactionmonitor.common.c.a(c.java:278)
at com.singularity.ee.agent.appagent.services.bciengine.a.onMethodEnd(a.java:60)
at com.singularity.ee.agent.appagent.entrypoint.bciengine.FastMethodInterceptorDelegator.safeOnMethodEndNoReentrantCheck(FastMethodInterceptorDelegator.java:204)
at com.singularity.ee.agent.appagent.entrypoint.bciengine.FastMethodInterceptorDelegator.safeOnMethodEnd(FastMethodInterceptorDelegator.java:156)
at com.sun.xml.ws.server.sei.EndpointMethodHandler.invoke(EndpointMethodHandler.java:275)

 

.NET IIS-Specific Issues

 

HTTP 401 Exceptions

No snapshots are collected for HTTP 401 (unauthorized access) errors. This error is handled before the entry point for any protocol (ASP, .NET, web services, and so on). We capture the error with its details as displayed, but it is not possible to capture a snapshot because the error occurs too early in the IIS pipeline.

 

HTTP 400 Exceptions

IIS rejects a request and returns 400 before we start instrumenting, so we are unable to capture the snapshot in the case of a 400 error. Review more details about debugging 400 errors here.

 

Version history
Revision #: 7 of 7
Last update: 09-14-2018 11:25 AM
Comments

Hi, 

 

I'm using the PHP agent, and my proxy stops working. In the logs I have this message:

Skipping the snapshot for request:2955781218662900689 of bt:510 since the limit for SLOW snapshots has been reached.

 

Do you think using these parameters:

  • max-snapshots-per-minute limit - default value is 20
  • max-call-elements-per-snapshot - default value is 5000

will solve my issue?

 

Thanks 

Michael