Time based health rules not possible?

Adventurer

Why doesn't AppD have time settings for health rules?

Examples:

"trigger alert only if threshold is breached every polling minute for X minutes"

"trigger alert if threshold is breached X number of times in the last X minutes"
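For illustration, the two example rules above could be sketched roughly as follows. This is a minimal Python sketch of the logic only, not AppDynamics configuration; the function names, the one-sample-per-polling-minute assumption, and the sample values are all hypothetical.

```python
def sustained_breach(samples, threshold, minutes):
    """True if each of the last `minutes` one-minute samples exceeds the threshold."""
    recent = samples[-minutes:]
    # Require a full window so a short history cannot trigger the rule.
    return len(recent) == minutes and all(v > threshold for v in recent)


def breach_count(samples, threshold, window, min_breaches):
    """True if at least `min_breaches` of the last `window` samples exceed the threshold."""
    recent = samples[-window:]
    return sum(v > threshold for v in recent) >= min_breaches


# Hypothetical per-minute average response times (ms) for one node.
single_spike = [120, 130, 900, 125, 128]
sick_node = [120, 650, 700, 720, 690]

spike_sustained = sustained_breach(single_spike, threshold=500, minutes=3)
sick_sustained = sustained_breach(sick_node, threshold=500, minutes=3)
spike_counted = breach_count(single_spike, threshold=500, window=5, min_breaches=3)
sick_counted = breach_count(sick_node, threshold=500, window=5, min_breaches=3)
```

With these made-up numbers, the single spike satisfies neither rule, while the continuously slow node satisfies both.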

 

For business transactions, I'm using the standard baseline settings for average response time health rules, but my business transactions are filling up my inbox with violations that are not really useful. I can't turn these rules off, or I'll miss any serious issues that show up (a continuously sick node or worse).

 

I need health rules to trigger only if the average response time for a specific node crosses a threshold AND also stays above that threshold for x minutes/seconds. There is no need to alert on a single transaction that succeeds but takes a long time when 99.9% are within limits. I can't find a simple way to filter this noise out without watering down detection across the larger pool of nodes.

 

The only way I can see to do this is to play around with the baseline, which then raises the ART and reduces effectiveness for large scale problems. Please enlighten me if I'm missing something.

 

This time-based option is standard in most other APMs I've used. I see you can build custom rules, but there seems to be little documentation on how to do this (I can't find any examples).

 

TIA.

3 REPLIES
Architect

Re: Time based health rules not possible?

Hi Steward

 

To your question:

I need health rules to only trigger if average response time, for a specific node, crosses a threshold AND also stays above a threshold for x minutes/seconds. There is no need to alert out for a single transaction that succeeds but takes a long time, when 99.9% are within limits. I can't find a simple way to filter this noise out without watering down detection across the larger pool of nodes.

>>>

You can achieve this by using the configuration "use the last xx minutes of data"; see point 5 at this link:

https://docs.appdynamics.com/display/PRO44/Configure+Health+Rules#ConfigureHealthRules-UsetheHealthR...

 

Let me know if the above helps.

 

Also, in the policy, have you turned on or checked the box for Slow transactions under Other events?

If so, you may want to disable it, as it will send out an email for slow transactions every minute.

 

Thanks,

Gurmit.

Adventurer

Re: Time based health rules not possible?

Thanks Gurmit.

 

This solution does not work, though. It would still trigger alerts regardless of whether the evaluation time is 1 minute, 30 minutes, or 360 minutes. It only takes one transaction going beyond the threshold momentarily during the evaluation time window for an alert to be triggered. The only way I can see to prevent this is to use the BT average across all nodes (rather than "per node", where a spike on any node sends an alert). The downside is that 1 sick node is likely to be hidden by the other 50 nodes in the pool. There doesn't seem to be a nice middle ground.
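To put rough numbers on that downside, here is a small Python sketch with made-up values. It assumes every node handles the same call volume, so the pooled average is a plain mean; a real pool would weight by calls per node. The 200 ms, 2000 ms, and 500 ms figures are hypothetical.

```python
# Hypothetical average response times (ms): 50 healthy nodes and 1 sick node.
healthy_nodes = [200.0] * 50
sick_node = 2000.0
node_arts = healthy_nodes + [sick_node]

threshold_ms = 500.0  # hypothetical alert threshold

# Pooled average across all nodes, assuming equal call volume per node.
pooled_art = sum(node_arts) / len(node_arts)

pooled_breach = pooled_art > threshold_ms          # rule on the pool-wide average
per_node_breach = max(node_arts) > threshold_ms    # rule evaluated per node
```

Here the pooled average works out to about 235 ms, well under the 500 ms threshold, even though one node is ten times slower than the rest. The per-node check does catch it, which is exactly the trade-off described above.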

 

These are specific BTs, not the default BT health alert that includes all transactions. So, I cannot select/un-select slow transactions.

 

Thanks.

Creator

Re: Time based health rules not possible?

I'm having a similar issue with a different metric. A quick 1-minute spike pushes the average more than 3 standard deviations above the baseline and generates an alert. By the time I can check out the alert, the spike is gone and everything is back to normal. The application is running normally.
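As a rough illustration of how a single minute can cross a 3-standard-deviation bound, here is a minimal Python sketch. The baseline history and the spike value are invented, and the baseline is assumed to be a simple mean and population standard deviation of recent samples, which may differ from how the product computes its baselines.

```python
import statistics

# Hypothetical one-minute metric values that make up the baseline.
history = [100, 102, 98, 101, 99, 100, 103, 97]

baseline_mean = statistics.mean(history)
baseline_std = statistics.pstdev(history)  # population standard deviation


def z_score(value):
    """Number of baseline standard deviations between `value` and the baseline mean."""
    return (value - baseline_mean) / baseline_std


spike_z = z_score(140)   # a single one-minute spike
normal_z = z_score(102)  # a typical minute

spike_alerts = spike_z > 3
normal_alerts = normal_z > 3
```

With a baseline this tight, even a brief spike lands far beyond 3 standard deviations, while the surrounding minutes stay well inside the bound.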

 

Is there any solution to this?