

machine agent down health rule


Hello,

In a recent prod incident, a server crashed and the sysadmins rebooted it, but they did not restart our application on it because they did not know how.

The machine agent on this app is configured to start as part of the application's startup, so when the app stays down, the machine agent stays down too.

 

We want to catch this scenario, where the app remained down for an extended period of time.

 

We are thinking we could use a "machine agent down" health rule.

 

Am I on the right path?

 

Also, how can I set up a health rule for just "machine agent down"?

 

I tried the App_availability metric, but it reports whenever the app agent is not responding, which happens every so often on our app; app agent availability is at 95%.

I would like a health rule that evaluates only machine agent availability and fires when it is at 0%, when ideally it is 100%.
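
To make the condition concrete, here is a minimal sketch of the logic I have in mind. It is plain Python and not an AppDynamics API: the function name `agent_down_for`, the per-minute sample list, and the `down_threshold` parameter are all hypothetical. The idea is to alert only when machine agent availability has stayed at 0% for a sustained window, so that brief dips like the ones we see on the app agent do not trigger it.

```python
from typing import Sequence


def agent_down_for(
    samples: Sequence[float],
    min_minutes: int,
    down_threshold: float = 0.0,
) -> bool:
    """Return True if the most recent `min_minutes` samples are all "down".

    `samples` is assumed to hold one availability percentage per minute,
    oldest first (100 = agent reporting, 0 = agent not reporting).
    """
    if min_minutes <= 0:
        raise ValueError("min_minutes must be positive")
    if len(samples) < min_minutes:
        # Not enough history yet to say the agent has been down that long.
        return False
    recent = samples[-min_minutes:]
    return all(value <= down_threshold for value in recent)


# Hypothetical usage: availability samples for the last five minutes.
print(agent_down_for([100, 100, 0, 0, 0], min_minutes=3))
print(agent_down_for([100, 95, 100, 0, 100], min_minutes=3))
```

In this sketch the first call reports a sustained outage and the second does not, because a single missed minute is not enough to fill the window.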

 

Please advise.
