05-04-2020 05:46 AM - last edited on 05-04-2020 10:53 AM by Ryan.Paredez
Chaos Engineering & Observability
Estimated Reading time: 4 mins
Steady-state metrics are not the only thing to observe when running your experiments. Observability in chaos engineering extends to the experiments themselves. Observability implies that once you observe deviations in the application's steady-state metrics, you should also be able to correlate them with the events that could have caused those deviations.
The Event Browser in AppDynamics is one place where you can discover the events that agents publish or applications create. Examples include health rule violations, application restarts, JVM crashes, and any custom events that a developer chooses to publish.
Gremlin
Gremlin is a popular tool for bringing chaos engineering practices and culture into your organization. It can help you design, run, and analyze chaos engineering experiments and collaborate on them. You can read more about this platform on its website.
Use case
If you design and run your chaos experiments using Gremlin's platform and observe your application's steady-state metrics using AppDynamics, there is no straightforward way to observe both the experiment and the metrics in a single monitoring pane. We solve this problem using AppDynamics Events.
Publishing Gremlin Experiment into AppDynamics Events
Once initiated, a Gremlin experiment goes through various stages that describe its lifecycle. When an experiment is in the 'RUNNING' stage, the attack is being run on the target. You can query the Gremlin attack API and, if the attack is found to be RUNNING, publish a corresponding event in AppDynamics. Here is how to do it with a short Python 3.6 snippet.
Poll Gremlin Experiment Status
import json
import time

import requests


def pollExperimentStatus(guid):
    url = 'https://api.gremlin.com/v1/attacks/' + str(guid)
    headers = {"Authorization": "Key xxxxxxxx"}
    i = 0  # number of events raised, so the event is published only once
    try:
        # Poll experiment status
        expdata = requests.get(url, headers=headers)
        # Identify STAGE from the response
        expdatajson = json.loads(expdata.text)
        stage = expdatajson['stage']
        while stage != "Successful":
            time.sleep(2)  # poll every 2 seconds
            # Identify STAGE from the response
            expdata = requests.get(url, headers=headers)
            expdatajson = json.loads(expdata.text)
            stage = expdatajson['stage']
            if stage == "Running" and i == 0:
                raiseAppdEvent(guid)
                i = i + 1
            ...
            ...
            ...
    except Exception as err:
        print(f'Error occurred: {err}')
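The loop above exits only when the stage becomes "Successful", so an attack that ends in any other state would be polled indefinitely. The following is a minimal sketch of a bounded variant, not the author's implementation. It assumes the status lookup is passed in as a callable (`fetch_stage`, a hypothetical name) and that the caller supplies the set of terminal stage names. Only "Running" and "Successful" appear in the original code, so any further stage names should be checked against the Gremlin API before use.

```python
import time
from typing import Callable, Iterable, Optional


def poll_until_done(
    fetch_stage: Callable[[], str],
    on_running: Callable[[], None],
    terminal_stages: Iterable[str] = ("Successful",),
    interval_s: float = 2.0,
    max_polls: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll fetch_stage() until a terminal stage or max_polls is reached.

    Calls on_running() once, the first time the stage is "Running".
    Returns the terminal stage, or None if the poll budget ran out.
    """
    terminal = set(terminal_stages)
    notified = False
    for attempt in range(max_polls):
        stage = fetch_stage()
        if stage == "Running" and not notified:
            on_running()  # e.g. publish the AppDynamics event
            notified = True
        if stage in terminal:
            return stage
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next poll
    return None
```

In this sketch, `fetch_stage` could wrap the `requests.get(...)` call and return the `stage` field, and `on_running` could call `raiseAppdEvent(guid)`. Checking the stage before sleeping also means an attack that is already running on the first poll still triggers the event.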
Publish Event in AppDynamics
def raiseAppdEvent(guid):
    try:
        appdConfig, a_range = getConfig('appdSettings')  # custom function to get url, etc.
        eventsURL = appdConfig['eventsURL']
        token = getAppdToken()  # custom function for OAuth token
        eventsHeaders = {'Authorization': 'Bearer {}'.format(token)}
        # Pass the query as params so requests URL-encodes the spaces in the summary
        eventsParams = {
            'severity': 'INFO',
            'summary': 'Gremlin Experiment {}'.format(guid),
            'eventtype': 'CUSTOM',
            'customeventtype': 'Gremlin',
        }
        response = requests.post(eventsURL, headers=eventsHeaders, params=eventsParams)
        response.raise_for_status()
    except Exception as err:
        print(f'ERROR: event : {err}')
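The event summary contains spaces, so the query string has to be URL-encoded before it reaches the server. As a small illustration of what the encoded form looks like, here is a sketch using the standard library's `urlencode`. The function name `build_event_query` and the sample GUID are hypothetical; the four query parameters are the ones used in the code above. `requests` performs equivalent encoding when the values are passed through its `params` argument.

```python
from urllib.parse import urlencode


def build_event_query(guid: str, severity: str = "INFO") -> str:
    """Return the URL-encoded query string for a custom Gremlin event."""
    params = {
        "severity": severity,
        "summary": f"Gremlin Experiment {guid}",
        "eventtype": "CUSTOM",
        "customeventtype": "Gremlin",
    }
    return urlencode(params)


# Hypothetical GUID, for illustration only.
print(build_event_query("1234-abcd"))
# → severity=INFO&summary=Gremlin+Experiment+1234-abcd&eventtype=CUSTOM&customeventtype=Gremlin
```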
AppDynamics complements SRE practice
This simple mechanism of querying experiment data (in this case, the experiment GUID) from Gremlin and using it to publish an event in AppDynamics is helpful when observing your application during experiments.
By registering your experiment details in AppDynamics, you can use a single monitoring pane to observe your experiments. Beyond observing, you can create policies that trigger custom actions, such as notifying your service owners about the start and end of an experiment, leveraging your existing monitoring infrastructure and investment.
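To notify on both the start and the end of an experiment, a separate event is needed for each transition. One way to derive those events from successive polled stages is sketched below, under stated assumptions: the function name `events_for_transition` and the summary wording are hypothetical, and the set of terminal stages defaults to "Successful" because that is the only terminal value the code above uses.

```python
from typing import Iterable, List, Optional


def events_for_transition(
    previous: Optional[str],
    current: str,
    guid: str,
    terminal_stages: Iterable[str] = ("Successful",),
) -> List[str]:
    """Return event summaries to publish when the stage changes."""
    terminal = set(terminal_stages)
    summaries = []
    # Start event: the stage has just become "Running".
    if current == "Running" and previous != "Running":
        summaries.append(f"Gremlin Experiment {guid} started")
    # End event: the stage has just entered a terminal state.
    if current in terminal and previous not in terminal:
        summaries.append(f"Gremlin Experiment {guid} ended ({current})")
    return summaries
```

Each returned summary could then be published as its own custom event, so that an AppDynamics policy can match on it and trigger the notification.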
Happy Experimenting!