
Unable to understand transaction snapshot

Ahmed.Dhanani
New Poster

I recently started using AppDynamics. My primary goal is to find the bottlenecks in my existing code base when my application is under high load. I am running the PyAgent on one of the nodes. I am using Gunicorn as my app server, which sits behind NGINX. I spin up my service with the following command:

ExecStart=venv/bin/pyagent run -c /etc/appdynamics.cfg -- venv/bin/gunicorn --bind 127.0.0.1:18000  -w 2 --worker-class gevent --worker-connections 500 foree:app --log-level debug
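
(For context, the file passed with -c above is the agent's INI-style configuration. A minimal sketch of what it typically contains is below; the values are placeholders rather than my real settings, and the exact key names should be checked against the Python agent documentation for your version.)

# illustrative placeholder values only
[agent]
app = MyFlaskApp
tier = web
node = node-1

[controller]
host = mycontroller.example.com
port = 8090
ssl = off
account = customer1
accesskey = REDACTED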

When I run my load test using Locust, I see a number of transactions in the "very slow" category. When I look further into the transaction snapshot, I see a node there called {request}- (as shown in the image). Surprisingly, this transaction took 27s. I am pretty sure this is related to gevent, but I am not sure exactly what is taking so much time. Could this be time the transaction spent waiting (blocked) on some kind of I/O? Any pointers for further investigation would be highly appreciated.

 

[Attachment: snap.PNG]
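
For reference, the load is generated with a locustfile roughly like the sketch below (assuming a recent Locust release; the user class name and endpoint path are placeholders, not my real test):

# locustfile.py -- minimal sketch; AppUser and /health are illustrative placeholders
from locust import HttpUser, between, task

class AppUser(HttpUser):
    # Each simulated user pauses briefly between requests.
    wait_time = between(0.1, 0.5)

    @task
    def hit_endpoint(self):
        # Hits the service on the target host given via --host.
        self.client.get("/health")

It is run with something like locust -f locustfile.py --host http://<service-host> and the user count is ramped up from the Locust web UI.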

3 REPLIES

Kyle.Furlong
AppDynamics Team (Retired)

Hi Ahmed,

 

This is indeed because of gevent. The agent can only provide full snapshots when using Gunicorn's sync worker type. What you are seeing is a best-effort attempt to track the greenlets with our current algorithm, which results in a partial call graph.

 

I would suggest trying the sync worker class for better results.
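
For example, adapting the ExecStart line from your unit file, something like this should do it (sync is Gunicorn's default, so --worker-class sync can also simply be omitted; --worker-connections is dropped because it only applies to the async worker types):

ExecStart=venv/bin/pyagent run -c /etc/appdynamics.cfg -- venv/bin/gunicorn --bind 127.0.0.1:18000 -w 2 --worker-class sync foree:app --log-level debug

Note that each sync worker handles one request at a time, so you may need to raise -w to keep enough concurrency for your load test.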

Thanks,
Kyle Furlong, Technical Lead (C++ and Dynamic Languages)


Ahmed.Dhanani
New Poster

Hello Kyle,

 

Thanks for getting back on this. What I want to understand is why that particular node is showing a massive 27s. According to my load test results, some requests took as long as 40s, but I am unable to infer anything from this. What exactly is the bottleneck here? Is the agent "predicting" that the overall request took 27s but somehow unable to identify where exactly those 27s went (i.e., which component)?

To put my question simply: should I infer that some component (outside of my application code) took around 27s before calling _handle_and_close_when_done and that the later 300ms were due to some blocking I/O, or should I infer that some components in my own code (the Flask application) are causing this massive 27s response time?
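
In the meantime, one thing I am considering is enabling gevent's built-in monitor thread to see whether something in my own code is blocking the event loop. A minimal sketch of the idea is below (assuming gevent >= 1.3; the busy_loop function is just an illustration, and with Gunicorn it is probably easier to enable this via the GEVENT_MONITOR_THREAD_ENABLE and GEVENT_MAX_BLOCKING_TIME environment variables than in code):

# Sketch: ask gevent to report greenlets that block the event loop (gevent >= 1.3).
import gevent
from gevent import monkey

gevent.config.monitor_thread = True    # start the monitor thread with the hub
gevent.config.max_blocking_time = 0.1  # report if the loop is blocked > 100 ms

monkey.patch_all()  # make blocking stdlib calls cooperative

def busy_loop():
    # CPU-bound work never yields to the hub, so the monitor should flag it.
    total = 0
    for i in range(10 ** 7):
        total += i
    return total

if __name__ == "__main__":
    gevent.spawn(busy_loop).join()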

 

Thanks,

Ahmed

Kyle.Furlong
AppDynamics Team (Retired)

Hi Ahmed,

 

It's hard to say with the gevent worker. Please try the sync worker, and the snapshots should be clearer.

Thanks,
Kyle Furlong, Technical Lead (C++ and Dynamic Languages)
