Race Condition in PHP Agent

ulrich.eckhardt
Adventurer

I'm currently investigating a problem with the PHP Agent, which fails occasionally. The two errors that occur are

  • Segmentation fault (core dumped)
  • PHP Warning: Unsupported exit call type (missing delegate) in /srv/www/...

A colleague of mine already contacted AppDynamics, but unfortunately he is on holiday, so I can't follow up on that discussion. Their suggestion was to enable trace-level logging, which I did and which allowed me to capture some information about these issues.

 

The second case seems to be caused by a timeout from the service that provides the configuration (proxy agent?), "[config.ZMQConfigTransport] timed out (2000us) waiting for config. Actual wait time: 1846us". That's pretty short, in particular since the machine only runs on a single CPU.

 

The first case is harder. Comparing a successful call with a failed one doesn't turn up any significant differences. The last line in the log is "[agent] started API exit call of type EXIT_HTTP"; after that, the segfault kills the process. Running the program in a debugger showed that the topmost frames are in libstdc++, in std::string::assign() in particular.

 

Some system details...

  • Linux 64 bit on an AWS-hosted VM
  • AppDynamics versions 4.2 and 4.3 both show the same issue.
  • No webserver installed, this is a pure CLI application. It happened on a production machine with a webserver, too, after which we isolated the problem to a single machine.
  • Trace-level logging seems to reduce the likelihood of the problem occurring.

 

So... any suggestions on what to try next? Is there even a bug-tracking system where I could search for similar issues?

 

Thanks everyone!

 

Uli

 

Ayush.Ghosh
AppDynamics Team

Hi,

 

About the two issues you are facing:

  • Segmentation fault (core dumped)
    • Do you have a core dump?
    • Is OpCache enabled? If yes, could you try disabling it and check whether the issue can still be reproduced?
  • PHP Warning: Unsupported exit call type (missing delegate) in /srv/www/...
    • Are you getting this continuously throughout the application lifecycle?
    • While the application is starting, there is a delay before the agent gets its config from the Controller. This is when you get this error. It should go away once the agent has received the config from the Controller.
    • The 2000us wait for the config is between the PHP process and the Proxy. For an IPC call, that is not too short a timeout.

 

Would it be possible to share the trace-level logs and the core dump, if any? I would then send you SFTP credentials for the upload.

 

Please don't attach them here, as they might contain sensitive information.

 

Thanks

Ayush

 

ulrich.eckhardt
Adventurer

Hello Ayush,

 

Concerning the segmentation fault:

  • I don't have a core file yet, but I could surely produce one.
  • OpCache was not enabled ("opcache.enable_cli => Off => Off"). Even when disabling the module as a whole, I can still produce the segfaults.
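
For anyone trying to capture such a sporadic crash, here is a rough sketch of a harness that runs a CLI command many times, counts the runs killed by SIGSEGV, and optionally raises the core-size limit so that a core file can be written. It is not part of the agent or of PHP. The function names and the command are hypothetical, and it assumes a POSIX system.

```python
import collections
import resource
import signal
import subprocess


def allow_core_dumps():
    """Raise the soft core-size limit to the hard limit (POSIX only)."""
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def classify(returncode):
    """Map a subprocess return code to 'ok', 'segfault' or 'error'."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:  # negative value: killed by that signal
        return "segfault"
    return "error"


def run_repeatedly(cmd, runs, core=False):
    """Run cmd `runs` times and count how each run ended."""
    outcomes = collections.Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            # Runs in the child just before exec, so only the child is affected.
            preexec_fn=allow_core_dumps if core else None,
        )
        outcomes[classify(proc.returncode)] += 1
    return outcomes
```

A call such as `run_repeatedly(["php", "script.php"], runs=200, core=True)` (with a placeholder script name) would give a rough failure rate. Whether and where the core file is written still depends on the hard limit and the system's core pattern settings.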

Concerning the missing delegate:

  • The error only occurs sporadically. According to the logs, the response usually comes after less than 2ms. Often, the timestamp doesn't show any delay at all, so it's less than 1ms then.
  • Concerning the timeout, consider a single-CPU machine. When the request is sent, the scheduler needs to switch to the proxy. If any other processes are in the ready state, those processes will get a timeslice first, which can easily exceed a millisecond. The same applies to a multicore machine under load, btw. Is there a way to tweak that timeout? Also, what are the side effects of the timeout? If it's just one sample not being reported and that doesn't happen often, I could live with that or maybe tweak the scheduler settings.
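
To illustrate the scheduling argument, here is a minimal sketch of a request/reply exchange with a 2000us wait budget. It uses a plain socket pair and a thread as a stand-in for the proxy, not the agent's actual ZMQ transport. All names are hypothetical, and the artificial delay merely models the time the responder spends waiting for a timeslice.

```python
import select
import socket
import threading
import time

TIMEOUT_US = 2000  # the wait budget quoted in the agent log


def reply_once(conn, delay_s):
    """Stand-in for the proxy: answer one request after delay_s seconds."""
    request = conn.recv(64)
    time.sleep(delay_s)  # models time spent waiting for a timeslice
    conn.sendall(request)
    conn.close()


def request_config(delay_s, timeout_us=TIMEOUT_US):
    """Send one request; return (got_reply_in_time, waited_us)."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=reply_once, args=(server, delay_s))
    worker.start()
    try:
        start = time.perf_counter_ns()
        client.sendall(b"config?")
        # Wait for the reply, but give up once the budget is used up.
        readable, _, _ = select.select([client], [], [], timeout_us / 1_000_000)
        waited_us = (time.perf_counter_ns() - start) // 1000
        return bool(readable), waited_us
    finally:
        worker.join()  # let a late reply land before the socket is closed
        client.close()
```

Calling `request_config(0.0)` repeatedly on an idle machine should mostly succeed, while a responder delayed by only a few milliseconds, as in `request_config(0.005)`, misses the budget every time.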

I can share the trace-level logs and probably also the core dump.

 

Thanks for your help!

 

Uli

ulrich.eckhardt
Adventurer

Short update: Concerning the segfaults, a support ticket was created and the AppDynamics team has already been able to reproduce the issue, so a fix shouldn't be too far off.

 

itsystems
Adventurer

Hi!

 

We are facing the same problem. I hoped that it would be solved in the latest version of the AppDynamics PHP agent, but it doesn't look like it.

 

We are getting segmentation fault errors on Apache using the AppDynamics PHP agent 4.4.3 (latest) and PHP 5.4.

 

Any help?

Thanks!

 

ulrich.eckhardt
Adventurer

Hi!

 

At the moment, we are not using AppDynamics on the systems that were formerly impacted by that bug. Also, since the AppDynamics dev team was able to reproduce the bug and has fixed it since it was filed, I believe that it shouldn't be a concern. Maybe it's a different bug with similar symptoms.

 

Two notes though:

  • 4.4.3 is not an AppDynamics version number, as they don't use a triple but a quadruple (like e.g. 4.2.13.1). Also, it might be relevant which versions of both the machine agent as well as the PHP agent were installed.
  • PHP 5.4 is something I personally would refuse to support in any way. At the very least, PHP 5.6 should be used. Still, check out the announcements on PHP's website: their support for versions before 7.1 ends this year.

Good luck!

 

Uli