Noopur.Tibdiwal
AppDynamics Team

This article discusses some of the most common issues faced when using Linux-based Private Synthetic Agent.


What are the prerequisites for debugging Linux private Synthetic Agent issues?

Make sure the PSA is deployed on an officially supported platform and that the prerequisites and hardware requirements are met:

  • See Install the Private Synthetic Agent (Web and API Monitoring)  in the documentation, under End User Monitoring > Synthetic Monitoring 
  • Currently, the only kernel architecture supported for installing PSA (Web Mon and API Mon) is x86-64, which is also referred to as x64, AMD64, and Intel 64.


 

How do I capture PSA logs to further troubleshoot issues? 

To properly capture PSA logs, capture the pod details in separate files as instructed in the notes: 

kubectl get pods --namespace <namespace> > {YOUR_PREFERRED_PATH}/pods-status.txt 

kubectl get pods -o wide --all-namespaces > {YOUR_PREFERRED_PATH}/pods-status_wide.txt 

kubectl describe pod -n <namespace> <pod-name> > {YOUR_PREFERRED_PATH}/describe-pod-<pod-name>.txt 

kubectl logs <pod-name> --namespace <namespace> > {YOUR_PREFERRED_PATH}/logs-pod-<pod-name>.txt 


Notes:

  1. Replace <pod-name> and <namespace> with your existing values. 
  2. By default, the value of <namespace> may be measurement. 
  3. The first command lists every <pod-name> in the namespace. 
  4. Capture the output of commands 3 and 4 for every <pod-name> listed by command 1, writing each
    to a separate file so that no file is overwritten.
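
Notes 3 and 4 can be scripted. The following is a minimal sketch, assuming a Bash shell with kubectl on the PATH. The function name capture_psa_logs and its output-directory argument are illustrative only and are not part of the PSA tooling.

```shell
#!/usr/bin/env bash
# Sketch: capture pod status, descriptions, and logs for every pod in a namespace.
# capture_psa_logs is a hypothetical helper name, not part of the PSA tooling.
capture_psa_logs() {
  local namespace="$1" out_dir="$2"
  mkdir -p "$out_dir" || return 1

  # Commands 1 and 2 from the article: pod status, namespaced and cluster-wide.
  kubectl get pods --namespace "$namespace" > "$out_dir/pods-status.txt"
  kubectl get pods -o wide --all-namespaces > "$out_dir/pods-status_wide.txt"

  # Commands 3 and 4, run once per pod, each written to its own file.
  local pod
  for pod in $(kubectl get pods --namespace "$namespace" -o name); do
    pod="${pod#pod/}"   # "-o name" prints "pod/<pod-name>"; strip the prefix
    kubectl describe pod -n "$namespace" "$pod" > "$out_dir/describe-pod-$pod.txt"
    kubectl logs "$pod" --namespace "$namespace" > "$out_dir/logs-pod-$pod.txt"
  done
}

# Example (assumes the namespace is "measurement"):
# capture_psa_logs measurement ./psa-logs
```

Because each file name includes the pod name, repeated runs overwrite only the files for the same pod.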
     


 

What errors arise from unsupported Kubernetes versions?

Below are some of the errors reported when an unsupported K8s version is used.

Kubectl version | Insufficient resources for K8s | CrashLoopBackOff error | Low resource allocation to Chrome API/Agent

Kubectl version

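You can check the installed kubectl client and Kubernetes server versions using "kubectl version". On an unsupported version, log entries such as the following are reported: 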

 

 

INFO 1 --- [or-http-epoll-1] c.a.s.heimdall.client.ReactiveWebClient  : [34927359]  Response: Status: 500 

Cache-Control:no-store 
Pragma:no-cache 
Content-Type:application/json 
X-Content-Type-Options:nosniff 
X-Frame-Options:DENY 
X-XSS-Protection:1 ; mode=block 
Referrer-Policy:no-referrer 
content-length:226 

ERROR 1 --- [or-http-epoll-1] c.a.s.h.service.MeasurementService: Failed to submit measurement with id : 8b71c4f4-7541-41f8-9f6a-8e762502d117~02b75cbc-5aaf-43f6-9d1d-30e20a634977 

[SEVERE][main][TcpDiscoverySpi] Failed to get registered addresses from IP finder (retrying every 2000ms; change 'reconnectDelay' to configure the frequency of retries) [maxTimeout=0] 

class org.apache.ignite.spi.IgniteSpiException: Failed to retrieve Ignite pods IP addresses. 

 

 

 


The pod events below, as shown by "kubectl describe pod" (and in the attached txt file), are also seen:

 

 

Warning  Unhealthy  23m (x4 over 25m)     kubelet            Readiness probe failed: Get "http://10.244.0.3:8080/ignite?cmd=probe": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Normal   Killing    23m (x2 over 25m)     kubelet            Container ignite failed liveness probe, will be restarted

Warning  Unhealthy  23m (x2 over 25m)     kubelet            Readiness probe failed: Get "http://10.244.0.3:8080/ignite?cmd=probe": EOF

Warning  Unhealthy  23m (x3 over 25m)     kubelet            Readiness probe failed: Get "http://10.244.0.3:8080/ignite?cmd=probe": dial tcp 10.244.0.3:8080: connect: connection refused

Normal   Pulled     23m (x2 over 25m)     kubelet            Container image "apacheignite/ignite:2.14.0-jdk11" already present on machine

Warning  Unhealthy  5m43s (x25 over 25m)  kubelet            Liveness probe failed: Get "http://10.244.0.3:8080/ignite?cmd=version": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Warning  BackOff    92s (x52 over 16m)    kubelet            Back-off restarting failed container ignite in pod synth-ignite-psa-0_ignite(1cba5f54-7723-4be4-a7ba-ce48fc6eacaf)

 

 


Insufficient resources provided to Kubernetes

This error occurs when the K8s environment is given fewer resources than the CPU and memory defined in values.yaml require (for example, when starting minikube). The pod then fails to schedule, as in the event below: 

Events:
Type     Reason            Age                  From               Message
====     ======            ===                  ====               =======
Warning  FailedScheduling  5m (x863 over 3d3h)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod

To quickly check the current resources that minikube is running with, use the following: 

  cat ~/.minikube/config.json | grep "Memory\|CPUs"

NOTE | If the command prints nothing, use the config.json under the profile with which minikube was started.  
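
The check above and its note can be combined into one helper. This is a minimal sketch, assuming that per-profile settings live in ~/.minikube/profiles/<profile>/config.json; that path, the function name minikube_resources, and its arguments are assumptions made for illustration, so verify the location on your host.

```shell
#!/usr/bin/env bash
# Sketch: print the Memory and CPUs entries from a minikube config.json.
# minikube_resources is a hypothetical helper name.
# The profiles/<profile>/config.json location is an assumption; verify it locally.
minikube_resources() {
  local profile="${1:-minikube}" base="${2:-$HOME/.minikube}"
  local candidate
  # Try the top-level config.json first (as in the article), then the profile's.
  for candidate in "$base/config.json" "$base/profiles/$profile/config.json"; do
    if [ -f "$candidate" ] && grep -Eq '"(Memory|CPUs)"' "$candidate"; then
      echo "# $candidate"
      grep -E '"(Memory|CPUs)"' "$candidate"
      return 0
    fi
  done
  echo "No Memory/CPUs entries found under $base" >&2
  return 1
}

# Example (profile name "minikube" is the default assumed here):
# minikube_resources minikube
```

Compare the printed values with the CPU and memory requested in values.yaml.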


CrashLoopBackOff image pulling error

If you see a CrashLoopBackOff or Back-off pulling image error on a Minikube-based PSA, set heimdall > pullPolicy to Never in values.yaml and redeploy the PSA. This fixes the error.  
 
For other platforms, please refer to our documentation for specific instructions by platform: Deploy the Web Monitoring PSA and API Monitoring PSA. 

Events:
Type    Reason   Age                      From     Message
====    ======   ===                      ====     =======
Normal  BackOff  3m7s (x18915 over 3d3h)  kubelet  Back-off pulling image "sum-heimdall:<heimdall-tag>"


Low resource allocation to the Chrome/API Agent

Slow job execution or a high session duration is mainly caused by low resources allocated to the PSA, specifically to the Chrome/API agent.  
 
Try increasing the resources (CPU and memory) for the Chrome/API agent in values.yaml and re-deploy the PSA. 

 

 

chromeAgentResources:
  min_cpu: "1"
  max_cpu: "2"
  min_mem: 1024Mi
  max_mem: 8192Mi

 

 


 

 How do I install PSA on a machine without an internet connection? 

  • Use the attached document “Install PSA with minikube on an offline machine.pdf.” 
  • You can use any machine with an active internet connection as your temporary machine. Build PSA components on that temporary machine and then export them to your target server machine without an active internet connection. 

PLEASE NOTE | The steps in the provided PDF have not been tested in-house by Cisco AppDynamics Support 

NOTE | Linux PSA versions v22.9 and later don't need a Postgres DB. Please refer to EUM > Synthetics >  Install the Private Synthetic Agent (Web and API Monitoring)  in our documentation. 


 

 How do I resolve a recurring ‘Test Agent Failed to Post Result’ error? 

If you're periodically or intermittently facing a ‘Test Agent Failed to Post Result’ error, redeploy PSA after updating values.yaml for Heimdall resources and Chrome agent resources (recommended): 

 

 

heimdallResources:
  min_cpu: "3"
  max_cpu: "3"
  min_mem: 5Gi
  max_mem: 5Gi

chromeAgentResources:
  min_cpu: "1"
  max_cpu: "2"
  min_mem: 2048Mi
  max_mem: 3072Mi

 

 


 

How do I resolve the 'DNS resolution failed (ERROR)'? 

If a job fails with the following error, use the steps below to check connectivity from the pods:
DNS resolution failed [ERROR] WebDriverException: unknown error: net::ERR_NAME_NOT_RESOLVED

  1. Log in to the Heimdall pod with the command below: 
    kubectl exec -it <heimdall-pod-name> -n <namespace> -- /bin/bash

  2. From inside the Heimdall pod, run the command below to check whether the pod can reach the <url>: 
    curl <url> 

NOTE | Curl command is available only on the Heimdall pod. Log into the Chrome agent pod using the below command to check/debug anything related to that pod: 
kubectl exec -it <chrome-pod-name> -n <namespace> -- /bin/sh

NOTE | To use additional Alpine tools in the Chrome agent pod, either remove the USER block from the Chrome agent Dockerfile or add the required install command to it, then rebuild the image and redeploy the PSA. If you remove the USER block, the pod is created with root permissions, and you can install any tool after logging in to the Chrome agent pod. 
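
The curl check can also be run without opening an interactive shell. The following is a minimal sketch, assuming kubectl is on the PATH and curl is present in the Heimdall pod as stated above. The function name check_url_from_heimdall and the 15-second timeout are illustrative choices, not PSA defaults.

```shell
#!/usr/bin/env bash
# Sketch: run curl inside the Heimdall pod and interpret its exit code.
# check_url_from_heimdall is a hypothetical helper name.
check_url_from_heimdall() {
  local namespace="$1" pod="$2" url="$3"
  # -sS: quiet but still print errors; -o /dev/null: discard the body;
  # -w: print only the HTTP status code; --max-time: give up after 15 seconds.
  kubectl exec "$pod" -n "$namespace" -- \
    curl -sS -o /dev/null --max-time 15 -w '%{http_code}\n' "$url"
  local rc=$?
  case "$rc" in
    0) echo "reachable from $pod: $url" ;;
    6) echo "DNS resolution failed inside $pod for $url" >&2 ;;  # curl: could not resolve host
    *) echo "curl failed inside $pod with exit code $rc" >&2 ;;
  esac
  return "$rc"
}

# Example (pod name and URL are placeholders):
# check_url_from_heimdall measurement <heimdall-pod-name> <url>
```

An exit code of 6 points at name resolution inside the cluster rather than at the target site itself.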


 

How do I resolve the error thrown when cluster-level permissions are missing?

The error below is thrown when cluster-level permissions are missing, because the PSA needs cluster-level permissions to function properly:

 

 

io.fabric8.kubernetes.client.KubernetesClientException: Operation: [list] for kind: [Pod] with name: [null] in namespace: [measurement] failed.

...

Caused by: java.net.SocketTimeoutException: connect timed out at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_362]

 

 

 


PSA service accounts and roles are configured with cluster-level permissions so that certain operations can be performed at the Helm level. Cluster-level permissions imply that the Agent requires access to different namespaces in the cluster.

Refer to Create the Kubernetes Cluster in the documentation. Those instructions include creating a cluster, so they assume that you have permission to create one. With only namespace-level permissions, you won't be able to create a cluster. 

Apply the steps below to fix the issue: 

TIP | If you want to permit only namespace-level permissions instead of cluster-level permissions, we suggest you use the role.yaml file attached below  

  1. Unpack Helm chart: 
    cd <Unzipped-PSA-directory> 
    tar xf sum-psa-heimdall.tgz 
  2. Replace sum-psa-heimdall/templates/role.yaml with the attached role.yaml.  
  3. Repack using the following: 
    helm package sum-psa-heimdall
  4. Finally, redeploy the PSA using the newly packed sum-psa-heimdall.tgz. 
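
Steps 1 to 3 can be sketched as a single helper. This is an illustration only, assuming Bash, tar, and helm are available. The function name repack_heimdall_chart and its two arguments are hypothetical, and the redeploy in step 4 is left to your usual deployment command.

```shell
#!/usr/bin/env bash
# Sketch: swap role.yaml into the Heimdall Helm chart and repack it.
# repack_heimdall_chart is a hypothetical helper name.
repack_heimdall_chart() {
  local psa_dir="$1" role_file="$2"
  [ -f "$role_file" ] || { echo "role file not found: $role_file" >&2; return 1; }
  # Resolve to an absolute path before changing directory.
  role_file="$(cd "$(dirname "$role_file")" && pwd)/$(basename "$role_file")"
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$psa_dir" || exit 1
    tar xf sum-psa-heimdall.tgz || exit 1                            # step 1: unpack
    cp "$role_file" sum-psa-heimdall/templates/role.yaml || exit 1   # step 2: replace
    helm package sum-psa-heimdall                                    # step 3: repack
  )
}

# Example (paths are placeholders):
# repack_heimdall_chart <Unzipped-PSA-directory> ./role.yaml
```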


 

How do I resolve a Heimdall log error?

If you see the error below in your Heimdall logs, try increasing the RAM on the PSA host machine, or decrease the memory assigned to minikube and in values.yaml: 

 

 

2023-05-30 20:43:38.768 WARN 1 --- [ main] org.apache.ignite.internal.IgniteKernal: Nodes started on local machine require more than 80% of physical RAM what can lead to significant slowdown due to swapping (please decrease JVM heap size, data region size or checkpoint buffer size) [required=2262MB, available=5120MB]
[20:43:38] Nodes started on local machine require more than 80% of physical RAM that can lead to significant slowdown due to swapping (please decrease JVM heap size, data region size or checkpoint buffer size) [required=2262MB,

 

 

 


 

How do I resolve a Heimdall error on Docker-based PSA? 

For Docker-based PSA, make sure the "docker ps" command outputs both the Heimdall and ignite containers. 

To capture Heimdall logs, use the below: 

 

 

# Capture the Heimdall container logs to a text file. To get <HEIMDALL_CONTAINER-ID>, run "docker ps".

docker logs -n <last-n-lines> <HEIMDALL_CONTAINER-ID> > heimdall-<HEIMDALL_CONTAINER-ID>.txt
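
To avoid copying container IDs by hand, the lookup and capture can be combined. The following is a minimal sketch, assuming the container names contain the strings "heimdall" and "ignite"; that naming, the function name capture_docker_logs, and the default of 1000 lines are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: save the last N log lines of every running container whose name
# contains a given string. capture_docker_logs is a hypothetical helper name.
capture_docker_logs() {
  local pattern="$1" lines="${2:-1000}" out_dir="${3:-.}"
  mkdir -p "$out_dir" || return 1
  local id name
  # "docker ps --format" prints one "<id> <name>" pair per running container.
  docker ps --format '{{.ID}} {{.Names}}' | while read -r id name; do
    case "$name" in
      *"$pattern"*)
        # 2>&1 also keeps log lines the container wrote to stderr.
        docker logs -n "$lines" "$id" > "$out_dir/$name-$id.txt" 2>&1 < /dev/null
        ;;
    esac
  done
}

# Example (name fragments are assumptions; check them with "docker ps"):
# capture_docker_logs heimdall 500 ./psa-logs
# capture_docker_logs ignite 500 ./psa-logs
```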

 

 


Version history
Last update:
‎10-06-2023 09:25 AM