Georgiy.Chigrichenko
AppDynamics Team (Retired)

The EUM Server and Events Service T-shirt Sizing Guide

 

Overview

The following tables contain data based on EUM Processor and Events Service load testing combinations under synthetic load. The tables have been normalized. 

 

Once the EUM traffic profile has been estimated, you can use these maximum load measurement results to establish which T-shirt size the EUM Processor and Events Service should be.

 

How do I account for multiple traffic types?

If you plan to consume more than one type of traffic, add the loads together: sum the EUM loads in beacons/min, and sum the Events Service loads in Normalized Performance Events/min.
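The addition described above can be sketched in a few lines of Python. This is an illustration, not an AppDynamics tool: the function names and the sample traffic figures are hypothetical, and the conversion factors are read off the load testing tables in this guide (the tables print 0.33 per record, but their totals are consistent with one third, so one third is assumed here).

```python
# Conversion factors read from the load testing tables in this guide.
# Assumption: the printed 0.33 is treated as 1/3, which matches the table totals.
NPE_PER_RECORD = 1 / 3   # Normalized Performance Events per record/snapshot event
NPE_PER_SESSION = 5      # Normalized Performance Events per session event


def normalized_events(records_per_min, sessions_per_min=0):
    """Normalized Performance Events/min for one traffic type."""
    return records_per_min * NPE_PER_RECORD + sessions_per_min * NPE_PER_SESSION


def combined_load(profiles):
    """Sum beacons/min (EUM) and Normalized Performance Events/min (Events Service)."""
    total_beacons = sum(p["beacons_per_min"] for p in profiles)
    total_npe = sum(
        normalized_events(p["records_per_min"], p.get("sessions_per_min", 0))
        for p in profiles
    )
    return total_beacons, total_npe


# Hypothetical traffic estimate: one web application plus one IoT fleet.
profiles = [
    {"name": "web", "beacons_per_min": 30_000,
     "records_per_min": 30_000, "sessions_per_min": 1_500},
    {"name": "iot", "beacons_per_min": 20_000, "records_per_min": 90_000},
]

beacons, npe = combined_load(profiles)
print(f"EUM load: {beacons:,} beacons/min")
print(f"Events Service load: {npe:,.0f} Normalized Performance Events/min")
```

The two totals are then compared against the EUM and Events Service columns of the tables separately, since each component is sized in its own unit.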

 

 

 

Normalized EUM Processor and Events Service Load Testing Tables

Use the following maximum load measurement results to establish the EUM Processor and Events Service T-shirt sizes.

 

Web Maximum Load Testing Results

The Events Service columns show maximum events/minute.

| EUM Server Size | Events Service Size | Maximum EUM beacons/min | Browser record events/min | Browser session events/min | Normalized Performance Events/min |
|---|---|---|---|---|---|
| Small | Small | 60K | 60K | 3K | 60k * 0.33 + 3k * 5 = 35k |
| Medium | Medium | 120K | 120K | 6K | 120k * 0.33 + 6k * 5 = 70k |
| Large | Large | 300K | 300K | 12K | 300k * 0.33 + 12k * 5 = 160k |

Normalized Performance Events/min = browser record events/min * 0.33 Normalized Performance Events per browser record event + browser session events/min * 5 Normalized Performance Events per session event.

 

Mobile Maximum Load Testing Results

The Events Service columns show maximum events/minute.

| EUM Server Size | Events Service Size | Maximum EUM beacons/min | Mobile snapshot events/min | Mobile session events/min | Normalized Performance Events/min |
|---|---|---|---|---|---|
| Small | Large | 90K | 70K | 40K | 70k * 0.33 + 40k * 5 = 223.3k |
| Medium | XLarge | 130K | 110K | 61K | 110k * 0.33 + 61k * 5 = 341.7k |
| Large | XXLarge | 550K | 1.35M | 118K | 3.5M * 0.33 + 118k * 5 = 1756.7k |

Normalized Performance Events/min = Mobile snapshot events/min * 0.33 Normalized Performance Events per Mobile snapshot event + Mobile session events/min * 5 Normalized Performance Events per session event.

 

IoT Maximum Load Testing Results

| EUM Server Size | Events Service Size | Maximum EUM beacons/min | Maximum Events Service IoT records/min | Normalized Performance Events/min |
|---|---|---|---|---|
| Small | Medium | 110K | 500K | 500k * 0.33 = 166.7k |
| Medium | Large | 300K | 600K | 600k * 0.33 = 200k |
| Large | XLarge | 600K | 1M | 1M * 0.33 = 333.3k |

Normalized Performance Events/min = IoT records/min * 0.33 Normalized Performance Events per IoT record event.
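
As a rough illustration of how the three tables above can be used, the sketch below looks up the smallest EUM Server size whose measured maximum covers an estimated beacon rate. The dictionary and function names are hypothetical, and the rule of picking the smallest size with no extra headroom is an assumption, not an AppDynamics recommendation.

```python
# Maximum EUM beacons/min per EUM Server size, copied from the Web, Mobile,
# and IoT maximum load testing tables in this guide.
EUM_MAX_BEACONS = {
    "web":    [("Small", 60_000), ("Medium", 120_000), ("Large", 300_000)],
    "mobile": [("Small", 90_000), ("Medium", 130_000), ("Large", 550_000)],
    "iot":    [("Small", 110_000), ("Medium", 300_000), ("Large", 600_000)],
}


def eum_size(traffic_type, beacons_per_min):
    """Return the smallest EUM Server size whose tested maximum covers the load.

    Returns None when the load exceeds the largest tested configuration.
    Assumption: no safety margin is applied on top of the measured maximum.
    """
    for size, max_beacons in EUM_MAX_BEACONS[traffic_type]:
        if beacons_per_min <= max_beacons:
            return size
    return None


# Hypothetical estimate: 100k web beacons per minute.
print(eum_size("web", 100_000))
```

A result of None means the estimate is beyond what these load tests covered, in which case the lookup says nothing about a suitable size.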

 

EUM Processor T-shirt Sizes

| T-shirt Size | EUM Recommended Instance Type | EUM Processor JVM Heap Size |
|---|---|---|
| Small | 4 core, 16GB RAM, disk 300GB 600 IOPS (m4.xlarge) | 11GB |
| Medium | 8 core, 32GB RAM, disk 300GB 600 IOPS (m4.2xlarge) | 30GB |
| Large | 16 core, 64GB RAM, disk 300GB 600 IOPS (m4.4xlarge) | 50GB |

 

 

Node Configuration and System Architecture

The Events Service node network setup used in this example has speeds of at least 1 Gbps. Latencies are similar to a switched network, such that:

  • Average SSD latencies are < 1.2ms per read/write operation
  • Average NVMe latencies are < 0.4ms per read/write operation

 

If SaaS deployment is not an option, consider splitting the deployment on the Application level into multiple accounts and multiple controllers.

 

AppDynamics strongly recommends using SSD-backed instances for Analytics. SAN is not recommended: AppDynamics follows the official Elasticsearch hardware guidelines, SAN configurations vary widely, and AppDynamics cannot guarantee that a particular SAN configuration is supported.

 

Finally, you should avoid network-attached storage (NAS). A NAS solution is often slower, displays larger latencies with a wider deviation in average latency, and is a single point of failure.

 

 

Events Service T-shirt Sizes

The following table shows the recommended number of nodes and node configuration for each T-shirt size.

See Stipulate throughput by license type in the Quick Method.

 

| T-Shirt Size | Number of Nodes | Normalized Performance Events/minute | Recommended Node Configuration | SaaS Recommended or Required? |
|---|---|---|---|---|
| X-Small | 1 | 50000 | 4 core, SSD (ideally as NVMe) or HDD 7,200 RPM | No. Proof of Concept, Dev, and Demo |
| Small | 3 | 100000 | 4 core, SSD (ideally as NVMe) (i2.xlarge) | No |
| Medium | 3 | 195761 | 8 core, SSD (ideally as NVMe) (i2.2xlarge) | No |
| Large | 5 | 284000 | 8 core, SSD (ideally as NVMe) (i3.2xlarge) | Recommended sometimes |
| XLarge | 10 | 438000 | 8-16 core, NVMe (i3.2xlarge) | Recommended |
| XXLarge | 20 | 600000 | 8-16 core, NVMe (i3.2xlarge). Depending on query load and other factors that impact performance, larger nodes may be more suitable | Required. On-premises deployments of this size are not supported. The overall deployment should be structured as multiple smaller deployments, each with its own Controller |
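
The throughput column of the table above can be applied as a simple threshold lookup, sketched below. The names and the `allow_poc` flag are hypothetical; the flag reflects the table's note that X-Small is meant for Proof of Concept, Dev, and Demo use. The lookup only applies the throughput column. The load testing tables earlier in this guide pair some loads with a larger Events Service size, so the result is best treated as a starting point.

```python
# (T-shirt size, number of nodes, Normalized Performance Events/minute),
# copied from the Events Service T-shirt Sizes table.
EVENTS_SERVICE_SIZES = [
    ("X-Small", 1, 50_000),
    ("Small", 3, 100_000),
    ("Medium", 3, 195_761),
    ("Large", 5, 284_000),
    ("XLarge", 10, 438_000),
    ("XXLarge", 20, 600_000),
]


def events_service_size(npe_per_min, allow_poc=False):
    """Return (size, nodes) for the smallest size covering the given load.

    allow_poc is a hypothetical switch: X-Small is only considered when it
    is True. Returns None when the load exceeds the XXLarge figure.
    """
    for name, nodes, max_npe in EVENTS_SERVICE_SIZES:
        if name == "X-Small" and not allow_poc:
            continue
        if npe_per_min <= max_npe:
            return name, nodes
    return None


# Hypothetical estimate: 250k Normalized Performance Events per minute.
print(events_service_size(250_000))
```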

 

Once you determine your T-shirt size, refer to the non-virtual hardware specifications that correspond to the relevant EC2 instance size.

The Events Service should be on separate, dedicated server(s).

 

 

Shard Replication in Elasticsearch

We assume a replication factor of 1 for Elasticsearch. Performance tests show that there is an upper limit to the average CPU performance of a cluster. Due to the replication and synchronization of Elasticsearch node segments, the limit decreases as the number of nodes increases.

 

Risks and Benefits

Not enabling Elasticsearch replication has both risks and benefits:

Risks
    • The lack of redundancy greatly increases the likelihood of data loss if any node in the Events Service cluster goes down.

 

Benefits
    • CPU ingestion utilization decreases by approximately 55%
    • Data drive storage requirements decrease by approximately 50%
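
To make the two percentages concrete, here is a back-of-the-envelope sketch. It assumes both reductions are measured relative to a cluster running with replication enabled; the function name and the sample figures are hypothetical.

```python
# Approximate reductions quoted in the Benefits list, read as relative to
# a cluster that has replication enabled (an assumption).
CPU_REDUCTION = 0.55
STORAGE_REDUCTION = 0.50


def without_replication(cpu_pct_with_replica, storage_gb_with_replica):
    """Estimate ingestion CPU (%) and data storage (GB) with replication off."""
    cpu = cpu_pct_with_replica * (1 - CPU_REDUCTION)
    storage = storage_gb_with_replica * (1 - STORAGE_REDUCTION)
    return cpu, storage


# Hypothetical cluster: 80% ingestion CPU and 2000 GB of data with replicas.
cpu, storage = without_replication(80.0, 2000.0)
print(f"Estimated ingestion CPU: {cpu:.0f}%")
print(f"Estimated data storage: {storage:.0f} GB")
```

These savings come at the cost of the data loss risk listed above.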

 

Data Replication

Since Elasticsearch builds in redundancy with replicas, there is no need for RAID configurations other than RAID 0. For this reason, Elasticsearch recommends using RAID 0, which also increases write throughput.

 

If a redundant RAID level (1, 3, or 5) is selected, AppDynamics does not provide support for disk performance or data integrity issues.

Version history
Last update:
‎08-10-2020 10:29 AM