DeLag: Detecting Latency Degradation Patterns in Service-based Systems
Replication package of the work "DeLag: Detecting Latency Degradation Patterns in Service-based Systems".
Requirements
- Python 3.6
- Java 8
- Apache Spark 2.3.1 (set $SPARK_HOME env variable with the folder path)
- Elasticsearch for Spark 2.X 7.6.0 (set $ES_SPARK env variable with the jar path)
- Maven 3.6.0 (only for datasets generation)
- Docker 18.03 (only for datasets generation)
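Before launching any script, it can help to confirm that the two environment variables named in the list above are set. The following is a minimal sketch; the helper `missing_env_vars` is hypothetical and not part of this replication package.

```python
import os

# Variable names come from the requirements list; the helper itself is hypothetical.
REQUIRED_VARS = ("SPARK_HOME", "ES_SPARK")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("SPARK_HOME and ES_SPARK are set.")
```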
Use the following commands to install the Python dependencies:
pip install --upgrade pip
pip install -r requirements.txt
Dataset generation and the experiments were performed on a machine with dual Intel Xeon E5-2650 v3 CPUs at 2.30GHz, totaling 40 cores, and 80GB of RAM. We recommend running the scripts of this replication package on a machine with similar specs.
Datasets
The datasets folder contains the datasets of traces used in the evaluation (in parquet format). Each row of each dataset represents a request and contains:
- traceId: the ID of the request.
- [requestLatency]: the overall latency of the request. It is represented by the column ts-travel-service_queryInfo in the Train-Ticket case study and by the column HomeControllerHome in the E-Shopper case study.
- experiment: if equal to 0 or 1, the request is affected by the corresponding ADC; otherwise it is not affected by any ADC.
- [RPC]: the cumulative execution time of [RPC] within the request.
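The row layout described above can be explored with a short sketch. It assumes that pandas and a parquet engine such as pyarrow are installed (not confirmed here). The helper names, the sample RPC columns `rpc_x` and `rpc_y`, and the sample values are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# Columns described above that are not per-RPC execution times.
METADATA_COLUMNS = {"traceId", "experiment"}


def load_rows(path):
    """Read one parquet dataset into a list of dicts (assumes pandas plus a parquet engine)."""
    import pandas as pd  # imported lazily so the helpers below work without pandas

    return pd.read_parquet(path).to_dict("records")


def rpc_columns(row, latency_column):
    """Return the RPC columns of a row, excluding metadata and the overall-latency column."""
    return sorted(c for c in row if c not in METADATA_COLUMNS and c != latency_column)


def group_by_experiment(rows):
    """Group trace IDs by the value of the experiment column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["experiment"]].append(row["traceId"])
    return dict(groups)


# Hand-made sample in the Train-Ticket layout. The value 2 is an arbitrary
# stand-in for "any other value", i.e. a request not affected by any ADC.
LATENCY = "ts-travel-service_queryInfo"
sample = [
    {"traceId": "a", "experiment": 0, LATENCY: 120, "rpc_x": 80, "rpc_y": 30},
    {"traceId": "b", "experiment": 1, LATENCY: 150, "rpc_x": 40, "rpc_y": 100},
    {"traceId": "c", "experiment": 2, LATENCY: 60, "rpc_x": 35, "rpc_y": 20},
]

print(rpc_columns(sample[0], LATENCY))
print(group_by_experiment(sample))
```

On a real dataset, `load_rows` would be called with the path of one of the parquet files in the datasets folder, and the latency column would be chosen according to the case study.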
Datasets generation
The datasets-generation folder contains the bash scripts used to generate the datasets used in the evaluation.
Techniques
The techniques folder contains the implementations of DeLag, CoTr, KrSa and DeCaf. The main Python classes used to implement each technique are:
- DeLag: class GeneticRangeAnalysis
- CoTr: classes RangeAnalysis and GA
- KrSa: classes RangeAnalysis and BranchAndBound
- DeCaf: class DeCaf
Experiments
The experiments folder contains the Python scripts used to execute DeLag and the baseline techniques on the generated datasets.
Results
The results folder contains the results of our experimentation. Each row of each csv file represents a run of a particular technique on a dataset and contains:
- exp: the dataset ID.
- algo: the technique used. The notation used to indicate each technique is described below:
  - gra: DeLag - DeLag: Detecting Latency Degradation Patterns in Service-based Systems
  - bnb: KrSa - Understanding Latency Variations of Black Box Services (WWW 2013)
  - ga: CoTr - Detecting Latency Degradation Patterns in Service-based Systems (ICPE 2020)
  - decaf: DeCaf - DeCaf: Diagnosing and Triaging Performance Issues in Large-Scale Cloud Services (ICSE 2020)
  - kmeans: K-means
  - hierarchical: HC - Hierarchical clustering
- trial: the ID of the run (a technique may be run multiple times on a dataset to mitigate result variability).
- precision: effectiveness measure - Precision
- recall: effectiveness measure - Recall
- fmeasure: effectiveness measure - F1-score
- time: execution time in seconds
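As an illustration of how these columns can be consumed, the sketch below averages the F1-score per dataset and technique across trials using only the Python standard library. It assumes comma-separated files with a header row that uses the column names listed above. The function name `mean_fmeasure` is hypothetical, and the sample rows are made up and are not results from the paper.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_fmeasure(csv_file):
    """Average the fmeasure column per (exp, algo) pair across trials."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        scores[(row["exp"], row["algo"])].append(float(row["fmeasure"]))
    return {key: mean(values) for key, values in scores.items()}


# Made-up rows for illustration only.
sample = io.StringIO(
    "exp,algo,trial,precision,recall,fmeasure,time\n"
    "1,gra,0,1.0,0.5,0.5,10.0\n"
    "1,gra,1,1.0,1.0,1.0,12.0\n"
    "1,kmeans,0,0.5,0.5,0.5,1.0\n"
)
result = mean_fmeasure(sample)
print(result)
```

To apply the same aggregation to a real file, open it with `open(path, newline="")` and pass the file object to `mean_fmeasure`.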
Scripts
The scripts folder contains the Python scripts used to generate the figures and tables of the paper.
Systems
The systems folder contains the two case study systems.