Quantitative Evaluation and Analysis Tools

This page hosts a repository of peer-reviewed tools for quantitative system evaluation and analysis. The published tools have undergone a thorough review process by multiple independent experts to ensure high quality and relevance to the community. The review process covers important quality factors, including maturity, availability and usability.

Most tools include ready-to-use binaries, documentation, usage rules (including licenses) and source code. Each tool has dedicated maintainers whom you can contact if you have problems using the tool.

As long as the tool providers allow redistribution of the code, SPEC RG imposes no additional license requirements. SPEC RG redistributes the published tools without modification. As part of the acceptance process, SPEC RG may ask the authors to make changes or enhance certain features, but SPEC RG itself does not change the code. SPEC RG distributes the code on an “as is” basis, with no warranties, implicit or explicit, regarding the code or its behavior. Neither SPEC RG nor SPEC is liable for any issues that may arise from the code.

SPEC RG welcomes new submissions of tools. In addition to stand-alone tools, extensions to existing tools are also solicited. More information on the submission process is available at the tools submission portal.

List of Tools

Tool Description
Alberta Workloads This is a collection of additional workloads for the SPEC CPU2017 Benchmark Suite. It contains both additional workloads for the benchmarks included in the suite and, for some benchmarks, scripts that can be used to generate additional workloads.
DiSL DiSL is a domain-specific language and framework for Java bytecode instrumentation. DiSL is inspired by AOP, but in contrast to mainstream AOP languages, it features an open join point model where any region of bytecodes can be selected as a join point (i.e., code location to be instrumented).
DynamicSpotter DynamicSpotter is a framework for measurement-based, automatic detection of software performance problems in Java-based enterprise software systems. DynamicSpotter combines the concepts of software performance anti-patterns with systematic experimentation.
Faban Faban is a facility for developing and running benchmarks. Faban supports multi-tier server benchmarks run across dozens of machines. It also supports developing and running a simple micro-benchmark targeting a single component.
FINCoS FINCoS is a set of benchmarking tools for load generation and performance measurement of event processing (EP) systems. It provides a flexible and neutral approach through which users, researchers and engineers can quickly run realistic performance tests on one or more EP platforms without having to write load generation, performance measurement and event conversion routines themselves.
inspectIT inspectIT is the open source APM solution to analyze the behavior of enterprise software applications and to diagnose problems. Software performance experts can monitor execution traces from applications under analysis and drill down into traces to isolate the root causes of performance problems.
Kieker Kieker is a framework for monitoring and analyzing the runtime behavior of distributed software systems. Focusing on application-level behavior, Kieker includes measurement probes for collecting timing and trace information from executions of software operations. Probes for sampling system-level measures, such as CPU utilization and memory usage, are included as well.
Libra Libra automatically evaluates forecasting methods in a diverse set of evaluation scenarios.
LibReDE LibReDE is a library for resource demand estimation. Resource demands are a common input parameter to stochastic performance models (e.g., Queueing Networks or Queueing Petri Nets). LibReDE helps to determine resource demand values based on monitoring data from a system (e.g., CPU utilization, response time or throughput).
LIKWID LIKWID is a set of command line tools and a library for the Linux operating system covering hardware performance profiling, system information/configuration and microbenchmarking for software developers, performance analysts and benchmarkers.
LIMBO LIMBO is an Eclipse-based tool for handling and instantiating load intensity models based on the Descartes Load Intensity Model (DLIM). LIMBO users can define variable arrival rates for a multitude of purposes, such as custom request time-stamp generation for benchmarking or the re-parametrization of request traces.
Mowgli Mowgli is an evaluation framework for cloud-hosted DBMS, supporting EC2 and OpenStack-based clouds and multiple NoSQL and NewSQL DBMS. Mowgli fully automates the evaluation process for the evaluation objectives performance, scalability, elasticity and availability.
SPA The Storage Performance Analyzer (SPA) is a software package for systematic measurement, analysis and regression modeling, tailored specifically to storage systems. SPA consists of a benchmark harness that coordinates and controls the execution of the included I/O benchmarks and a tailored analysis library used to process and evaluate the collected measurements.
TeaStore The TeaStore is a micro-service reference and test application for scientific and industrial benchmarks and tests.
Theodolite Theodolite is a framework for benchmarking the scalability of cloud-native applications in Kubernetes. Deployed as a Kubernetes Operator, Theodolite allows users to design and run new benchmarks using existing Kubernetes tooling. Theodolite comes with a set of ready-to-use benchmarks for distributed stream processing engines.