Quantitative Evaluation and Analysis Tools

This page hosts a repository of peer-reviewed tools for quantitative system evaluation and analysis. The published tools have undergone a thorough review process by multiple independent experts to ensure high quality and relevance to the community. The review process covers important quality factors, including maturity, availability and usability.

Most tools include ready-to-use binaries, documentation, usage rules (including licenses) and source code. Each tool has dedicated maintainers whom you can contact if you have problems using the tool.

As long as the tool providers allow redistribution of the code, SPEC RG imposes no additional license requirements. SPEC RG redistributes the published tools without modification. As part of the acceptance process, SPEC RG may ask the authors to make changes or to enhance certain features or aspects, but SPEC RG itself does not make any changes. SPEC RG distributes the code on an “as is” basis, with no warranties, implicit or explicit, regarding the code or its behavior. Neither SPEC RG nor SPEC is liable for any issues that may arise due to the code.

SPEC RG welcomes new submissions of tools. In addition to stand-alone tools, extensions to existing tools are also solicited. More information on the submission process is available at the tools submission portal.

List of Tools

Tool	Description
Alberta Workloads	This is a collection of additional workloads for the SPEC CPU2017 Benchmark Suite. It contains both additional workloads for the benchmarks included in the suite and, for some benchmarks, scripts that can be used to generate additional workloads.
DiSL	DiSL is a domain-specific language and framework for Java bytecode instrumentation. DiSL is inspired by AOP, but in contrast to mainstream AOP languages, it features an open join point model where any region of bytecodes can be selected as a join point (i.e., code location to be instrumented).
DynamicSpotter	DynamicSpotter is a framework for measurement-based, automatic detection of software performance problems in Java-based enterprise software systems. DynamicSpotter combines the concepts of software performance anti-patterns with systematic experimentation.
Faban	Faban is a facility for developing and running benchmarks. Faban supports multi-tier server benchmarks run across dozens of machines. It also supports developing and running a simple micro-benchmark targeting a single component.
FINCoS	FINCoS is a set of benchmarking tools for load generation and performance measurement of event processing (EP) systems. It provides a flexible and neutral approach through which users, researchers and engineers can quickly run realistic performance tests on one or more EP platforms without having to code load generation, performance measurement and event conversion routines themselves.
HeteroBench	HeteroBench is a comprehensive benchmark suite designed to evaluate heterogeneous systems with various accelerators, including CPUs, GPUs (NVIDIA, AMD, Intel), and FPGAs (AMD). It features multi-kernel applications spanning domains like image processing, machine learning, numerical computation, and physical simulation. HeteroBench aims to assist users in assessing performance, optimizing hardware usage, and facilitating decision-making in HPC environments.
inspectIT	inspectIT is the open source APM solution to analyze the behavior of enterprise software applications and to diagnose problems. Software performance experts can monitor execution traces from applications under analysis and drill down into traces to isolate the root causes of performance problems.
Kieker	Kieker provides observability in production environments, and its tracing incurs only a low performance overhead. Use cases in teaching, research, and practice include the following areas: performance evaluation, self-adaptation control (e.g., online capacity management), failure detection and diagnosis, simulation (replaying workload traces for driving simulations; measurement and logging of simulation data; analysis of simulation results), and software re-engineering (e.g., extraction of architectural and usage models).
Libra	Libra automatically evaluates forecasting methods in a diverse set of evaluation scenarios.
LibReDE	LibReDE is a library for resource demand estimation. Resource demands are a common input parameter to stochastic performance models (e.g., Queueing Networks, or Queueing Petri Nets). LibReDE helps to determine resource demand values based on monitoring data from a system (e.g., CPU utilization, response time, or throughput).
LIKWID	LIKWID is a set of command line tools and a library for the Linux operating system covering hardware performance profiling, system information/configuration and microbenchmarking for software developers, performance analysts and benchmarkers.
LIMBO	LIMBO is an Eclipse-based tool for handling and instantiating load intensity models based on the Descartes Load Intensity Model (DLIM). LIMBO users can define variable arrival rates for a multitude of purposes, such as custom request time-stamp generation for benchmarking or the re-parametrization of request traces.
Mowgli	Mowgli is an evaluation framework for cloud-hosted DBMS, supporting EC2 and OpenStack-based clouds and multiple NoSQL and NewSQL DBMS. Mowgli fully automates the evaluation process for the evaluation objectives performance, scalability, elasticity and availability.
SPA	The Storage Performance Analyzer (SPA) is a software package that provides functionality for systematic measurement, analysis and regression modeling, specifically tailored to storage systems. SPA consists of a benchmark harness that coordinates and controls the execution of the included I/O benchmarks and a tailored analysis library used to process and evaluate the collected measurements.
TeaStore	The TeaStore is a micro-service reference and test application for scientific and industrial benchmarks and tests.
Theodolite	Theodolite is a framework for benchmarking the scalability of cloud-native applications in Kubernetes. Deployed as a Kubernetes Operator, Theodolite allows users to design and run new benchmarks using existing Kubernetes tooling. Theodolite comes with a set of ready-to-use benchmarks for distributed stream processing engines.