Highlighted Publications

Selected joint member publications, as well as technical reports published by the SPEC RG, are available on the publication page.

Group Publications

Below, we list a selection of relevant publications by members of the RG Power working group, together with earlier publications produced with the OSG Power Subcommittee.

Norbert Schmitt, Supriya Kamthania, Nishant Rawtani, Luis Mendoza, Klaus-Dieter Lange, Samuel Kounev. Energy-Efficiency Comparison of Common Sorting Algorithms. In 29th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 1-8, 2021.
[ bibtex | abstract | DOI ]
Keywords: Energy Efficiency, Sorting Algorithms, Power Consumption, Software Efficiency, Cloud, Data Center, Green Computing, Performance.

With the rising demand for information technology comes an increase in the energy consumed to power it. Notably, cloud computing is constantly growing, in part due to a rising number of devices using the cloud to provide certain functionality. This growth leads to an increase in data-center energy consumption, which is estimated to climb to over 1 PWh in 2030. Hardware manufacturers counter the rising energy demand of cloud data centers by providing techniques to make servers more energy-efficient. However, these advances cannot fully compensate for the growth. To further increase energy efficiency, software needs to be addressed as well, yet it is often neglected by developers. In this paper, we compare six sorting algorithms, a common task in most programs, against each other in terms of energy efficiency to allow developers to select the best solution for their problem. We selected well-known algorithms in two variants and two implementation languages, C and Python. We ran each algorithm on two of four state-of-the-art server systems with different CPUs.
@inproceedings{DBLP:conf/mascots/SchmittKRMLK21,
  author    = {Norbert Schmitt and
               Supriya Kamthania and
               Nishant Rawtani and
               Luis Mendoza and
               Klaus{-}Dieter Lange and
               Samuel Kounev},
  title     = {Energy-Efficiency Comparison of Common Sorting Algorithms},
  booktitle = {29th International Symposium on Modeling, Analysis, and Simulation
               of Computer and Telecommunication Systems, {MASCOTS} 2021, Houston,
               TX, USA, November 3-5, 2021},
  pages     = {1--8},
  publisher = {{IEEE}},
  year      = {2021},
  url       = {https://doi.org/10.1109/MASCOTS53633.2021.9614299},
  doi       = {10.1109/MASCOTS53633.2021.9614299},
}
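
As a rough illustration of the kind of comparison the paper describes, the following minimal Python sketch times two sorting implementations and converts runtime into an energy estimate under an assumed constant average power draw. The power value and the E = P * t simplification are placeholders for illustration only; the paper itself relies on calibrated external power measurements on real server hardware.

# Minimal sketch (not the paper's harness): estimates the energy cost of two
# sorting implementations from wall-clock time and an ASSUMED average power
# draw. AVG_POWER_WATTS is a placeholder, not a measured value.
import random
import time

AVG_POWER_WATTS = 200.0  # assumed average system power while sorting (placeholder)

def bubble_sort(a):
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - i - 1):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out

def estimate_energy(sort_fn, data):
    start = time.perf_counter()
    sort_fn(data)
    elapsed = time.perf_counter() - start
    return elapsed, elapsed * AVG_POWER_WATTS  # seconds, joules (E = P * t)

data = [random.random() for _ in range(5000)]
for fn in (bubble_sort, merge_sort):
    seconds, joules = estimate_energy(fn, data)
    print(f"{fn.__name__}: {seconds:.3f} s, ~{joules:.1f} J (assumed constant power)")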
    
Norbert Schmitt, Klaus-Dieter Lange, Sanjay Sharma, Aaron Cragin, David Reiner, Samuel Kounev. SPEC - Spotlight on the International Standards Group (ISG). In Companion of the ACM/SPEC International Conference on Performance Engineering, pp. 167-168, 2021.
[ bibtex | abstract | DOI ]

The driving philosophy for the Standard Performance Evaluation Corporation (SPEC) is to ensure that the marketplace has a fair and useful set of metrics to differentiate systems, by providing standardized benchmark suites and international standards. This poster-paper gives an overview of SPEC with a focus on the newly founded International Standards Group (ISG).
@inproceedings{DBLP:conf/wosp/SchmittLSCRK21,
  author    = {Norbert Schmitt and
               Klaus{-}Dieter Lange and
               Sanjay Sharma and
               Aaron Cragin and
               David Reiner and
               Samuel Kounev},
  editor    = {Johann Bourcier and
               Zhen Ming (Jack) Jiang and
               Cor{-}Paul Bezemer and
               Vittorio Cortellessa and
               Daniele Di Pompeo and
               Ana Lucia Varbanescu},
  title     = {{SPEC} - Spotlight on the International Standards Group {(ISG)}},
  booktitle = {{ICPE} '21: {ACM/SPEC} International Conference on Performance Engineering,
               Virtual Event, France, April 19-21, 2021, Companion Volume},
  pages     = {167--168},
  publisher = {{ACM}},
  year      = {2021},
  url       = {https://doi.org/10.1145/3447545.3451171},
  doi       = {10.1145/3447545.3451171},
}
    
Norbert Schmitt, Klaus-Dieter Lange, Sanjay Sharma, Nishant Rawtani, Carl Ponder, Samuel Kounev. The SPECpowerNext Benchmark Suite, its Implementation and New Workloads from a Developer's Perspective. In Proceedings of the ACM/SPEC International Conference on Performance Engineering, pp. 225-232, 2021.
[ bibtex | abstract | DOI ]

Innovation needs a competitive and fair playing field on which products can be compared and informed choices can be made. Standard benchmarks are a necessity to create such a level playing field among competitors in the server market for more energy-efficient servers. That, in turn, motivates their engineers to design more energy-efficient hardware. The SPECpower_ssj2008 benchmark drove the increase of server energy efficiency by 113 times for single-CPU servers, or 19 times on average. Yet, with added functionality and load, servers are expected to consume a rising amount of energy. Additionally, server usage in data centers has changed over time with new application types. To continue the effort of increasing server energy efficiency, a new version, SPECpowerNext, is under development. In this work, after a short introduction to SPECpower_ssj2008, we present the new implementation of SPECpowerNext together with the standardized way to collect server information in heterogeneous data centers. We also give insight, including preliminary measurements, into two of SPECpowerNext's new workloads, the Wiki and the APA workload, in addition to an overview of both.
@inproceedings{DBLP:conf/wosp/SchmittLSRPK21,
  author    = {Norbert Schmitt and
               Klaus{-}Dieter Lange and
               Sanjay Sharma and
               Nishant Rawtani and
               Carl Ponder and
               Samuel Kounev},
  editor    = {Johann Bourcier and
               Zhen Ming (Jack) Jiang and
               Cor{-}Paul Bezemer and
               Vittorio Cortellessa and
               Daniele Di Pompeo and
               Ana Lucia Varbanescu},
  title     = {The SPECpowerNext Benchmark Suite, its Implementation and New Workloads
               from a Developer's Perspective},
  booktitle = {{ICPE} '21: {ACM/SPEC} International Conference on Performance Engineering,
               Virtual Event, France, April 19-21, 2021},
  pages     = {225--232},
  publisher = {{ACM}},
  year      = {2021},
  url       = {https://doi.org/10.1145/3427921.3450239},
  doi       = {10.1145/3427921.3450239},
}
    
Norbert Schmitt, James Bucek, John Beckett, Aaron Cragin, Klaus-Dieter Lange, Samuel Kounev. Performance, Power, and Energy-Efficiency Impact Analysis of Compiler Optimizations on the SPEC CPU 2017 Benchmark Suite. In 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), pp. 292-301, 2020.
[ bibtex | abstract | DOI ]
Keywords: energy efficiency, compiler optimizations, power consumption, software efficiency, power capping, cloud, datacenter, green computing, sustainable computing, SPEC CPU 2017, performance, benchmark.

The growth of cloud services leads to more and more data centers that are increasingly larger and consume considerable amounts of power. To increase energy efficiency, both the actual server equipment and the software must become more energy-efficient. Software has a major impact on hardware utilization levels and, consequently, on energy efficiency. While energy efficiency is often seen as identical to performance, we argue that this is not necessarily the case. A sizable amount of energy could be saved by leveraging compiler optimizations, increasing energy efficiency but at the same time impacting performance and power consumption over time. We analyze the SPEC CPU 2017 benchmark suite with 43 benchmarks from different domains, including integer and floating-point heavy computations, on a state-of-the-art server system for cloud applications. Our results show that power consumption displays more stable behavior if fewer compiler optimizations are used and confirm that performance and energy efficiency are different optimization goals. Additionally, compiler optimizations could possibly be used to enable power capping at the software level, and care must be taken when selecting such optimizations.
@inproceedings{DBLP:conf/ucc/SchmittBBCLK20,
  author    = {Norbert Schmitt and
               James Bucek and
               John Beckett and
               Aaron Cragin and
               Klaus{-}Dieter Lange and
               Samuel Kounev},
  title     = {Performance, Power, and Energy-Efficiency Impact Analysis of Compiler
               Optimizations on the {SPEC} {CPU} 2017 Benchmark Suite},
  booktitle = {13th {IEEE/ACM} International Conference on Utility and Cloud Computing,
               {UCC} 2020, Leicester, United Kingdom, December 7-10, 2020},
  pages     = {292--301},
  publisher = {{IEEE}},
  year      = {2020},
  url       = {https://doi.org/10.1109/UCC48980.2020.00047},
  doi       = {10.1109/UCC48980.2020.00047},
}
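
The abstract's point that performance and energy efficiency are different optimization goals can be made concrete with a small worked example: if energy is approximated as average power times runtime, a build that finishes faster but draws disproportionately more power ends up consuming more energy per run. The figures below are invented for illustration and are not measurements from the paper.

# Illustrative only: the runtimes and power draws below are hypothetical.
# They show why "faster" and "more energy efficient" can diverge.
configs = {
    # name: (runtime_seconds, average_power_watts)  -- hypothetical values
    "-O1-like build": (120.0, 180.0),
    "-O3-like build": (100.0, 230.0),  # faster, but draws more power
}

for name, (runtime_s, power_w) in configs.items():
    energy_j = runtime_s * power_w  # E = P * t
    print(f"{name}: {runtime_s:.0f} s runtime, {energy_j:.0f} J per run, "
          f"{1e6 / energy_j:.1f} runs per megajoule")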
    
Norbert Schmitt, James Bucek, Klaus-Dieter Lange, Samuel Kounev. Energy Efficiency Analysis of Compiler Optimizations on the SPEC CPU 2017 Benchmark Suite. In Companion of the ACM/SPEC International Conference on Performance Engineering, pp. 38-41, 2020.
[ bibtex | abstract | DOI ]

The growth of cloud services leads to more and more data centers that are increasingly larger and consume considerable amounts of power. To increase energy efficiency, both the actual server equipment and the software themselves must become more energy-efficient. It is the software that controls the hardware to a considerable degree. In this work-in-progress paper, we present a first analysis of how compiler optimizations can influence energy efficiency. We base our analysis on workloads of the SPEC CPU 2017 benchmark. With 43 benchmarks from different domains, including integer and floating-point heavy computations executed on a state-of-the-art server system for cloud applications, SPEC CPU 2017 offers a representative selection of workloads.
@inproceedings{DBLP:conf/wosp/SchmittBLK20,
  author    = {Norbert Schmitt and
               James Bucek and
               Klaus{-}Dieter Lange and
               Samuel Kounev},
  editor    = {Jos{\'{e}} Nelson Amaral and
               Anne Koziolek and
               Catia Trubiani and
               Alexandru Iosup},
  title     = {Energy Efficiency Analysis of Compiler Optimizations on the {SPEC}
               {CPU} 2017 Benchmark Suite},
  booktitle = {Companion of the 2020 {ACM/SPEC} International Conference on Performance
               Engineering, {ICPE} 2020, Edmonton, AB, Canada, April 20-24, 2020},
  pages     = {38--41},
  publisher = {{ACM}},
  year      = {2020},
  url       = {https://doi.org/10.1145/3375555.3383759},
  doi       = {10.1145/3375555.3383759},
}
    
Jóakim von Kistowski, Klaus-Dieter Lange, Jeremy A. Arnold, John Beckett, Hansfried Block, Mike G. Tricker, Sanjay Sharma, Johann Pais, Samuel Kounev. Measuring and rating the energy-efficiency of servers. In Future Generation Computer Systems, Volume 100, pp. 579-589, 2019.
[ bibtex | abstract | DOI ]
Keywords: Server, Energy-efficiency, Benchmarking, Measurement methodology, Metric.

Data centers and servers consume a significant amount of electrical power. As future data centers increase in size, it becomes increasingly important to be able to select servers based on their energy efficiency. Rating the efficiency of servers is challenging, as it depends on the software executed on that server and on its load profile. To account for this, a measurement and rating methodology for servers must make use of both realistic and varying workloads. Existing energy-efficiency benchmarks either run standardized application loads at multiple load levels or multiple micro-benchmarks at full load. This does not enable a full analysis of system behavior, as the energy efficiency for different kinds of software at low load levels remains unknown. This article introduces a measurement methodology and metrics for energy-efficiency rating of servers. The methodology defines system setup and instrumentation for reproducible results. It is designed to use multiple, specifically chosen workloads at different load levels for a full system characterization. All partial results are aggregated into a final metric that can be used by regulators to define server-labeling criteria. The methodology and workloads have been implemented in the standardized Server Efficiency Rating Tool. We show the applicability and use of the measurement methodology specifically considering its reproducibility, fairness, and relevance. A reproducibility analysis shows that efficiency scores vary with a maximum coefficient of variation of 1.04% for repeated experiments. In addition, we evaluate the proposed metrics by investigating their energy-efficiency rating on a set of 385 different servers and show the relevance of the selected workloads by analyzing relationships to real-world applications.
@article{DBLP:journals/fgcs/KistowskiLABBTS19,
  author    = {J{\'{o}}akim von Kistowski and
               Klaus{-}Dieter Lange and
               Jeremy A. Arnold and
               John Beckett and
               Hansfried Block and
               Michael G. Tricker and
               Sanjay Sharma and
               Johann Pais and
               Samuel Kounev},
  title     = {Measuring and rating the energy-efficiency of servers},
  journal   = {Future Gener. Comput. Syst.},
  volume    = {100},
  pages     = {579--589},
  year      = {2019},
  url       = {https://doi.org/10.1016/j.future.2019.05.050},
  doi       = {10.1016/j.future.2019.05.050},
}
    
Jóakim von Kistowski, Johann Pais, Tobias Wahl, Klaus-Dieter Lange, Hansfried Block, John Beckett, Samuel Kounev. Measuring the Energy Efficiency of Transactional Loads on GPGPU. In Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, pp. 219-230, 2019.
[ bibtex | abstract | DOI ]

General Purpose Graphics Processing Units (GPGPUs) are becoming more and more common in current servers and data centers, which in turn consume a significant amount of electrical power. Measuring and benchmarking this power consumption is important as it helps with optimization and selection of these servers. However, benchmarking and comparing the energy efficiency of GPGPU workloads is challenging, as standardized workloads are rare and standardized power and efficiency measurement methods and metrics do not exist. In addition, not all GPGPU systems run at maximum load all the time. Systems that are utilized in transactional, request-driven workloads, for example, can run at lower utilization levels. Existing benchmarks for GPGPU systems primarily consider performance and are intended only to run at maximum load. They do not measure performance or energy efficiency at other loads. In turn, server energy-efficiency benchmarks that consider multiple load levels do not address GPGPUs. This paper introduces a measurement methodology for servers with GPGPU accelerators that considers multiple load levels for transactional workloads. The methodology also addresses verifiability of results in order to achieve comparability of different device solutions. We analyze our methodology on three different systems with solutions from two different accelerator vendors. We investigate the efficacy of different methods of load-level scaling and our methodology's reproducibility. We show that the methodology is able to produce consistent and reproducible results with a maximum coefficient of variation of 1.4% regarding power consumption.
@inproceedings{DBLP:conf/wosp/KistowskiPWLBBK19,
  author    = {J{\'{o}}akim von Kistowski and
               Johann Pais and
               Tobias Wahl and
               Klaus{-}Dieter Lange and
               Hansfried Block and
               John Beckett and
               Samuel Kounev},
  editor    = {Varsha Apte and
               Antinisca Di Marco and
               Marin Litoiu and
               Jos{\'{e}} Merseguer},
  title     = {Measuring the Energy Efficiency of Transactional Loads on {GPGPU}},
  booktitle = {Proceedings of the 2019 {ACM/SPEC} International Conference on Performance
               Engineering, {ICPE} 2019, Mumbai, India, April 7-11, 2019},
  pages     = {219--230},
  publisher = {{ACM}},
  year      = {2019},
  url       = {https://doi.org/10.1145/3297663.3309667},
  doi       = {10.1145/3297663.3309667},
}
    
Jóakim von Kistowski, Klaus-Dieter Lange, Jeremy A. Arnold, Sanjay Sharma, Johann Pais, Hansfried Block. Measuring and Benchmarking Power Consumption and Energy Efficiency. In Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, pp. 57-65, 2018.
[ bibtex | abstract | DOI ]

Energy efficiency is an important quality of computing systems. Researchers try to analyze, model, and predict the energy efficiency and power consumption of systems. Such research requires energy efficiency and power measurements, as well as measurement methodologies. Many such methodologies exist. However, they do not account for multiple load levels and workload combinations. In this paper, we introduce the SPEC power methodology and the tools implementing this methodology. We discuss the PTDaemon power measurement tool and the Chauffeur power benchmarking framework. We present the SPEC Server Efficiency Rating Tool (SERT), the workloads it contains and introduce the industry-standard compute efficiency benchmark SPECpower_ssj2008. Finally, we show some examples of how the SPEC power tools have been used in research so far.
@inproceedings{DBLP:conf/wosp/KistowskiLASPB18,
  author    = {J{\'{o}}akim von Kistowski and
               Klaus{-}Dieter Lange and
               Jeremy A. Arnold and
               Sanjay Sharma and
               Johann Pais and
               Hansfried Block},
  editor    = {Katinka Wolter and
               William J. Knottenbelt and
               Andr{\'{e}} van Hoorn and
               Manoj Nambiar},
  title     = {Measuring and Benchmarking Power Consumption and Energy Efficiency},
  booktitle = {Companion of the 2018 {ACM/SPEC} International Conference on Performance
               Engineering, {ICPE} 2018, Berlin, Germany, April 09-13, 2018},
  pages     = {57--65},
  publisher = {{ACM}},
  year      = {2018},
  url       = {https://doi.org/10.1145/3185768.3185775},
  doi       = {10.1145/3185768.3185775},
}
    
J. von Kistowski, K.-D. Lange, J. A. Arnold, H. Block, G. Darnell, J. Beckett, M. Tricker. The SERT 2 Metric and the Impact of Server Configuration. SPEC Technical Report, September 2017; last update: 31 March 2021.
[ abstract | pdf ]

The SERT Suite is an industry standard for measuring and analyzing the energy efficiency of servers. It measures server efficiency using multiple workloads, which in turn consist of small scale mini-workloads called worklets. Using multiple worklets enables the SERT suite to holistically explore the behavior of many different systems and enables thorough analysis. However, multiple workloads also result in multiple energy efficiency scores. This document introduces the single SERT 2 metric (SERT 2 Efficiency Score), which can be used to easily compare systems using a single number. This document explains how the SERT 2 metric is calculated. It also illustrates how a system under test (SUT) configuration and changes to this configuration can impact the SERT 2 Efficiency Score and demonstrates this using a running example.
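
As a sketch of the kind of hierarchical aggregation the report describes, the following Python snippet combines per-worklet efficiency values into workload scores and a single overall number using (weighted) geometric means. The efficiency values and workload weights shown are placeholders; the authoritative definition of the SERT 2 Efficiency Score, including its normalization and weights, is given in the report itself.

# Sketch of hierarchical score aggregation in the spirit of the SERT 2
# Efficiency Score: per-worklet efficiencies roll up into workload scores,
# which roll up into one overall number. Values and weights are placeholders.
from math import prod

def geomean(values):
    return prod(values) ** (1.0 / len(values))

def weighted_geomean(scores, weights):
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return prod(scores[w] ** weights[w] for w in scores)

worklet_efficiency = {            # hypothetical per-worklet efficiency values
    "CPU":     [12.0, 15.0, 9.5, 11.0],
    "Memory":  [7.5, 8.2],
    "Storage": [3.1, 2.8],
}
workload_weights = {"CPU": 0.65, "Memory": 0.30, "Storage": 0.05}  # placeholder weights

workload_scores = {w: geomean(v) for w, v in worklet_efficiency.items()}
overall = weighted_geomean(workload_scores, workload_weights)
for w, s in workload_scores.items():
    print(f"{w} workload score: {s:.2f}")
print(f"overall efficiency score: {overall:.2f}")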
    
Jóakim von Kistowski, Maximilian Deffner, Jeremy A. Arnold, Klaus-Dieter Lange, John Beckett, and Samuel Kounev. Autopilot: Enabling easy Benchmarking of Workload Energy Efficiency (Demonstration Paper). In Proceedings of the 8th ACM/SPEC International Conference on Performance Engineering (ICPE 2017), L'Aquila, Italy, April 2017. ACM, New York, NY, USA. Best Demo Award.
[ bibtex | abstract | pdf ]
Keywords: Benchmarking, Power, Variation, SPEC, Workloads, Energy Efficiency, Load level, Deployment, Development.

Benchmarking of energy efficiency is important as it helps researchers, customers, and developers to evaluate and compare the energy efficiency of software and hardware solutions. Developing and deploying energy-efficiency benchmarking workloads are challenging tasks, as workloads must be able to execute in a power measurement environment using an energy-efficiency measurement methodology. The existing SPEC Chauffeur Worklet Development Kit (WDK) enables the development and use of custom workloads (called worklets) within a standardized power measurement methodology. However, it features no integration with development environments, making building and deployment of workloads challenging. We address this challenge by proposing Autopilot, a plugin for the Eclipse IDE. Autopilot enables fast and easy building and deployment of a workload under development on a system for testing. It also enables benchmark execution directly from the development environment.
@inproceedings{KiDeArLaBeKo2017-ICPE-Autopilot,
  author = {J{\'o}akim von Kistowski and Maximilian Deffner and Jeremy A. Arnold and Klaus-Dieter Lange and John Beckett and Samuel Kounev},
  title = {{Autopilot: Enabling easy Benchmarking of Workload Energy Efficiency}},
  titleaddon = {{(Demonstration Paper)}},
  year = {2017},
  booktitle = {Proceedings of the 8th ACM/SPEC International Conference on Performance Engineering (ICPE 2017)},
  location = {L'Aquila, Italy},
  month = {April},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {Benchmarking, Power, Variation, SPEC, Workloads, Energy Efficiency, Load level, Deployment, Development},
  note = {Best Demo Award},
  pdf = {https://se2.informatik.uni-wuerzburg.de/pa/publications/download/paper/1171.pdf}
}
						
Jóakim von Kistowski, Hansfried Block, John Beckett, Cloyce Spradling, Klaus-Dieter Lange, and Samuel Kounev. Variations in CPU Power Consumption. In Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE 2016), Delft, the Netherlands, March 2016. ACM, New York, NY, USA.
[ bibtex | abstract | pdf ]
Keywords: Benchmarking, CPU, Power, Variation, SPEC, SERT, Workloads, Energy Efficiency, Metrics, Load level, Utilization.

Experimental analysis of computer systems' power consumption has become an integral part of system performance evaluation, efficiency management, and model-based analysis. As with all measurements, repeatability and reproducibility of power measurements is a major challenge. Nominally identical systems can have different power consumption running the same workload under otherwise identical conditions. This behavior can also be observed for individual system components. Specifically, CPU power consumption can vary amongst different samples of nominally identical CPUs. This in turn has a significant impact on the overall system power, considering that a system's processor is the largest and most dynamic power consumer of the overall system. The concrete impact of CPU sample power variations is unknown, as comprehensive studies about differences in power consumption for nominally identical systems are currently missing. We address this lack of studies by conducting measurements on four different processor types from two different architectures. For each of these types, we compare up to 30 physical processor samples with a total sum of 90 samples over all processor types. We analyze the variations in power consumption for the different samples using six different workloads over five load levels. Additionally, we analyze how these variations change for different processor core counts and architectures. The results of this paper show that selection of a processor sample can have a statistically significant impact on power consumption. With no correlation to performance, power consumption for nominally identical processors can differ as much as 29.6% in idle and 19.5% at full load. We also show that these variations change over different architectures and processor types.
@inproceedings{KiBlBeSpLaKo2016-ICPE-PowerVariation,
author = {J\'{o}akim von Kistowski and Hansfried Block and John Beckett and Cloyce Spradling and Klaus-Dieter Lange and Samuel Kounev},
abstract = {{Experimental analysis of computer systems' power consumption has become an integral part of system performance evaluation, efficiency management, and model-based analysis. As with all measurements, repeatability and reproducibility of power measurements is a major challenge. Nominally identical systems can have different power consumption running the same workload under otherwise identical conditions. This behavior can also be observed for individual system components. Specifically, CPU power consumption can vary amongst different samples of nominally identical CPUs. This in turn has a significant impact on the overall system power, considering that a system's processor is the largest and most dynamic power consumer of the overall system. The concrete impact of CPU sample power variations is unknown, as comprehensive studies about differences in power consumption for nominally identical systems are currently missing. We address this lack of studies by conducting measurements on four different processor types from two different architectures. For each of these types, we compare up to 30 physical processor samples with a total sum of 90 samples over all processor types. We analyze the variations in power consumption for the different samples using six different workloads over five load levels. Additionally, we analyze how these variations change for different processor core counts and architectures. The results of this paper show that selection of a processor sample can have a statistically significant impact on power consumption. With no correlation to performance, power consumption for nominally identical processors can differ as much as 29.6\% in idle and 19.5\% at full load. We also show that these variations change over different architectures and processor types.}},
title = {{Variations in CPU Power Consumption}},
year = {2016},
booktitle = {Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE 2016)},
location = {Delft, the Netherlands},
month = {March},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Benchmarking, CPU, Power, Variation, SPEC, SERT, Workloads, Energy Efficiency, Metrics, Load level, Utilization},
doi = {http://dx.doi.org/10.1145/2851553.2851567},
slides = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/slides/911},
pdf = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/paper/911.pdf}
}
						
Jóakim von Kistowski, John Beckett, Klaus-Dieter Lange, Hansfried Block, Jeremy A. Arnold, and Samuel Kounev. Energy Efficiency of Hierarchical Server Load Distribution Strategies. In Proceedings of the IEEE 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2015), Atlanta, GA, USA, October 5-7, 2015. IEEE.
[ bibtex | abstract | pdf ]
Keywords: SPEC, SERT, Power, Benchmarking, Workload, Energy Efficiency, Metrics, Utilization, Load level.

Energy efficiency of servers has become a significant issue over the last years. Load distribution plays a crucial role in the improvement of energy efficiency as (un-)balancing strategies can be leveraged to distribute load over one or multiple systems in a way in which resources are utilized at high performance, yet low overall power consumption. This can be achieved on multiple levels, from load distribution on single CPU cores to machine level load balancing on distributed systems. With modern day server architectures providing load balancing opportunities at several layers, answering the question of optimal load distribution has become non-trivial. Work has to be distributed hierarchically in a fashion that enables maximum energy efficiency at each level. Current approaches balance load based on generalized assumptions about the energy efficiency of servers. These assumptions are based either on very machine-specific or highly generalized observations that may or may not hold true over a variety of systems and configurations. In this paper, we use a modified version of the SPEC SERT suite to measure the energy efficiency of a variety of hierarchical load distribution strategies on single and multi-node systems. We introduce a new strategy and evaluate energy efficiency for homogeneous and heterogeneous workloads over different hardware configurations. Our results show that the selection of a load distribution strategy depends heavily on workload, system utilization, as well as hardware. Used in conjunction with existing strategies, our new load distribution strategy can reduce a single system's power consumption by up to 10.7%.
@inproceedings{KiBeLaBlArKo2015-MASCOTS,
author = {J\'{o}akim von Kistowski and John Beckett and Klaus-Dieter Lange and Hansfried Block and Jeremy A. Arnold and Samuel Kounev},
abstract = {{Energy efficiency of servers has become a significant issue over the last years. Load distribution plays a crucial role in the improvement of energy efficiency as (un-)balancing strategies can be leveraged to distribute load over one or multiple systems in a way in which resources are utilized at high performance, yet low overall power consumption. This can be achieved on multiple levels, from load distribution on single CPU cores to machine level load balancing on distributed systems. With modern day server architectures providing load balancing opportunities at several layers, answering the question of optimal load distribution has become non-trivial. Work has to be distributed hierarchically in a fashion that enables maximum energy efficiency at each level. Current approaches balance load based on generalized assumptions about the energy efficiency of servers. These assumptions are based either on very machine-specific or highly generalized observations that may or may not hold true over a variety of systems and configurations. In this paper, we use a modified version of the SPEC SERT suite to measure the energy efficiency of a variety of hierarchical load distribution strategies on single and multi-node systems. We introduce a new strategy and evaluate energy efficiency for homogeneous and heterogeneous workloads over different hardware configurations. Our results show that the selection of a load distribution strategy depends heavily on workload, system utilization, as well as hardware. Used in conjunction with existing strategies, our new load distribution strategy can reduce a single system's power consumption by up to 10.7%.}},
title = {{Energy Efficiency of Hierarchical Server Load Distribution Strategies}},
year = {2015},
booktitle = {Proceedings of the IEEE 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2015)},
location = {Atlanta, GA, USA},
month = {October},
day = {5--7},
publisher = {IEEE},
keywords = {SPEC, SERT, Power, Benchmarking, Workload, Energy Efficiency, Metrics, Utilization, Load level},
note = {Full paper acceptance rate: 19\%},
pdf = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/paper/878.pdf},
slides = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/slides/878},
doi = {http://dx.doi.org/10.1109/MASCOTS.2015.11}
}
						
Jóakim von Kistowski, Jeremy A. Arnold, Karl Huppler, Klaus-Dieter Lange, John L. Henning, and Paul Cao. How to Build a Benchmark. In Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (ICPE 2015), Austin, TX, USA, February 2015, ICPE '15. ACM, New York, NY, USA.
[ bibtex | abstract | pdf ]
Keywords: SPEC, TPC, SPECpower_ssj2008, SERT, SPEC CPU.

Standardized benchmarks have become widely accepted tools for the comparison of products and evaluation of methodologies. These benchmarks are created by consortia like SPEC and TPC under confidentiality agreements which provide little opportunity for outside observers to get a look at the processes and concerns that are prevalent in benchmark development. This paper introduces the primary concerns of benchmark development from the perspectives of SPEC and TPC committees. We provide a benchmark definition, outline the types of benchmarks, and explain the characteristics of a good benchmark. We focus on the characteristics important for a standardized benchmark, as created by the SPEC and TPC consortia. To this end, we specify the primary criteria to be employed for benchmark design and workload selection. We use multiple standardized benchmarks as examples to demonstrate how these criteria are ensured.
@inproceedings{KiArHuLaHeCa2015-ICPE-Benchmark,
author = {J\'{o}akim von Kistowski and Jeremy A. Arnold and Karl Huppler and Klaus-Dieter Lange and John L. Henning and Paul Cao},
abstract = {{Standardized benchmarks have become widely accepted tools for the comparison of products and evaluation of methodologies. These benchmarks are created by consortia like SPEC and TPC under confidentiality agreements which provide little opportunity for outside observers to get a look at the processes and concerns that are prevalent in benchmark development. This paper introduces the primary concerns of benchmark development from the perspectives of SPEC and TPC committees. We provide a benchmark definition, outline the types of benchmarks, and explain the characteristics of a good benchmark. We focus on the characteristics important for a standardized benchmark, as created by the SPEC and TPC consortia. To this end, we specify the primary criteria to be employed for benchmark design and workload selection. We use multiple standardized benchmarks as examples to demonstrate how these criteria are ensured.}},
title = {{How to Build a Benchmark}},
year = {2015},
booktitle = {Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (ICPE 2015)},
location = {Austin, TX, USA},
month = {February},
publisher = {ACM},
series = {ICPE '15},
doi = {http://dx.doi.org/10.1145/2668930.2688819},
address = {New York, NY, USA},
keywords = {SPEC; TPC; SPECpower\_ssj2008; SERT; SPEC CPU},
pdf = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/paper/773.pdf}
}
						
Jóakim von Kistowski, Hansfried Block, John Beckett, Klaus-Dieter Lange, Jeremy A. Arnold, and Samuel Kounev. Analysis of the Influences on Server Power Consumption and Energy Efficiency for CPU-Intensive Workloads. In Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (ICPE 2015), Austin, TX, USA, February 2015, ICPE '15. ACM, New York, NY, USA.
[ bibtex | abstract | pdf ]
Keywords: SPEC, SERT, Power, Workload Characterization, Energy Efficiency, Metrics, Utilization.

Energy efficiency of servers has become a significant research topic over the last years, as server energy consumption varies depending on multiple factors, such as server utilization and workload type. Server energy analysis and estimation must take all relevant factors into account to ensure reliable estimates and conclusions. Thorough system analysis requires benchmarks capable of testing different system resources at different load levels using multiple workload types. Server energy estimation approaches, on the other hand, require knowledge about the interactions of these factors for the creation of accurate power models. Common approaches to energy-aware workload classification classify workloads depending on the resource types used by the different workloads. However, they rarely take into account differences in workloads targeting the same resources. Industrial energy-efficiency benchmarks typically do not evaluate the system's energy consumption at different resource load levels, and they only provide data for system analysis at maximum system load. In this paper, we benchmark multiple server configurations using the CPU worklets included in SPEC's Server Efficiency Rating Tool (SERT). We evaluate the impact of load levels and different CPU workloads on power consumption and energy efficiency. We analyze how functions approximating the measured power consumption differ over multiple server configurations and architectures. We show that workloads targeting the same resource can differ significantly in their power draw and energy efficiency. The power consumption of a given workload type varies depending on utilization, hardware and software configuration. The power consumption of CPU-intensive workloads does not scale uniformly with increased load, nor do hardware or software configuration changes affect it in a uniform manner.
@inproceedings{KiBlBeLaArKo2015-ICPE-SERT,
author = {J\'{o}akim von Kistowski and Hansfried Block and John Beckett and Klaus-Dieter Lange and Jeremy A. Arnold and Samuel Kounev},
abstract = {{ Energy efficiency of servers has become a significant research topic over the last years, as server energy consumption varies depending on multiple factors, such as server utilization and workload type. Server energy analysis and estimation must take all relevant factors into account to ensure reliable estimates and conclusions. Thorough system analysis requires benchmarks capable of testing different system resources at different load levels using multiple workload types. Server energy estimation approaches, on the other hand, require knowledge about the interactions of these factors for the creation of accurate power models. Common approaches to energy-aware workload classification classify workloads depending on the resource types used by the different workloads. However, they rarely take into account differences in workloads targeting the same resources. Industrial energy-efficiency benchmarks typically do not evaluate the system's energy consumption at different resource load levels, and they only provide data for system analysis at maximum system load. In this paper, we benchmark multiple server configurations using the CPU worklets included in SPEC's Server Efficiency Rating Tool (SERT). We evaluate the impact of load levels and different CPU workloads on power consumption and energy efficiency. We analyze how functions approximating the measured power consumption differ over multiple server configurations and architectures. We show that workloads targeting the same resource can differ significantly in their power draw and energy efficiency. The power consumption of a given workload type varies depending on utilization, hardware and software configuration. The power consumption of CPU-intensive workloads does not scale uniformly with increased load, nor do hardware or software configuration changes affect it in a uniform manner.}},
title = {{Analysis of the Influences on Server Power Consumption and Energy Efficiency for CPU-Intensive Workloads}},
year = {2015},
booktitle = {Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (ICPE 2015)},
location = {Austin, TX, USA},
month = {February},
publisher = {ACM},
series = {ICPE '15},
doi = {http://dx.doi.org/10.1145/2668930.2688057},
address = {New York, NY, USA},
keywords = {SPEC, SERT, Power, Workload Characterization, Energy Efficiency, Metrics, Utilization},
pdf = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/paper/772.pdf},
slides = {http://se2.informatik.uni-wuerzburg.de/pa/publications/download/slides/772},
note = {acceptance rate: 27\%}
}
						
Klaus-Dieter Lange, Jeremy A. Arnold, Hansfried Block, Nathan Totura, John Beckett, and Mike G. Tricker. Further Implementation Aspects of the Server Efficiency Rating Tool (SERT). In Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering, ICPE '13, New York, NY, USA, 2013. ACM.
[ bibtex | abstract ]
Keywords: Affinitization, Benchmark, Energy Efficiency, Energy Star, Environment Protection Agency (EPA), Framework, Memory, Performance Engineering, Reporting, Server, SPEC, System Discovery, System Performance.

The Server Efficiency Rating Tool (SERT) has been developed by the Standard Performance Evaluation Corporation (SPEC) at the request of the US Environmental Protection Agency (EPA). Almost 3% of all electricity consumed within the US in 2010 went to running datacenters. With this in mind, the EPA released Version 2.0 of the ENERGY STAR for Computer Servers program in early 2013 to include the mandatory use of the SERT. Other governments world-wide that are also concerned by growing power consumption of servers and datacenters are considering the adoption of the SERT.
@inproceedings{Lange:2013:IAS:2479871.2479926,
 author = {Lange, K.-D. and Arnold, Jeremy A. and Block, Hansfried and Totura, Nathan and Beckett, John and Tricker, Mike G.},
 title = {{Further Implementation Aspects of the Server Efficiency Rating Tool (SERT)}},
 booktitle = {Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering},
 series = {ICPE '13},
 year = {2013},
 isbn = {978-1-4503-1636-1},
 location = {Prague, Czech Republic},
 pages = {349--360},
 numpages = {12},
 url = {http://doi.acm.org/10.1145/2479871.2479926},
 doi = {10.1145/2479871.2479926},
 acmid = {2479926},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Affinitization, Benchmark, Energy Efficiency, Energy Star, Environment Protection Agency (EPA), Framework, Memory, Performance Engineering, Reporting, Server, SPEC, System Discovery, System Performance},
} 
						
Klaus-Dieter Lange, Mike G. Tricker, Jeremy A. Arnold, Hansfried Block, and Christian Koopmann. The Implementation of the Server Efficiency Rating Tool. In Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering, ICPE '12, New York, NY, USA, 2012. ACM.
[ bibtex | abstract ]
Keywords: ENERGY STAR, EPA, SERT, SPEC, Benchmark, Datacenter, Energy Efficiency, Environmental Protection Agency, Power, Rating Tool, Server, Storage.

The Server Efficiency Rating Tool (SERT) has been developed by the Standard Performance Evaluation Corporation (SPEC) at the request of the US Environmental Protection Agency (EPA), prompted by concerns that US datacenters consumed almost 3% of all energy in 2010. Since the majority was consumed by servers and their associated heat dissipation systems, the EPA launched the ENERGY STAR Computer Server program, focusing on providing projected power consumption information to aid potential server users and purchasers. This program has now been extended to a world-wide audience. This paper expands upon the one published in 2011, which described the initial design and early development phases of the SERT. Since that publication, the SERT has continued to evolve and entered its first Beta phase in October 2011 with the goal of being released in 2012. This paper describes more of the details of how the SERT is structured. This includes how components interrelate, how the underlying system capabilities are discovered, and how the various hardware subsystems are measured individually using dedicated worklets.
@inproceedings{Lange:2012:ISE:2188286.2188307,
 author = {Lange, Klaus-Dieter and Tricker, Mike G. and Arnold, Jeremy A. and Block, Hansfried and Koopmann, Christian},
 title = {{The Implementation of the Server Efficiency Rating Tool}},
 booktitle = {Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering},
 series = {ICPE '12},
 year = {2012},
 isbn = {978-1-4503-1202-8},
 location = {Boston, Massachusetts, USA},
 pages = {133--144},
 numpages = {12},
 url = {http://doi.acm.org/10.1145/2188286.2188307},
 doi = {10.1145/2188286.2188307},
 acmid = {2188307},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {ENERGY STAR, EPA, SERT, SPEC, Benchmark, Datacenter, Energy Efficiency, Environmental Protection Agency, Power, Rating Tool, Server, Storage},
} 
						
Klaus-Dieter Lange and Mike G. Tricker. The Design and Development of the Server Efficiency Rating Tool (SERT). In Proceedings of the 2nd ACM/SPEC International Conference on Performance Engineering, ICPE '11, New York, NY, USA, 2011. ACM.
[ bibtex | abstract ]
Keywords: SPEC, Benchmark, Energy Efficiency, Power Analysis, Server, Datacenter, Energy Star, Environmental Protection Agency (EPA).

According to the United States Environmental Protection Agency (US EPA), almost 3% of all electricity consumed within the US in 2010 goes to running datacenters, with the majority of that powering servers and the associated air conditioning systems dedicated to eliminating the heat they produce. The EPA launched the ENERGY STAR® Computer Server program in May 2009, intended to deliver information to better enable server purchasing decisions based on projected power consumption. The Server Efficiency Rating Tool (SERT) has been developed by the Standard Performance Evaluation Corporation (SPEC) SPECpower committee to address the EPA requirements for Version 2 of the ENERGY STAR server program. Unlike many tools sourced from the SPEC organization, the SERT is not intended to be a benchmark, and for Version 2 does not offer a single score model. Instead it produces detailed information regarding the influence of CPU, memory, network and storage I/O configurations on overall server power consumption. This paper describes the design and development of the SERT, including discussion of the collaborative nature of working with the EPA and the various industry stakeholders involved in the design, review and development process. Many of the core ideas behind SERT were derived from the SPECpower_ssj2008 and other SPEC-developed benchmarks, and this paper illustrates where ideas and code were shared, as well as where new thinking resulted in entirely new solutions. It also includes thoughts for the future, as the ENERGY STAR server program continues to evolve and the SERT will evolve with it.
@inproceedings{Lange:2011:DDS:1958746.1958769,
 author = {Lange, K.-D. and Tricker, Michael G.},
 title = {{The Design and Development of the Server Efficiency Rating Tool (SERT)}},
booktitle = {Proceedings of the 2nd ACM/SPEC International Conference on Performance Engineering},
 series = {ICPE '11},
 year = {2011},
 isbn = {978-1-4503-0519-8},
 location = {Karlsruhe, Germany},
 pages = {145--150},
 numpages = {6},
 url = {http://doi.acm.org/10.1145/1958746.1958769},
 doi = {10.1145/1958746.1958769},
 acmid = {1958769},
 publisher = {ACM},
 address = {New York, NY, USA},
}
						
Klaus-Dieter Lange. Identifying Shades of Green: The SPECpower Benchmarks. Computer, 42(3):95-97, March 2009.
[ bibtex | abstract ]
Keywords: Benchmark, Performance Evaluation, Energy Efficiency, Industry-Standard SPECpower Benchmark, Measurement Standards, Power Measurement, SPECpower_ssj2008, Green IT.

To drive energy efficiency initiatives, SPEC established SPECpower_ssj2008, the first industry-standard benchmark for measuring power and performance characteristics of computer systems.
@ARTICLE{4803904, 
author={Lange, K.-D.}, 
journal={Computer}, 
title={{Identifying Shades of Green: The SPECpower Benchmarks}}, 
year={2009}, 
month={March}, 
volume={42}, 
number={3}, 
pages={95-97}, 
keywords={benchmark testing;performance evaluation;computer system performance measurement;computer system power measurement;energy-efficiency initiative;industry-standard SPECpower benchmark;Benchmark testing;Computer industry;Energy efficiency;Energy management;Energy measurement;Measurement standards;Personal communication networks;Power engineering computing;Power measurement;Proposals;SPECpower_ssj2008;benchmarks;green IT}, 
doi={10.1109/MC.2009.84}, 
ISSN={0018-9162},
}