Thursday, 23 November 2017

Group Publications

The publications endorsed by SPEC RG are available at the publication page.

Affiliated Publications

In the following, we list a selection of relevant publications by members of the RG Cloud working group that are not formally endorsed by SPEC.
J. v. Kistowski, N. Herbst, S. Kounev, H. Groenda, C. Stier, and S. Lehrig, Modeling and Extracting Load Intensity Profiles. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 11(4):23:1--23:28, January 2017, ACM, New York, NY, USA. [ bibtex | abstract | pdf ]
Keywords: Load Intensity Variation, Load Profile, Meta-Modeling, Model Extraction, Open Workloads, Transformation.

Today’s system developers and operators face the challenge of creating software systems that make efficient use of dynamically allocated resources under highly variable and dynamic load profiles, while at the same time delivering reliable performance. Autonomic controllers, for example, an advanced autoscaling mechanism in a cloud computing context, can benefit from an abstracted load model as knowledge to reconfigure in a timely and precise manner. Existing workload characterization approaches have limited support for capturing variations in the interarrival times of incoming work units over time (i.e., a variable load profile). For example, industrial and scientific benchmarks support constant or stepwise increasing load, or interarrival times defined by statistical distributions or recorded traces. These options show shortcomings either in the representative character of their load variation patterns or in the abstraction and flexibility of their format. In this article, we present the Descartes Load Intensity Model (DLIM) approach addressing these issues. DLIM provides a modeling formalism for describing load intensity variations over time. A DLIM instance is a compact formal description of a load intensity trace. DLIM-based tools provide features for benchmarking, performance analysis, and recorded load intensity trace analysis. As manually obtaining and maintaining DLIM instances becomes time consuming, we contribute three automated extraction methods and devise metrics for comparison and method selection. We discuss how these features are used to enhance system management approaches for adaptations during runtime, and how they are integrated into simulation contexts and enable benchmarking of elastic or adaptive behavior. We show that automatically extracted DLIM instances exhibit an average modeling error of 15.2% over 10 different real-world traces that cover between 2 weeks and 7 months. These results underline DLIM model expressiveness. In terms of accuracy and processing speed, our proposed extraction methods for the descriptive models are comparable to existing time series decomposition methods. Additionally, we illustrate DLIM applicability by outlining approaches to workload modeling in systems engineering that employ or rely on our proposed load intensity modeling formalism.
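To give a flavor of what a load intensity profile looks like, the sketch below composes a sinusoidal seasonal pattern with a linear trend, loosely in the spirit of DLIM's combinable seasonal/trend parts. The function name and all parameters are illustrative assumptions, not the DLIM meta-model or its tooling.

```python
import math

def load_intensity(t, base=100.0, seasonal_amplitude=50.0,
                   period=24.0, trend_per_unit=0.5):
    """Illustrative load-intensity function: a sinusoidal seasonal
    pattern combined additively with a linear trend. All parameters
    are hypothetical; DLIM composes such parts in a meta-model."""
    seasonal = seasonal_amplitude * math.sin(2 * math.pi * t / period)
    trend = trend_per_unit * t
    # Arrival rates cannot be negative, so clamp at zero.
    return max(0.0, base + seasonal + trend)

# A day-long profile sampled hourly, e.g. as input to a load driver:
profile = [load_intensity(h) for h in range(24)]
```

A real DLIM instance would describe such a profile declaratively, so that tools can transform, compare, or replay it rather than hard-coding the arithmetic.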
@article{KiHeKo-TAAS17-ModelExtractLoadProfiles,
  author = {J{\'o}akim von Kistowski and Nikolas Herbst and Samuel Kounev and Henning Groenda and Christian Stier and Sebastian Lehrig},
  title = {{Modeling and Extracting Load Intensity Profiles}},
  journal = {ACM Transactions on Autonomous and Adaptive Systems (TAAS)},
  issue_date = {January 2017},
  volume = {11},
  number = {4},
  month = {January},
  year = {2017},
  issn = {1556-4665},
  pages = {23:1--23:28},
  articleno = {23},
  numpages = {28},
  url = {http://doi.acm.org/10.1145/3019596},
  doi = {10.1145/3019596},
  acmid = {3019596},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {Load intensity variation, load profile, metamodeling, model extraction, open workloads, transformation},
}
						
N. Herbst, S. Kounev, A. Weber, and H. Groenda, BUNGEE: An Elasticity Benchmark for Self-Adaptive IaaS Cloud Environments. In Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2015), Firenze, Italy, May 18-19, 2015. [ bibtex | abstract | pdf ]
Keywords: Infrastructure-as-a-Service, Cloud, Elasticity, Benchmark, Metrics, Measurement.

Today's infrastructure clouds provide resource elasticity (i.e., auto-scaling) mechanisms enabling self-adaptive resource provisioning to reflect variations in the load intensity over time. These mechanisms impact application performance; however, their effect in specific situations is hard to quantify and compare. To evaluate the quality of elasticity mechanisms provided by different platforms and configurations, respective metrics and benchmarks are required. Existing metrics for elasticity only consider the time required to provision and deprovision resources or the cost impact of adaptations. Existing benchmarks lack the capability to handle open workloads with realistic load intensity profiles and do not explicitly distinguish between the performance exhibited by the provisioned underlying resources, on the one hand, and the quality of the elasticity mechanisms themselves, on the other hand. In this paper, we propose reliable metrics for quantifying the timing aspects and accuracy of elasticity. Based on these metrics, we propose a novel approach for benchmarking the elasticity of Infrastructure-as-a-Service (IaaS) cloud platforms independent of the performance exhibited by the provisioned underlying resources. We show that the proposed metrics provide a consistent ranking of elastic platforms on an ordinal scale. Finally, we present an extensive case study of real-world complexity demonstrating that the proposed approach is applicable in realistic scenarios and can cope with different levels of resource efficiency.
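As a rough illustration of accuracy-style elasticity metrics of the kind this line of work proposes, the sketch below averages the resource units missing (under-provisioning) and in excess (over-provisioning) per measurement step. This is a simplified assumption, not the paper's exact time-weighted metric definitions.

```python
def provisioning_accuracy(demanded, supplied):
    """Average resource units missing (under-provisioning) and in
    excess (over-provisioning) per measurement step. A simplified
    sketch of accuracy-style elasticity metrics; the published
    definitions are time-weighted and normalized differently."""
    n = len(demanded)
    under = sum(max(d - s, 0) for d, s in zip(demanded, supplied)) / n
    over = sum(max(s - d, 0) for d, s in zip(demanded, supplied)) / n
    return under, over

# Demanded vs. supplied VMs over four steps: one step under, one over.
under, over = provisioning_accuracy([2, 3, 4, 3], [2, 4, 3, 3])
```

Separating the under- and over-provisioning components matters because the two failure modes have different consequences: the former degrades user-visible performance, the latter wastes money.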
@inproceedings{Herbst2015SEAMS,
	author = {N. Herbst and S. Kounev and A. Weber and H. Groenda},
	title = {{BUNGEE: An Elasticity Benchmark for Self-Adaptive IaaS Cloud Environments}},
	keywords = {Infrastructure-as-a-Service, Cloud, Elasticity, Benchmark, Metrics, Measurement},
	booktitle = {Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2015)},
	day = {18--19},
	location = {Firenze, Italy},
	month = {May},
	year = {2015},
	pdf = {\url{http://se2.informatik.uni-wuerzburg.de/pa/uploads/papers/paper-782.pdf}}
}
						
S. Shen, V. van Beek, and A. Iosup, Statistical Characterization of Business-Critical Workloads Hosted in Cloud Datacenters. In the IEEE/ACM CCGRID 2015 conference, Shenzhen, Guangdong, China, May 4-7, 2015. [ bibtex | abstract | pdf ]
Keywords: Business-critical Workloads, Datacenter Workloads, Cloud Workloads, Workload Characterization, Basic Statistics, Time Patterns, Correlation Study, CPU Workloads, Memory Workloads, Network Workloads, Disk Workloads, Grid Workloads Archive.

Business-critical workloads—web servers, mail servers, app servers, etc.—are increasingly hosted in virtualized datacenters acting as Infrastructure-as-a-Service clouds (cloud datacenters). Understanding how business-critical workloads demand and use resources is key in capacity sizing, in infrastructure operation and testing, and in application performance management. However, relatively little is currently known about these workloads, because the information is complex — large-scale, heterogeneous, shared-clusters — and because datacenter operators remain reluctant to share such information. Moreover, the few operators that have shared data (e.g., Google and several supercomputing centers) have enabled studies in business intelligence (MapReduce), search, and scientific computing (HPC), but not in business-critical workloads. To alleviate this situation, in this work we conduct a comprehensive study of business-critical workloads hosted in cloud datacenters. We collect two large-scale and long-term workload traces corresponding to requested and actually used resources in a distributed datacenter servicing business-critical workloads. We perform an in-depth analysis of these workload traces. Our study sheds light on the workload of cloud datacenters hosting business-critical workloads. The results of this work can be used as a basis to develop efficient resource management mechanisms for datacenters. Moreover, the traces we released in this work can be used for workload verification and modeling, and for evaluating resource scheduling policies.
@inproceedings{shen2015CCGRID,
	Author = {S. Shen and V. van Beek and A. Iosup},
	Title = {Statistical Characterization of Business-Critical Workloads Hosted in Cloud Datacenters},
	Booktitle = {IEEE/ACM CCGRID 2015 Conference, Shenzhen, Guangdong, China, May 4-7, 2015},
	Keywords = {Business-critical Workloads, Datacenter Workloads, Cloud Workloads, Workload Characterization, Basic Statistics, Time Patterns, Correlation Study, CPU Workloads, Memory Workloads, Network Workloads, Disk Workloads, Grid Workloads Archive},
	Month = {May},
	Year = {2015},
	Pdf = {\url{http://www.pds.ewi.tudelft.nl/~iosup/business-critical-datacenter-workloads15ccgrid.pdf}}
}
						
A. Alexandrov, E. Folkerts, K. Sachs, A. Iosup, V. Markl, and C. Tosun, Benchmarking in the Cloud: What it Should, Can, and Cannot Be. In 4th TPC Technology Conference on Performance Evaluation & Benchmarking (TPCTC 2012), held in conjunction with VLDB, Istanbul, Turkey, Aug 2012. [ bibtex | abstract | pdf ]
Keywords: Benchmarking, Cloud Computing, Performance Evaluation, Concepts, Discussion.

With the increasing adoption of Cloud Computing, we observe an increasing need for Cloud Benchmarks, in order to assess the performance of Cloud infrastructures and software stacks, to assist with provisioning decisions for Cloud users, and to compare Cloud offerings. We understand our paper as one of the first systematic approaches to the topic of Cloud Benchmarks. Our driving principle is that Cloud Benchmarks must consider end-to-end performance and pricing, taking into account that services are delivered over the Internet. This requirement yields new challenges for benchmarking and requires us to revisit existing benchmarking practices in order to adapt them to the Cloud.
@inproceedings{alexandrov2012TPCTC,
	Author = {A. Alexandrov and E. Folkerts and K. Sachs and A. Iosup and V. Markl and C. Tosun},
	Title = {Benchmarking in the Cloud: What it Should, Can, and Cannot Be},
	Booktitle = {4th TPC Technology Conference on Performance Evaluation \& Benchmarking (TPCTC 2012), held in conjunction with VLDB, Istanbul, Turkey, Aug 2012},
	Keywords = {Benchmarking, Cloud Computing, Performance Evaluation, Concepts, Discussion},
	Month = {Aug},
	Year = {2012},
	Pdf = {\url{http://www.dvs.tu-darmstadt.de/publications/pdf/cloud-benchmarking-what12tpctc-vldb.pdf}}
}
						
N. Herbst, S. Kounev, and R. Reussner. Elasticity in Cloud Computing: What it is, and What it is Not. In Proceedings of the 10th International Conference on Autonomic Computing (ICAC 2013), San Jose, CA, June 24-28, 2013, USENIX. [ bibtex | abstract | pdf | slides | poster ]
Keywords: Benchmarking, Cloud Computing, Elasticity, Performance Evaluation, Concepts, Metrics, Discussion.

Originating from the field of physics and economics, the term elasticity is nowadays heavily used in the context of cloud computing. In this context, elasticity is commonly understood as the ability of a system to automatically provision and de-provision computing resources on demand as workloads change. However, elasticity still lacks a precise definition as well as representative metrics coupled with a benchmarking methodology to enable comparability of systems. Existing definitions of elasticity are largely inconsistent and unspecific, leading to confusion in the use of the term and its differentiation from related terms such as scalability and efficiency; the proposed measurement methodologies do not provide means to quantify elasticity without mixing it with efficiency or scalability aspects. In this short paper, we propose a precise definition of elasticity and analyze its core properties and requirements, explicitly distinguishing it from related terms such as scalability, efficiency, and agility. Furthermore, we present a set of appropriate elasticity metrics and sketch a new elasticity-tailored benchmarking methodology addressing the special requirements on workload design and calibration.
@inproceedings{HeKoRe2013-ICAC-Elasticity,
  author = {Nikolas Roman Herbst and Samuel Kounev and Ralf Reussner},
  title = {{Elasticity in Cloud Computing: What it is, and What it is Not}},
  booktitle = {Proceedings of the 10th International Conference on Autonomic Computing
	(ICAC 2013), San Jose, CA, June 24--28},
  year = {2013},
  publisher = {USENIX},
  note = {Preliminary Version},
  pdf = {http://sdqweb.ipd.kit.edu/publications/pdfs/HeKoRe2013-ICAC-Elasticity.pdf},
  slides = {http://sdqweb.ipd.kit.edu/publications/pdfs/HeKoRe2013-ICAC-Elasticity_Slides.pdf},
  poster = {http://sdqweb.ipd.kit.edu/publications/pdfs/HeKoRe2013-ICAC-Elasticity_Poster.pdf}
}
						
N. Huber, F. Brosig, N. Dingle, K. Joshi, and S. Kounev. Providing Dependability and Performance in the Cloud: Case Studies. In K. Wolter, A. Avritzer, M. Vieira, and A. van Moorsel, editors, Resilience Assessment and Evaluation of Computing Systems, XVIII, ISBN: 978-3-642-29031-2, Springer-Verlag, Berlin, Heidelberg, 2012. [ bibtex | abstract | pdf ]
Keywords: Dependability, Resilience, Cloud Computing.

Cloud Computing promises a variety of opportunities but also brings up several challenges. The three case studies presented in the following are examples of how challenges in the field of capacity management, dependability, and scalability can be addressed, and how opportunities of Cloud Computing can be leveraged to, e.g., maintain performance requirements or to increase dependability.
@incollection{HuBrDiJoKo2012-ResBook-CloudCaseStudies,
  author = {Nikolaus Huber and Fabian Brosig and N. Dingle and K. Joshi and Samuel
	Kounev},
  title = {{Providing Dependability and Performance in the Cloud: Case Studies}},
  booktitle = {{Resilience Assessment and Evaluation of Computing Systems}},
  publisher = {Springer-Verlag},
  year = {2012},
  editor = {K. Wolter and A. Avritzer and M. Vieira and A. van Moorsel},
  series = {XVIII},
  address = {Berlin, Heidelberg},
  isbn = {978-3-642-29031-2},
  url = {http://www.springer.com/computer/communication+networks/book/978-3-642-29031-2}
}
						
A. Iosup, R. Prodan, and D. Epema, IaaS Cloud Benchmarking: Approaches, Challenges, and Experience. In 5th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS 2012), held in conjunction with SC, Salt Lake City, Utah, USA, Nov 2012. [ bibtex | abstract | pdf ]
Keywords: Benchmarking, Challenges, IaaS Clouds, Cloud Computing, Performance Evaluation, Concepts, Discussion.

Infrastructure-as-a-Service (IaaS) cloud computing is an emerging commercial infrastructure paradigm under which clients (users) can lease resources when and for how long needed, under a cost model that reflects the actual usage of resources by the client. For IaaS clouds to become mainstream technology and for current cost models to become more client-friendly, benchmarking and comparing the non-functional system properties of various IaaS clouds is important, especially for the cloud users. In this article we focus on the IaaS cloud-specific elements of benchmarking, from a user’s perspective. We propose a generic approach for IaaS cloud benchmarking, discuss numerous challenges in developing this approach, and summarize our experience towards benchmarking IaaS clouds. We argue for an experimental approach that requires, among others, new techniques for experiment compression, new benchmarking methods that go beyond blackbox and isolated-user testing, new benchmark designs that are domain-specific, and new metrics for elasticity and variability.
@inproceedings{iosup2012a,
	Author = {A. Iosup and R. Prodan and D. Epema},
	Title = {IaaS Cloud Benchmarking: Approaches, Challenges, and Experience},
	Booktitle = {5th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS 2012), held in conjunction with SC, Salt Lake City, Utah, USA},
	Keywords = {Benchmarking, Challenges, IaaS Clouds, Cloud Computing, Performance Evaluation, Concepts, Discussion},
	Month = {Nov},
	Year = {2012}
}
						
S. Kounev, P. Reinecke, F. Brosig, J. T. Bradley, K. Joshi, V. Babka, A. Stefanek, and S. Gilmore. Providing Dependability and Resilience in the Cloud: Challenges and Opportunities. In K. Wolter, A. Avritzer, M. Vieira, and A. van Moorsel, editors, Resilience Assessment and Evaluation of Computing Systems, XVIII, ISBN: 978-3-642-29031-2, Springer-Verlag, Berlin, Heidelberg, 2012. [ bibtex | abstract | pdf ]
Keywords: Dependability, Resilience, Cloud Computing.

Cloud Computing is a novel paradigm for providing data center resources as on demand services in a pay-as-you-go manner. It promises significant cost savings by making it possible to consolidate workloads and share infrastructure resources among multiple applications resulting in higher cost- and energy-efficiency. However, these benefits come at the cost of increased system complexity and dynamicity posing new challenges in providing service dependability and resilience for applications running in a Cloud environment. At the same time, the virtualization of physical resources, inherent in Cloud Computing, provides new opportunities for novel dependability and quality-of-service management techniques that can potentially improve system resilience. In this chapter, we first discuss in detail the challenges and opportunities introduced by the Cloud Computing paradigm. We then provide a review of the state-of-the-art on dependability and resilience management in Cloud environments, and conclude with an overview of emerging research directions.
@incollection{KoReBrBrJoBaStGi2012-ResBook-CloudChallenges,
  author = {Samuel Kounev and Philipp Reinecke and Fabian Brosig and Jeremy T.
	Bradley and Kaustubh Joshi and Vlastimil Babka and Anton Stefanek
	and Stephen Gilmore},
  title = {Providing Dependability and Resilience in the Cloud: Challenges and
	Opportunities},
  booktitle = {Resilience Assessment and Evaluation of Computing Systems},
  publisher = {Springer-Verlag},
  year = {2012},
  editor = {K. Wolter and A. Avritzer and M. Vieira and A. van Moorsel},
  series = {XVIII},
  address = {Berlin, Heidelberg},
  isbn = {978-3-642-29031-2},
  url = {http://www.springer.com/computer/communication+networks/book/978-3-642-29031-2}
}
						
R. Krebs, C. Momm, and S. Kounev. Metrics and Techniques for Quantifying Performance Isolation in Cloud Environments. In Barbora Buhnova and Antonio Vallecillo, editors, Proceedings of the 8th ACM SIGSOFT International Conference on the Quality of Software Architectures (QoSA 2012), June 25-28, 2012, Bertinoro, Italy, pages 91-100, Bertinoro, Italy, June 2012. ACM Press. [ bibtex | abstract | pdf ]
Keywords: Performance Isolation, Metrics, Cloud Computing.

The cloud computing paradigm enables the provision of cost-efficient IT services by leveraging economies of scale and sharing data center resources efficiently among multiple independent applications and customers. However, the sharing of resources leads to possible interference between users, and performance problems are among the major obstacles for potential cloud customers. Consequently, it is one of the primary goals of cloud service providers to have different customers and their hosted applications isolated as much as possible in terms of the performance they observe. To make different offerings comparable with regard to their performance isolation capabilities, a representative metric is needed to quantify the level of performance isolation in cloud environments. Such a metric should be measurable externally, by running benchmarks from the outside and treating the cloud as a black box. In this paper, we propose three different types of novel metrics for quantifying the performance isolation of cloud-based systems and a simulation-based case study applying these metrics in the context of a Software-as-a-Service (SaaS) scenario where different customers (tenants) share a single application instance. We consider four different approaches to achieve performance isolation and evaluate them based on the proposed metrics. The results demonstrate the effectiveness and practical usability of the proposed metrics in quantifying the performance isolation of cloud environments.
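As a rough illustration of how such an externally measurable isolation metric could be structured, the sketch below relates the relative response-time degradation observed by an abiding tenant to the relative load increase caused by a disruptive tenant. This is a hypothetical simplification for intuition only, not the three metric types the paper defines.

```python
def isolation_metric(rt_isolated, rt_disrupted, rel_load_increase):
    """Relates the relative response-time degradation seen by an
    abiding tenant to the relative load increase caused by a
    disruptive tenant: 1.0 means perfect isolation (no degradation),
    0.0 means the degradation matches the load increase. A
    hypothetical simplification, not the paper's exact metrics."""
    degradation = (rt_disrupted - rt_isolated) / rt_isolated
    return max(0.0, 1.0 - degradation / rel_load_increase)

# Disruptive tenant adds 50% load; abiding tenant slows from 100 ms
# to 120 ms, i.e. 20% degradation against a 50% load increase.
score = isolation_metric(100.0, 120.0, 0.5)
```

Note that only quantities observable from outside are used (response times and offered load), consistent with the black-box measurement requirement stated in the abstract.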
@inproceedings{KrMoKo2012-QoSA-QuantifyingPerfIsoMetrics,
  address = {Bertinoro, Italy},
  author = {Krebs, Rouven and Momm, Christof and Kounev, Samuel},
  booktitle = {Proceedings of the 8th ACM SIGSOFT International Conference on the Quality of Software Architectures (QoSA 2012), June 25--28, 2012, Bertinoro, Italy},
  editor = {Buhnova, Barbora and Vallecillo, Antonio},
  month = {June},
  note = {Acceptance Rate (Full Paper): 25.6\%},
  pages = {91--100},
  pdf = {http://sdqweb.ipd.kit.edu/publications/pdfs/KrMoKo2012-QoSA-QuantifyingPerfIsoMetrics.pdf},
  publisher = {ACM Press},
  title = {{M}etrics and {T}echniques for {Q}uantifying {P}erformance {I}solation in {C}loud {E}nvironments},
  url = {http://qosa.ipd.kit.edu/qosa_2012/},
  year = {2012},
}
						
P. Rygielski and S. Kounev. Network Virtualization for QoS-Aware Resource Management in Cloud Data Centers: A Survey. PIK - Praxis der Informationsverarbeitung und Kommunikation, 36(1):55-64, February 2013, de Gruyter. [ bibtex | abstract | pdf ]
Keywords: Network Virtualization, QoS-Aware Resource Management, Cloud Data Center, Survey.

The increasing popularity of Cloud Computing is leading to the emergence of large virtualized data centers hosting increasingly complex and dynamic IT systems and services. Over the past decade, the efficient sharing of computational resources through virtualization has been subject to intensive research, while network management in cloud data centers has received less attention. A variety of network-intensive applications require QoS (Quality-of-Service) provisioning, performance isolation and support for flexible and efficient migration of virtual machines. In this paper, we survey existing network virtualization approaches and evaluate the extent to which they can be used as a basis for realizing the mentioned requirements in a cloud data center. More specifically, we identify generic network virtualization techniques, characterize them according to their features related to QoS management and performance isolation, and show how they can be composed together and used as building blocks for complex network virtualization solutions. We then present an overview of selected representative cloud platforms and show how they leverage the generic techniques as a basis for network resource management. Finally, we outline open issues and research challenges in the area of performance modeling and proactive resource management of virtualized data center infrastructures.
@article{RyKo2013,
  author = {Piotr Rygielski and Samuel Kounev},
  title = {{Network Virtualization for QoS-Aware Resource Management in Cloud Data Centers: A Survey}},
  journal = {PIK --- Praxis der Informationsverarbeitung und Kommunikation},
  year = {2013},
  volume = {36},
  pages = {55--64},
  number = {1},
  month = {February},
  doi = {10.1515/pik-2012-0136},
  pdf = {http://sdqweb.ipd.kit.edu/publications/descartes-pdfs/RyKo2013-PIK-NetVirtSurvey.pdf},
  publisher = {de Gruyter},
  url = {http://www.degruyter.com/view/j/piko-2013-36-issue-1/pik-2012-0136/pik-2012-0136.xml?format=INT}
}
						
P. Rygielski, S. Zschaler, and S. Kounev. A Meta-Model for Performance Modeling of Dynamic Virtualized Network Infrastructures. In Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13), Prague, Czech Republic, April 21-24, 2013, pages 327-330. ACM, New York, NY, USA. Work-In-Progress Paper. [ bibtex | abstract | pdf ]
Keywords: Performance Modeling, Data Center Networks, Meta-Modeling.

In this work-in-progress paper, we present a new meta-model designed for the performance modeling of dynamic data center network infrastructures. Our approach models characteristic aspects of Cloud data centers which were not crucial in classical data centers. We present our meta-model and demonstrate its use for performance modeling and analysis through an example, including a transformation into OMNeT++ for performance simulation.
@inproceedings{RyZsKo2013-DNI-meta-model,
  author = {Piotr Rygielski and Steffen Zschaler and Samuel Kounev},
  title = {{A Meta-Model for Performance Modeling of Dynamic Virtualized Network Infrastructures}},
  booktitle = {Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13)},
  year = {2013},
  address = {New York, NY, USA},
  month = {April},
  pages = {327--330},
  publisher = {ACM},
  note = {Work-In-Progress Paper},
  day = {21--24},
  location = {Prague, Czech Republic},
  pdf = {http://sdqweb.ipd.kit.edu/publications/pdfs/RyZsKo2013-DNI-meta-model.pdf}
}