IEEE Xplore has published the result of one of our latest collaborations with the Institute of Computing Technology from the Chinese Academy of Sciences. This particular work was presented at the 44th International Conference on Parallel Processing (ICPP 2015), which took place in Beijing (China) in September. The paper can be accessed here.
Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g. the 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted in a cloud environment, the components of a service are typically co-located with short batch jobs to increase machine utilization, and they share and contend for resources such as caches and I/O bandwidth with them.
The highly dynamic nature of batch jobs, in terms of their workload types and input sizes, causes continuously changing performance interference to individual components, leading to latency variability and high tail latency. However, existing techniques either ignore such fine-grained component latency variability when managing service performance, or rely on executing redundant requests to reduce the tail latency, which degrades the service performance when the load gets heavier.
In this paper, we propose PCS, a predictive and component-level scheduling framework to reduce tail latency for large-scale, parallel online services. It uses an analytical performance model to simultaneously predict the component latency and the overall service performance on different nodes. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing performance interferences from batch jobs. We demonstrate that, using realistic workloads, the proposed scheduler reduces the component tail latency by an average of 67.05% and the average overall service latency by 64.16% compared with the state-of-the-art techniques on reducing tail latency.
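The predict, identify and reallocate loop described above can be illustrated with a minimal Python sketch. This is not the PCS algorithm from the paper: the predicted-latency table, the straggler rule (predicted latency above a multiple of the median) and the greedy reassignment are all assumptions made only for illustration, and every name in the code is hypothetical.

```python
from statistics import median

def service_latency(assignment, predicted):
    """Overall latency is set by the slowest component."""
    return max(predicted[c][n] for c, n in assignment.items())

def find_stragglers(assignment, predicted, factor=1.5):
    """Assumed rule: a component is a straggler when its predicted latency
    on its current node exceeds `factor` times the median. Slowest first."""
    current = {c: predicted[c][n] for c, n in assignment.items()}
    cutoff = factor * median(current.values())
    return sorted((c for c, lat in current.items() if lat > cutoff),
                  key=lambda c: current[c], reverse=True)

def reschedule(assignment, predicted, free_nodes, factor=1.5):
    """Greedy heuristic (an assumption, not the paper's method): move each
    straggler to the free node with the lowest predicted latency."""
    new_assignment = dict(assignment)
    free = sorted(free_nodes)
    for comp in find_stragglers(assignment, predicted, factor):
        candidates = [n for n in free if n in predicted[comp]]
        if not candidates:
            continue
        best = min(candidates, key=lambda n: predicted[comp][n])
        if predicted[comp][best] < predicted[comp][new_assignment[comp]]:
            new_assignment[comp] = best
            free.remove(best)  # a node hosts at most one moved component
    return new_assignment

# Made-up predicted latencies (ms) per component and node.
predicted = {
    "c1": {"n1": 10, "n4": 12},
    "c2": {"n2": 11, "n4": 9},
    "c3": {"n3": 40, "n4": 15, "n5": 13},
}
assignment = {"c1": "n1", "c2": "n2", "c3": "n3"}
new_assignment = reschedule(assignment, predicted, free_nodes={"n4", "n5"})
```

In this toy input only `c3` is flagged as a straggler, and moving it to `n5` lowers the service latency from 40 ms to 13 ms. A real scheduler would also need the latency predictions themselves, which here are simply given.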
Scientific Programming (Hindawi, JCR:0.559) has just announced the call for papers for a Special Issue on cloud-based simulations and data analysis in which I’m participating as Guest Editor together with Dr. Fabrizio Messina (University of Catania, Catania, Italy) and Dr. Lars Braubach (Nordakademie, Elmshorn, Germany).
In many areas, including commercial as well as scientific fields, the generation and storage of large amounts of data have become essential. Manufacturing and engineering companies use cloud-based high-performance computing technologies and simulation techniques to model, simulate, and predict the behavior of complicated models, which involves the preliminary analysis of existing data as well as the generation of data during the simulations. With such large amounts of data, the following question arises: how can they be processed efficiently? Cloud computing holds the promise of providing elastic computational resources that adapt to the concrete needs of an application, and it is thus a promising base technology for such processing techniques. However, cloud computing itself is not the complete solution, and novel algorithms and techniques have to be conceived in order to exploit its underlying power.
In this special issue we invite original contributions providing novel ideas towards simulation and data processing in the context of cloud computing approaches. The aim of this special issue is to assemble visions, ideas, experiences, and research achievements in these areas.
Potential topics include, but are not limited to:
- Techniques for cloud-based simulations
- Computational Intelligence for cloud-based simulations
- Service composition for cloud-based simulations
- Computational Intelligence for data analysis
- Software architectures for cloud-based simulations
- Cloud-based data mining
- Big data analytics for predictive modeling
Authors can submit their manuscripts via the Manuscript Tracking System at http://mts.hindawi.com/submit/journals/sp/csda/.
Manuscript Due: Friday, 29 April 2016
First Round of Reviews: Friday, 22 July 2016
Publication Date: Friday, 16 September 2016
I’m very happy to announce that I’m serving as Guest Editor at the Computers Journal for a Special Issue on “High Performance Computing for Big Data”. The deadline for submissions is March 31st, 2016.
Big Data is, right now, one of the hottest topics in computing research. This is because of:
- the numerous challenges that include (and are not limited to) capture, search, storage, sharing, transfer, representation and privacy of the data;
- and the wide spectrum of areas covered, that range from Bioinformatics to Space Science, and are a research challenge by themselves.
New technologies and algorithms have emerged from Big Data to efficiently manage and process great quantities of data within reasonable elapsed times. However, there are computing barriers that cannot be crossed without the proper resources.
The many ways that High Performance Computing can be delivered for facing Big Data challenges offer a wide spectrum of research opportunities. From FPGAs to cloud computing, technologies and algorithms can be brought to a whole different level and foster incredible insights from massive information repositories.
The papers accepted for publication in this Special Issue cover both fundamental issues and new concepts related to the application of High Performance Computing to the Big Data area.
Since March, Future Generation Computer Systems has made our paper entitled “A multi-dimensional job scheduling” available online. This work is the result of a collaboration with the research group led by Prof. Lucio Grandinetti (University of Calabria, Italy) and it can be accessed here.
With the advent of new computing technologies, such as cloud computing and contemporary parallel processing systems, the building blocks of computing systems have become multi-dimensional. Traditional scheduling systems based on the optimization of a single resource, such as processors, fail to provide near-optimal solutions. The efficient use of new computing systems depends on the efficient use of several resource dimensions. Thus, scheduling systems have to fully use all resources. In this paper, we address the problem of multi-resource scheduling via multi-capacity bin-packing. We propose the application of multi-capacity-aware resource scheduling at the host selection layer and the queuing mechanism layer of a scheduling system. The experimental results demonstrate scheduling performance improvements in terms of wait time and slowdown metrics.
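To make the idea of multi-capacity bin-packing more concrete, here is a minimal Python sketch of a first-fit-decreasing heuristic over two resource dimensions. It is only an illustration and not the scheduler evaluated in the paper: the job demands, the host capacity and the ordering rule (largest normalized demand first) are assumptions.

```python
def fits(demand, free):
    """A job fits only if every resource dimension has enough room."""
    return all(d <= f for d, f in zip(demand, free))

def first_fit_decreasing(jobs, capacity):
    """Pack jobs onto identical hosts, checking all resource dimensions.

    jobs: dict mapping job name -> tuple of demands, e.g. (cores, GB of RAM)
    capacity: tuple with the capacity of one host in the same dimensions
    Returns a list of hosts, each one a list of job names.
    """
    def size(item):
        # Assumed ordering rule: the largest demand relative to capacity.
        return max(d / c for d, c in zip(item[1], capacity))

    hosts = []
    for name, demand in sorted(jobs.items(), key=size, reverse=True):
        if not fits(demand, capacity):
            raise ValueError(f"job {name} does not fit on an empty host")
        for host in hosts:
            if fits(demand, host["free"]):
                host["free"] = tuple(f - d for f, d in zip(host["free"], demand))
                host["jobs"].append(name)
                break
        else:
            # No open host has room in every dimension: open a new one.
            free = tuple(c - d for c, d in zip(capacity, demand))
            hosts.append({"free": free, "jobs": [name]})
    return [host["jobs"] for host in hosts]

# Made-up example: hosts with 8 cores and 32 GB of RAM.
capacity = (8, 32)
jobs = {"a": (6, 4), "b": (2, 24), "c": (4, 16), "d": (4, 16), "e": (2, 8)}
placement = first_fit_decreasing(jobs, capacity)
```

In this example the CPU-heavy job `a` and the memory-heavy job `b` end up sharing a host, which a scheduler looking at processors alone would have no reason to prefer.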
At the beginning of 2016 the Journal of Computer Physics Communications (Elsevier, JCR:3.122, Q1) will close a Special Issue in which I’m very honored to serve as Guest Editor. You may be interested in the following Call for Papers.
Supercomputers are rapidly evolving with advances in architecture and semiconductor technology. High performance computing has been applied to accelerate the advanced modeling and simulation of materials. The current trend will pose challenges in parallelism because of the increased number of processing units, accelerators, complex hierarchical memory systems, interconnection networks, storage and uncertainties in programming models. Interdisciplinary collaboration is becoming more and more important in high performance computing. Realistic material modeling and simulation need to combine material modeling methods, mathematical models, parallel algorithms and tools in order to exploit supercomputers effectively.
Topics include but are not limited to:
- Numerical methods and parallel algorithms for the advanced modeling and simulation of materials
- Use of hardware accelerators (MIC, GPUs, FPGA) and heterogeneous hardware in computational material science
- Mathematical modeling and high performance computing tools in large-scale material simulation
- Programming model for material algorithm scalability and resilience
- Visualization on material data
- Multi-scale modeling and simulation in materials science
- Performance modeling and auto-tuning methods in material simulation
- Big data of materials science
- Accelerate dissipative particle dynamics by hardware accelerators
- Large-scale material modeling based on the new features of message passing programming model
Submission Format and Guideline
All submitted papers must be clearly written in excellent English and contain only original work, which has not been published by, and is not currently under review for, any other journal or conference. Papers must not exceed 25 pages (one-column, at least 11pt fonts) including figures, tables, and references. A detailed submission guideline is available as “Guide to Authors” at: http://www.journals.elsevier.com/computer-physics-communications/
All manuscripts and any supplementary material should be submitted through the Elsevier Editorial System (EES). Authors must select “SI: CPC_HPCME 2015” when they reach the “Article Type” step in the submission process. The EES website is located at: http://ees.elsevier.com/cpc/
All papers will be peer-reviewed by three independent reviewers. Requests for additional information should be addressed to the guest editors.
Editors in Chief
N. Stanley Scott
Jose Luis Vazquez-Poletti
Submission deadline: 2016.1.20
Acceptance deadline: 2016.4.20
At the end of June, one of our recent works in collaboration with the Institute of Computing Technology from the Chinese Academy of Sciences was presented at the 35th IEEE International Conference on Distributed Computing Systems (ICDCS 2015), which took place in Columbus (Ohio, USA). The paper can be accessed here.
Large-scale interactive services usually divide requests into multiple sub-requests and distribute them to a large number of server components for parallel execution. Hence the tail latency (i.e. the slowest component’s latency) of these components determines the overall service latency. On a cloud platform, each component shares and competes for node resources such as caches and I/O bandwidth with its co-located jobs, hence inevitably suffering from their performance interference.
In this paper, we study the short-running jobs in a 12k-node Google cluster to illustrate the dynamic resource demands of these jobs. These demands make the latency of individual components vary both over time and across different nodes, which poses a major challenge to maintaining low tail latency. Given this motivation, this paper introduces a dynamic and interference-aware scheduler for large-scale, parallel cloud services. At each scheduling interval, it collects workload and resource contention information of a running service, and predicts both the component latency on different nodes and the overall service performance. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing workloads and performance interferences. We demonstrate that, using realistic workloads, the proposed approach achieves significant reductions in tail latency compared to the basic approach without scheduling.
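A back-of-the-envelope calculation shows why the slowest component matters so much when a request fans out to many components. The Python sketch below assumes that component latencies are independent, which is a simplification that is not taken from the paper (interference on shared nodes may well correlate them); the function names are hypothetical.

```python
def service_latency(component_latencies):
    """A request finishes only when its slowest sub-request finishes."""
    return max(component_latencies)

def prob_request_hits_tail(num_components, percentile=0.99):
    """Probability that at least one of `num_components` independent
    components responds slower than its own `percentile` latency."""
    return 1.0 - percentile ** num_components

for n in (1, 10, 100):
    print(f"{n:>3} components: {prob_request_hits_tail(n):.1%}")
```

Under this assumption, a request that touches 100 components has roughly a 63% chance of waiting on at least one component that is slower than its own 99th percentile, compared with 1% for a single component.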
Next week I’ll be in Karlsruhe (Germany) for this year’s edition of the GridKa School. This event has taken place at the Karlsruhe Institute of Technology since 2003.
Two years ago I gave a talk; this time my participation will be twofold.
Talk: From Mars to Earth through Cloud Computing
Our society has benefited from Space exploration in many ways. Many of the inventions we use nowadays have their origin in or have been improved by Space research. Computer Science is not an exception.
This talk will introduce my application of Cloud Computing in the context of different Mars missions: Mars MetNet (Spain-Russia-Finland), MSL Curiosity (NASA) and ExoMars2016 (ESA). The know-how gained allowed the optimization of other areas on Planet Earth, such as weather forecasting and the processing of agricultural wireless sensor networks.
Tutorial: HPCCloud 101, HPC on Cloud Computing for newcomers
Never been into Cloud Computing before? Do you think that extra computing power is crucial for your research? Do you have some neat parallel codes that your institution doesn’t allow you to execute because the cluster is full? Maybe this tutorial is for you!
The tutorial will cover the following topics:
As Virtual Clusters deployed by StarCluster have Sun Grid Engine and OpenMPI installed, you are more than welcome to bring your own codes and give them a try!
This paper describes the GridWay metascheduler and presents its latest and future developments, mainly related to interoperability and interoperation. GridWay enables large-scale, reliable, and efficient sharing of computing resources over grid middleware. To favor interoperability, it has a modular architecture based on drivers, which access middleware services for resource discovery and monitoring, job execution and management, and file transfer. This paper presents two new execution drivers for Basic Execution Service (BES) and Computing Resource Execution and Management (CREAM) services and introduces a remote BES interface for GridWay. This interface allows users to access GridWay’s job metascheduling capabilities, using the BES implementation of GridSAM. Thus, GridWay now provides end users with more possibilities for interoperability and interoperation.
More information in the article:
Ismael Marín Carrión, Eduardo Huedo and Ignacio M. Llorente: Interoperating grid infrastructures with the GridWay metascheduler, Concurrency and Computation: Practice and Experience, Volume 27, Issue 9, June 2015, Pages 2278-2290, ISSN 1532-0634, http://dx.doi.org/10.1002/cpe.2971.
Last week the 12th ACM International Conference on Computing Frontiers (CF’15) took place in Ischia (Italy). There our paper entitled “SARP: producing approximate results with small correctness losses for cloud interactive services” was presented. This work is a result of the collaboration with the Institute of Computing Technology from the Chinese Academy of Sciences, which started during my latest research stay there.
Despite the importance of providing fluid responsiveness to user requests for interactive services, such request processing is very resource expensive when dealing with large-scale input data. The resources required often exceed the application owners’ budget when services are deployed on a cloud, in which resources are charged in monetary terms. Providing approximate processing results is a feasible solution to this problem that trades off request correctness (quantified by output quality) for response time reduction. However, existing techniques in this area either use partial input data or skip expensive computations to produce approximate results, thus resulting in large losses in output quality on a tight resource budget.
In this paper, we propose SARP, a Synopsis-based Approximate Request Processing framework to produce approximate results with small correctness losses even when using a small amount of resources. To achieve this, SARP conducts full computations over the statistical aggregation of the entire input data using two key ideas:
- Offline synopsis management that generates and maintains a set of synopses that represent the statistical aggregation of original input data at different approximation levels.
- Online synopsis selection that considers both the current resource allocation and the workload status so as to select the synopsis with the maximal length that can be processed within the required response time.
We demonstrate the effectiveness of our approach by testing the recommendation services in E-commerce sites using a large, real-world dataset.
Using prediction accuracy as the output quality, the results demonstrate:
- SARP achieves significant response time reduction with very small quality losses compared to the exact processing results.
- Using the same processing time, SARP demonstrates a considerable reduction in quality loss compared to existing approximation techniques.
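The online synopsis selection step described above (pick the longest synopsis that can still be processed within the required response time) can be sketched in a few lines of Python. This is a hedged illustration rather than SARP’s implementation: the linear cost model, the way queued requests are accounted for and the fallback to the shortest synopsis are assumptions, and all names are hypothetical.

```python
def estimated_time_ms(length, ms_per_item, queued_requests):
    """Assumed cost model: processing time grows linearly with the synopsis
    length, and requests already queued are served first at the same cost."""
    return (queued_requests + 1) * length * ms_per_item

def select_synopsis(synopsis_lengths, deadline_ms, ms_per_item, queued_requests=0):
    """Return the longest synopsis length that fits within the deadline.

    If none fits, fall back to the shortest synopsis (best effort); this
    fallback is an assumption of the sketch.
    """
    feasible = [
        n for n in synopsis_lengths
        if estimated_time_ms(n, ms_per_item, queued_requests) <= deadline_ms
    ]
    return max(feasible) if feasible else min(synopsis_lengths)

# Made-up synopses built offline at three approximation levels.
lengths = [1_000, 10_000, 100_000]
idle_choice = select_synopsis(lengths, deadline_ms=200, ms_per_item=0.01)
busy_choice = select_synopsis(lengths, deadline_ms=200, ms_per_item=0.01,
                              queued_requests=3)
```

With these made-up numbers an idle service picks the 10,000-item synopsis, while the same service with three queued requests drops to the 1,000-item one, trading output quality for response time as the load grows.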
The International Journal of Cloud Applications and Computing has just published our paper entitled “Cost-Effective Resource Configurations for Multi-Tenant Database Systems in Public Clouds”. This work is the result of a collaboration with Prof. Patrick Martin’s research group (Queen’s University, Canada).
Cloud computing is a promising paradigm for deploying applications due to its large resource offerings on a pay-as-you-go basis. This paper examines the problem of determining the most cost-effective provisioning of a multi-tenant database system as a service over public clouds. The authors formulate the problem of resource provisioning, and then define a framework to solve it. Their framework uses heuristic-based algorithms to select cost-effective configurations. The algorithms can optionally balance resource costs against penalties incurred from the violation of Service Level Agreements (SLAs), or opt for configurations that do not violate any SLA. The specific resource demands on the virtual machines for a workload and SLAs are accounted for by the performance and cost models, which are used to predict performance and expected cost respectively. The approach is validated experimentally using workloads based on standard TPC database benchmarks in the Amazon EC2 cloud.
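The trade-off between resource costs and SLA penalties can be illustrated with a small Python sketch. It does not reproduce the heuristic algorithms or the performance and cost models from the paper: it simply scans a short list of made-up configurations, and the prices, predicted latencies, flat penalty and all names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    name: str                    # hypothetical label, not a real instance type
    hourly_cost: float           # assumed resource cost, in $ per hour
    predicted_latency_ms: float  # assumed output of a performance model

def total_cost(cfg, sla_ms, penalty_per_hour):
    """Resource cost plus a flat penalty when the SLA is predicted to fail."""
    violated = cfg.predicted_latency_ms > sla_ms
    return cfg.hourly_cost + (penalty_per_hour if violated else 0.0)

def choose(configs, sla_ms, penalty_per_hour, allow_violations=True):
    """Pick the cheapest configuration, optionally excluding any
    configuration that is predicted to violate the SLA."""
    candidates = [
        c for c in configs
        if allow_violations or c.predicted_latency_ms <= sla_ms
    ]
    if not candidates:
        raise ValueError("no configuration satisfies the SLA")
    return min(candidates, key=lambda c: total_cost(c, sla_ms, penalty_per_hour))

configs = [
    Config("small", 0.10, 900.0),
    Config("medium", 0.20, 450.0),
    Config("large", 0.40, 200.0),
]
cheapest = choose(configs, sla_ms=500, penalty_per_hour=0.05)
strict = choose(configs, sla_ms=500, penalty_per_hour=0.05, allow_violations=False)
```

With a low penalty the cheapest option is to run the small configuration and pay for the violation, whereas the strict mode selects the medium one. Raising the penalty above the price difference makes both modes agree.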