Today a new book on Cloud Computing and Big Data has been published by IOS Press. I had the pleasure and honor of teaming up with Dr. Charlie Catlett, Dr. Wolfgang Gentzsch, Prof. Lucio Grandinetti and Prof. Gerhard R. Joubert to edit it.
Cloud computing offers many advantages to researchers and engineers who need access to high performance computing facilities for solving particular compute-intensive and/or large-scale problems, but whose overall high performance computing (HPC) needs do not justify the acquisition and operation of dedicated HPC facilities. There are, however, a number of fundamental problems which must be addressed, such as the limitations imposed by accessibility, security and communication speed, before these advantages can be exploited to the full.
This book presents 14 contributions selected from the International Research Workshop on Advanced High Performance Computing Systems, held in Cetraro, Italy, in June 2012. The papers are arranged in three chapters. Chapter 1 includes five papers on cloud infrastructures, while Chapter 2 discusses cloud applications.
The third chapter in the book deals with big data. Big data is nothing new – large scientific organizations have been collecting large amounts of data for decades – but what is new is that the focus has now broadened to include sectors such as business analytics, financial analysis, Internet service providers, oil and gas, medicine, automotive and a host of others.
This book will be of interest to all those whose work involves aspects of cloud computing and big data applications.
- Title: Cloud Computing and Big Data
- Editors: Catlett, C., Gentzsch, W., Grandinetti, L., Joubert, G.R., Vazquez-Poletti, J.L.
- Pub. date: October 2013
- Pages: 264
- Volume: 23 of Advances in Parallel Computing
- ISBN: 978-1-61499-321-6
- J.L. Vázquez-Poletti
This month the journal Software: Practice and Experience has published online our paper entitled “Autonomic resource contention-aware scheduling”. It can be accessed here.
The complexity of computing systems introduces issues and challenges such as poor performance and high energy consumption. In this paper, we first define and model a resource contention metric for high-performance computing workloads, to be used as a performance metric by scheduling algorithms and systems at the highest level of the resource management stack, addressing the main issues in computing systems. Second, we propose a novel autonomic resource contention-aware scheduling approach architected across the various layers of the resource management stack. We establish the relationship between distributed resource management layers in order to optimize the resource contention metric. The simulation results confirm the effectiveness of our approach.
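To illustrate the general idea of contention-aware placement, here is a minimal sketch: among candidate nodes, a job goes to the node where the estimated contention after placement is lowest. The metric below (sum of per-resource overcommitment ratios) and all names are illustrative assumptions, not the paper's actual layered metric or architecture.

```python
def contention(node, job):
    """Toy contention metric: sum of per-resource overcommitment
    ratios if `job` were placed on `node` (0.0 means no contention)."""
    score = 0.0
    for res in ("cpu", "mem", "net"):
        demand = node["used"][res] + job[res]
        if demand > node["cap"][res]:
            score += (demand - node["cap"][res]) / node["cap"][res]
    return score

def schedule(nodes, job):
    """Place `job` on the node that minimizes the contention metric,
    update that node's bookkeeping, and return its name."""
    best = min(nodes, key=lambda n: contention(n, job))
    for res in ("cpu", "mem", "net"):
        best["used"][res] += job[res]
    return best["name"]
```

In the paper the metric is optimized across cooperating resource management layers rather than by a single greedy loop; the sketch only captures the "contention as a first-class scheduling objective" idea.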
This work is the result of a collaboration with Prof. Lucio Grandinetti’s research group at the University of Calabria, Italy.
The journal Concurrency and Computation: Practice and Experience has published online our work entitled “A performance/cost model for a CUDA drug discovery application on physical and public cloud infrastructures”, ahead of its inclusion in a special issue on distributed, parallel, and GPU-accelerated approaches to Computational Biology.
Virtual Screening (VS) methods can considerably aid Drug Discovery research by predicting how ligands interact with drug targets. BINDSURF is an efficient and fast blind VS methodology for determining protein binding sites, depending on the ligand, which uses the massively parallel architecture of GPUs for fast, unbiased pre-screening of large ligand databases. In this contribution we provide a performance/cost model for the execution of this application on both a physical and a public cloud infrastructure. With our model it is possible to determine which infrastructure is best in terms of execution time and cost for any given problem to be solved with BINDSURF. The conclusions obtained from our study can be extrapolated to other GPU-based VS methodologies.
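The flavor of such a performance/cost comparison can be sketched in a few lines: estimate the execution time of a screening run from GPU throughput, then price that time on a per-hour-billed cloud versus an amortized physical cluster. Every figure below (throughput, prices, billing granularity) is a made-up placeholder, not a value from the paper.

```python
import math

def exec_time_hours(n_ligands, ligands_per_hour_per_gpu, n_gpus):
    """Idealized execution time, assuming perfect scaling across GPUs."""
    return n_ligands / (ligands_per_hour_per_gpu * n_gpus)

def cloud_cost(n_ligands, rate, n_gpus, usd_per_instance_hour):
    """Public cloud cost, assuming whole instance-hours are billed."""
    hours = math.ceil(exec_time_hours(n_ligands, rate, n_gpus))
    return hours * n_gpus * usd_per_instance_hour

def physical_cost(n_ligands, rate, n_gpus, usd_amortized_per_gpu_hour):
    """Physical cluster cost, using an amortized per-GPU-hour rate
    (hardware purchase, power and maintenance spread over lifetime)."""
    hours = exec_time_hours(n_ligands, rate, n_gpus)
    return hours * n_gpus * usd_amortized_per_gpu_hour
```

Given a problem size, comparing the two cost functions (at equal or differing GPU counts) indicates which infrastructure is preferable, which is the kind of decision the paper's model supports for BINDSURF.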
This work is the result of a collaboration with a multidisciplinary research group from the Catholic University of Murcia (Spain). It is also the first paper that my PhD student Richard M. Wallace has published with our research group. Congratulations!
I had the honor to be invited by the GridKa School organization to give a plenary talk in this year’s edition, which takes place from 26th to 30th August at Karlsruhe Institute of Technology (KIT).
My talk, entitled “Cloud Computing: Expanding Humanity’s Limits to Planet Mars”, focused on the use of cloud computing for the exploration of Planet Mars.
Like other tools that Humanity has used to expand its limits, cloud computing was born and has evolved in consonance with the different challenges to which it has been applied.
Due to its seamless provision of resources, its dynamism and its elasticity, this paradigm has been brought into the spotlight by the space science community, and in particular by those devoted to the exploration of Planet Mars. This is the case for space agencies, which need great amounts of on-demand computing resources while keeping a close eye on their budgets.
The Red Planet represents the next limit to be reached by Humanity, attracting the attention of many countries as a destination for the next generation of manned spaceflights. However, there is still much research to do on Planet Mars and many computational needs to fulfill.
My talk reviewed the cloud computing approach taken by NASA and then focused on the Mars MetNet Mission, with which our research group is actively collaborating. This Mission is being put together by Finland, Russia and Spain, and aims to deploy several tens of weather stations on the Martian surface. Atmospheric science research is a crucial area in the exploration of the Red Planet and represents a great opportunity for harnessing and improving current computing tools, and for establishing interesting collaborations between countries.
The feedback I received was great, and some collaboration opportunities have also arisen, making this trip another successful one.
I had the honor of spending the last two weeks of July in Mendoza (Argentina), invited by the Universidad Nacional de Cuyo, where the VI Latin American Symposium on High Performance Computing (HPCLatAm 2013) was taking place.
My job at Mendoza was twofold:
- The second week I gave a keynote talk on how, since the end of 2009, we have been providing HPC-in-the-cloud solutions for critical applications pertaining to the exploration of Planet Mars. I also presented recent results from the latest application we are working on.
The event was a total success, and collaborations with some Argentinian academic institutions are on the way!
The Journal of Systems and Software will publish our work entitled “Solidifying the foundations of the cloud for the next generation Software Engineering”. Right now it is in “In Press” state, but it can be accessed here.
Infrastructure clouds are expected to play an important role in next-generation Software Engineering, but they currently have some drawbacks. These clouds are too infrastructure-oriented and lack advanced service-oriented capabilities, such as service elasticity, quality of service or admission control, needed to perform holistic management of a whole application. The deployment of complex multi-tier applications on top of IaaS infrastructures requires providing the IaaS platforms with an extra service layer that offers advanced service management functionality.
In the present contribution we introduce the benefits of a cloud-based service-oriented architecture, which raises a set of research and scientific challenges. Then, current efforts to face these challenges are described and, finally, some conclusions on the work that still needs to be done at the IaaS level are provided.
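As a hint of what such a service layer does beyond raw infrastructure provisioning, here is a toy elasticity rule of the kind it could apply to a multi-tier application: size one tier so the average request rate per replica stays near a target, bounded by admission-control limits. The function name, thresholds and units are illustrative assumptions, not the paper's design.

```python
import math

def desired_replicas(current, req_rate_per_replica, target_rate,
                     min_r=1, max_r=10):
    """Elasticity sketch: given the current replica count and the observed
    request rate each replica is handling (req/s), return how many replicas
    are needed for each one to handle roughly `target_rate` req/s.
    The [min_r, max_r] clamp stands in for admission control / quotas."""
    total_rate = current * req_rate_per_replica  # observed aggregate load
    needed = math.ceil(total_rate / target_rate)
    return max(min_r, min(max_r, needed))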
One of my Master's Thesis student groups has been awarded a prize at the Seventh Edition of the CUSL, which stands for “Open-Source University Contest” in Spanish. They were competing against 84 other teams, and on 24th May they received the “Best Community Project” prize in a ceremony that took place at the University of Granada.
All the VII CUSL winners. The CygnusCloud team members stand in the second row (left).
The three members of the CygnusCloud team (named in honor of the swan of the Complutense University coat of arms) observed that many computational resources of the computer labs spread across the UCM campus were underutilized. On the other hand, the computers in our faculty labs are often insufficient to meet demand.
Turning each campus PC into a Computer Science lab computer would be one way to increase overall computing power, but in reality this isn’t a workable solution given the multitude of software requirements and subsequent administrative overhead this would create.
The project therefore aims to provide virtual lab machines that can be accessed from any available campus PC, with minimal hardware and software requirements on the client side.
An on-demand and centralized distribution of these services, like the one proposed by CygnusCloud, reduces the effects of budget cuts in education, as students can use cheaper computers with lower energy consumption. The proposed solution supports academic progress, as it optimizes the use of non-specialized computer labs, and it reduces costs, as it relies entirely on open-source software.
Besides the trophy, my students received a Raspberry Pi development kit. The University of Granada will also evaluate CygnusCloud for integration during the next academic year.
Future Generation Computer Systems has published our work entitled “Provisioning Data Analytic Workloads in a Cloud”, which is the result of a collaboration with Prof. Patrick Martin’s group from Queen’s University.
Data analytics applications are well-suited for a cloud environment. In this paper we examine the problem of provisioning resources in a public cloud to execute data analytic workloads. The goal of our provisioning method is to determine the most cost-effective configuration for a given data analytic workload. Provisioning a workload in a public cloud environment faces several challenges: it is difficult to develop accurate performance prediction models using standard methods; the space of possible configurations is very large, so exact solutions cannot be efficiently determined; and the mix and intensity of query classes in a workload vary dynamically over time.
We provide a formulation of the provisioning problem and then define a framework to solve the problem. Our framework contains a cost model to predict the cost of executing a workload on a configuration and a method of selecting configurations. The cost model balances resource costs and penalties from SLAs. The specific resource demands and frequencies are accounted for by queueing network models of the Virtual Machines (VMs), which are used to predict performance. We evaluate our approach experimentally using sample data analytic workloads on Amazon EC2.
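The core trade-off can be sketched with the simplest possible queueing model: treat each VM as an M/M/1 queue to predict response time under an evenly balanced workload, charge a flat SLA penalty when the prediction misses the target, and pick the cheapest VM count. The actual paper uses richer queueing network models and EC2 pricing; the M/M/1 formula, prices and penalty scheme below are illustrative assumptions.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue, 1 / (mu - lambda).
    Requires utilization < 1; an overloaded queue grows without bound."""
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

def config_cost(n_vms, workload_rate, service_rate, vm_cost,
                sla_target, penalty):
    """Total cost = resource cost + SLA penalty if predicted latency
    exceeds the SLA target. Load is assumed perfectly balanced."""
    per_vm_rate = workload_rate / n_vms
    r = mm1_response_time(per_vm_rate, service_rate)
    return n_vms * vm_cost + (penalty if r > sla_target else 0.0)

def cheapest_config(candidate_vm_counts, **kw):
    """Exhaustively pick the candidate configuration with lowest cost."""
    return min(candidate_vm_counts, key=lambda n: config_cost(n, **kw))
```

With a 90 queries/s workload, 50 queries/s per VM and a 0.1 s SLA target, the sketch rejects the under-provisioned (penalized) small configurations and the over-provisioned large ones, settling on the smallest count that meets the SLA.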
You can access the full paper here.
The Workshop MEDIANET 2013 will be held on April 19, 2013 at University Carlos III of Madrid (Leganés Campus, room 4.1.S08 of the Rey Pastor library). The aim of the Workshop is to demonstrate the main results of the project. All talks will be conducted in English, and the event is open to the public, but registration is required via the web registration form: http://goo.gl/cuAvB
- Welcome and Introduction – Jaime García Reinoso (UC3M)
- Quid Pro Quo: auction mechanisms without payments – Agustín Santos Méndez (IMDEA Networks)
- Evaluation results of All-Path protocols with the OMNeT++ simulator. Guidelines for improvement of protocol mechanisms – Elisa Rojas (UAH)
- Evaluation results of All-Path protocols with a flow simulator: results and next steps – Juan A. Carral (UAH)
- Cloud Computing for on-Demand Provisioning of Resources – Carlos Martín Sánchez (UCM)
- Cloud Computing Federation and Interoperability – Daniel Molina Aranda (UCM)
- Multimedia over Content-Centric Networking – Jaime García Reinoso (UC3M)
- New challenges and paradigms in the Internet for multimedia, Data Centers and Cloud Computing – Ignacio M. Llorente (UCM), Guillermo Ibáñez (UAH), Jaime García (UC3M)
This week our project held its General Assembly in Madrid, at IMDEA Software headquarters. In fact, we are very happy to announce that this partner is the latest addition to our consortium.
The IMDEA Software Institute is part of IMDEA, the Madrid Institute of Advanced Studies, a network of international research centers in the Madrid region for research of excellence in areas of high economic impact. Its main focus is to perform the research of excellence required to devise methods that will allow the cost-effective development of software products with sophisticated functionality and high quality.
After three days of hard work, surely inspired by the huge amount of clouds that invaded Madrid, we have set a very solid roadmap for the next six months.