A couple of weeks ago I was invited to present my position on Research Challenges in Cloud Infrastructures in the panel “Beyond Amazon: Using and Offering Services in a Cloud” at the Future Internet Assembly, Madrid 2008. The aim of the talk was to present the key differentiators between the RESERVOIR project and Amazon EC2, the project's research challenges in cloud infrastructures, and a list of topics for further research. This post elaborates on that presentation to describe some of the research challenges that, in my opinion, will be addressed in the new year.
Cloud computing enables the deployment of an entire IT infrastructure without the associated capital costs, paying only for the capacity used. The new “Infrastructure as a Service” paradigm has been introduced to better respond to changing computing demands, allowing capacity to be added and removed in order to meet peak or fluctuating service demands. Amazon Elastic Compute Cloud (Amazon EC2), GoGrid and FlexiScale are examples of cloud providers of elastic capacity, offering an interface for remote management of virtualized server instances within their proprietary infrastructure. These commercial clouds do not provide any detail about the internal management of the virtual machines or the physical infrastructure.
Open source cloud computing tools, such as Eucalyptus and Globus Nimbus, let organizations build and customize their own cloud infrastructure. These tools focus on the client perspective: they are fully functional with respect to cloud-compatible interfaces and provide higher-level functionality for security, contextualization and image management. However, they do not support the dynamic allocation and balancing of computing resources among virtual machines that enterprise datacenters need to meet scalable and dynamic computing requirements, such as flexible support for dynamic virtual machine placement and infrastructure management.
The RESERVOIR Project
RESERVOIR is the main European research initiative in virtualized infrastructures and Cloud Computing. RESERVOIR is a joint research programme coordinated by IBM Haifa with 13 European partners: Elsag Datamat, CETIC, OGF.eeig, SAP Research, Sun Microsystems, Telefonica I+D, Thales, Umea University, University College of London, DSA-Research at Universidad Complutense de Madrid, University of Lugano and University of Messina. The aim of this project is to develop the open-source technology to enable deployment and management of complex IT services across different administrative domains. Its open-source approach will support the definition of open standards for cloud computing, breaking the lock-in imposed by vendors today and allowing any organization to build its own local or public cloud infrastructure. The first-class management entity is a complex service, defined as a group of interconnected virtual machines with placement constraints, that can run across different cloud sites; the federation of cloud providers is one of the project's main research challenges.
The cloud infrastructure layer in RESERVOIR is the VEE Management layer, which provides execution of groups of interconnected virtual machines as a service. The project's other two main research activities complement this layer: one provides service management functionality on top of infrastructure clouds (Service Management Activity, coordinated by Telefonica I+D), and the other provides virtualization platforms with advanced functionality for performance and reallocation optimization (VEE Infrastructure Enablement Activity, coordinated by IBM Haifa).
In the context of the VEE Management Activity, coordinated by DSA-Research at UCM, the project is conducting research in cloud infrastructures to meet the main challenges in the dynamic and scalable management of virtual machines in datacenters, such as the efficient management of groups of virtual machines within and across sites, elasticity support to meet variations in service workload, dynamic placement algorithms, architectures and placement heuristics for federation of sites, and enhanced Cloud interfaces.
Private Cloud Infrastructures
A key component in a cloud infrastructure backend is the distributed virtual infrastructure manager (also called internal cloud or distributed VM Manager), which allows the dynamic placement of virtual machines on a pool of physical resources according to business needs. There is a growing interest in the community in these tools for leasing compute capacity from the local infrastructure (see for example the conclusions of the Cisco Cloud Computing Research Symposium by Ruben S. Montero, co-leader of the OpenNebula project at DSA-research, and the cloud computing predictions for the new year by Randy Bias, VP Technology Strategy at GoGrid). The aim of these deployments is not to expose to the world a cloud interface to sell capacity over the Internet, but to provide a dynamic and flexible private infrastructure to run service workloads.
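To make the idea of dynamic placement on a pool of physical resources more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the algorithm used by OpenNebula or RESERVOIR: the `Host` and `VM` classes, the `place` function and the first-fit-decreasing heuristic are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical physical host with its remaining free capacity."""
    name: str
    cpu_free: int   # free CPU cores
    mem_free: int   # free memory in MB
    vms: list = field(default_factory=list)


@dataclass
class VM:
    """Hypothetical virtual machine request."""
    name: str
    cpu: int        # requested CPU cores
    mem: int        # requested memory in MB


def place(vms, hosts):
    """First-fit-decreasing placement (an assumed heuristic).

    Larger VMs are considered first and each one goes to the first host
    with enough free CPU and memory. Returns a dict mapping each VM name
    to a host name, or to None when no host can take the VM.
    """
    placement = {}
    for vm in sorted(vms, key=lambda v: (v.cpu, v.mem), reverse=True):
        target = next(
            (h for h in hosts if h.cpu_free >= vm.cpu and h.mem_free >= vm.mem),
            None,
        )
        if target is not None:
            # Reserve the capacity on the chosen host.
            target.cpu_free -= vm.cpu
            target.mem_free -= vm.mem
            target.vms.append(vm.name)
        placement[vm.name] = target.name if target is not None else None
    return placement


if __name__ == "__main__":
    hosts = [Host("h1", cpu_free=4, mem_free=8192),
             Host("h2", cpu_free=2, mem_free=4096)]
    vms = [VM("web", 2, 2048), VM("db", 4, 8192), VM("cache", 2, 4096)]
    print(place(vms, hosts))
```

A real virtual infrastructure manager would re-evaluate placement as workloads change and would take business policies into account; this sketch only shows the basic matching of requests against free capacity.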
The OpenNebula VM Manager is a core component in the RESERVOIR VEE Management layer that is being enhanced to meet the demanding requirements of the business use cases in the project. This open-source alternative to commercial tools for VM management provides efficient, dynamic and scalable management of VMs within datacenters and private clouds involving a large number of virtual and physical servers. OpenNebula can also interface with a remote cloud site, being the only tool able to access Amazon EC2 on demand to dynamically scale the local infrastructure based on actual usage. Furthermore, the integration of OpenNebula and Haizea provides the only distributed virtual infrastructure management solution offering advance reservation of capacity.
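The idea of scaling the local infrastructure with remote capacity can be sketched as a simple overflow rule. This is a hedged illustration only: the function `plan_capacity`, its parameters and the `max_remote` cap are assumptions made for this example and do not correspond to any OpenNebula or Amazon EC2 interface.

```python
def plan_capacity(demand_vms, local_slots, max_remote=10):
    """Split a demand for VMs between local slots and a remote cloud.

    Hypothetical rule: fill the local infrastructure first, send the
    overflow to the remote provider up to max_remote instances, and
    report whatever is left as unserved.
    """
    if demand_vms < 0 or local_slots < 0 or max_remote < 0:
        raise ValueError("all arguments must be non-negative")
    local = min(demand_vms, local_slots)
    overflow = demand_vms - local
    remote = min(overflow, max_remote)
    return {"local": local, "remote": remote, "unserved": overflow - remote}


if __name__ == "__main__":
    # Demand fits locally: nothing is sent to the remote cloud.
    print(plan_capacity(5, local_slots=8))
    # Demand exceeds local capacity: the overflow is sent out, up to the cap.
    print(plan_capacity(12, local_slots=8, max_remote=3))
```

In practice the decision would also weigh cost, data locality and service-level constraints; the sketch only captures the "local first, remote on demand" behaviour described above.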
Further Research in Cloud Infrastructures
There are many other topics for further research in cloud infrastructures that will be addressed in 2009:
- Concerning the application of cloud computing, relevant topics are performance and reliability when running scientific and business applications in Clouds; content distribution systems using Clouds; and Grid, HPC and data-intensive computing in Clouds.
- Concerning technologies to enable Cloud Computing, interesting topics are new architectures for Cloud infrastructures; Cloud interfaces, programming models and tools; integration with infrastructures for Grid Computing; SLAs, privacy, security and pricing; management of network capacity; heuristics for energy efficiency and high availability; and advance reservation of capacity.
- Concerning federation of Cloud Providers, research topics are interoperability and portability between Cloud providers; open business policies framework for relationships between infrastructure providers; and higher value self-awareness, self-knowledge, and self-management capabilities.
Although several commercial clouds already sell computing power, many research issues remain open in building the next generation of cloud infrastructures. These topics are mainly related to new technologies to enable efficient, dynamic and scalable Cloud operation and interoperation.