There is a growing number of posts and articles trying to show how cloud computing is a new paradigm that supersedes Grid computing by extending its functionality and simplifying its exploitation, some even announcing that Grid computing is dead. It seems that new technologies and paradigms always come with the mission of replacing existing ones. Some of these contributions do not fully understand what Grid computing is, and focus their comparative analysis on simplicity of interfaces, implementation details or basic computing aspects. Other posts define Cloud in the same terms as Grid, or create a taxonomy that includes Grid and cluster computing technologies.
Grid is an interoperability technology, enabling the integration and management of services and resources in a distributed, heterogeneous environment. The technology supports the deployment of different kinds of infrastructures that join resources belonging to different administrative domains. In the special case of a Compute Grid infrastructure, such as EGEE, Grid technology is used to federate computing resources spanning multiple sites for job execution and data processing. There are many success cases demonstrating that Grid technology provides the support required to meet the demands of several collaborative scientific and business processes.
Now that I have stated my position on Cloud and Grid, let me show how I see Cloud (with virtualization as its enabling technology) and Grid as complementary technologies that will coexist and cooperate at different levels of abstraction in future infrastructures.
There will be a Grid on top of the Cloud
Before explaining the role of cloud computing as a resource provider for Grid sites, we should understand the benefits of virtualizing the local infrastructure (Enterprise or Local Cloud?). How can I access a cloud provider on demand if I have not previously virtualized my local infrastructure?
Existing virtualization technologies allow a full separation of resource provisioning from service management. A new virtualization layer between the service and infrastructure layers decouples a server not only from the underlying physical resource but also from its physical location, without requiring any modification of the service layers from either the service administrator's or the end user's perspective. Such decoupling is the key to supporting the scale-out of an infrastructure, supplementing local resources with cloud resources to satisfy peak or fluctuating demands.
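To make the scale-out idea concrete, here is a minimal sketch of a placement policy that fills local hosts first and overflows to a cloud provider only when local slots run out. It is not how OpenNebula or any other VM manager actually works; the `Host` class, the `place` function, the host names and the slot counts are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A pool of VM slots; `remote` marks a cloud provider (hypothetical model)."""
    name: str
    slots: int
    remote: bool = False
    vms: list = field(default_factory=list)

    def free(self) -> int:
        return self.slots - len(self.vms)


def place(vm_names, hosts):
    """Assign each VM to the first host with a free slot, local hosts first.

    Returns a dict mapping VM name to host name. Raises RuntimeError when
    neither local nor cloud capacity is left.
    """
    # False sorts before True, so local hosts are tried before remote ones.
    ordered = sorted(hosts, key=lambda h: h.remote)
    placement = {}
    for vm in vm_names:
        target = next((h for h in ordered if h.free() > 0), None)
        if target is None:
            raise RuntimeError(f"no capacity left for {vm}")
        target.vms.append(vm)
        placement[vm] = target.name
    return placement


# Assumed example site: two local hosts plus one cloud provider.
hosts = [
    Host("local-01", slots=2),
    Host("local-02", slots=2),
    Host("cloud", slots=10, remote=True),
]
workers = [f"worker-{i}" for i in range(6)]
placement = place(workers, hosts)
for vm, host in placement.items():
    print(vm, "->", host)
```

The service layer only sees six worker nodes. Whether a given worker runs on a local host or at the cloud provider is decided entirely by the placement policy underneath it.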
Getting back to the Grid computing case, the virtualization of a Grid site provides several benefits, which overcome many of the technical barriers for Grid adoption:
- Easy support for VO-specific worker nodes
- Reduced gridification cycles
- Dynamic balance of resources between VOs
- Fault tolerance of key infrastructure components
- Easier deployment and testing of new middleware distributions
- Distribution of pre-configured components
- Cheaper development nodes
- Simplified training machines deployment
- Performance partitioning between local and grid services
- On-demand access to cloud providers
If you are interested in more details about how virtualization and cloud computing can support compute Grid infrastructures you can have a look at my presentation “An Introduction to Virtualization and Cloud Technologies to Support Grid Computing” (EGEE08). I also recommend the report “An EGEE Comparative study: Clouds and grids – evolution or revolution?”.
Technology already exists to support the above use case. The OpenNebula engine enables the dynamic deployment and re-allocation of virtual machines on a pool of physical resources, and supports on-demand access to Amazon EC2 resources. Globus Nimbus, in turn, provides a free, open-source infrastructure for remote deployment and management of virtual machines, allowing you to create compute clouds.
There will be a Grid under the Cloud
There is growing interest in the federation of cloud sites. Cloud providers are opening new infrastructure centers at different geographical locations (see IBM or Amazon Availability Zones), and it is clear that no single facility or provider can create a seemingly infinite infrastructure capable of serving massive numbers of users at all times, from all locations. David Wheeler once said, “Any problem in computer science can be solved with another layer of indirection… But that usually will create another problem”. In the same vein, the federation of cloud sites involves many technological and research challenges, but the good news is that some of them are not new and have already been studied and solved by the Grid community.
As stated above, Grid is not only about computing; Grid is a technology for federation. In recent years, there has been a huge investment in research and development of technological components for sharing resources across sites. Several middleware components for file transfer, SLA negotiation, QoS, accounting, monitoring… are available, and most of them are open source. As Ian Foster also predicted in his post “There’s Grid in them thar Clouds”, those are the components that could enable the federation of cloud sites. Other components, however, have to be defined and developed from scratch, mainly those related to the efficient management of virtual machines and services within and across administrative domains. That is exactly the aim of the Reservoir project, the European initiative in Cloud Computing.
To conclude this post, let me venture some predictions about the coexistence of Grid and Cloud computing in future infrastructures:
- Virtualization, cloud, grid and cluster are complementary technologies that will coexist and cooperate at different levels of abstraction
- Although there are early adopters of virtualization in the Grid/cluster/HPC community, its full potential has not been exploited yet
- In a few years, the separation of job management from resource management through a virtualized infrastructure will be common practice
- Emerging open-source VM managers, such as OpenNebula, will help speed up adoption
- Grid/cluster/HPC infrastructures will maintain a resource base scaled to meet the average workload demand and will transparently access cloud providers to meet peak demands
- Grid technology will be used for the federation of clouds
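The prediction about sizing the local resource base to the average workload can be illustrated with a little arithmetic. The sketch below uses an invented demand series (nodes needed per time interval) and a hypothetical `burst_plan` helper. It sizes the local base to the average demand and counts what would have to come from a cloud provider. The figures are not measurements from any real site.

```python
import math


def burst_plan(demand, local_capacity=None):
    """Split per-interval demand (in nodes) into a local base and cloud overflow.

    If local_capacity is None, the local base is sized to the average
    demand, rounded up. Returns (local_capacity, cloud_nodes_per_interval).
    """
    if local_capacity is None:
        local_capacity = math.ceil(sum(demand) / len(demand))
    cloud = [max(0, d - local_capacity) for d in demand]
    return local_capacity, cloud


# Invented demand series, in nodes per interval (illustration only).
demand = [40, 55, 60, 120, 200, 90, 45, 30]
local, cloud = burst_plan(demand)

print("local base (nodes):", local)             # 80
print("cloud nodes per interval:", cloud)       # [0, 0, 0, 40, 120, 10, 0, 0]
print("cloud node-intervals:", sum(cloud))      # 170
print("nodes if sized for peak:", max(demand))  # 200
```

Under these assumed numbers the site owns 80 nodes instead of the 200 it would need to cover the peak, and rents capacity only in the three intervals where demand exceeds the local base.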
In summary, let’s try to forget about the hype and concentrate on the complementary functionality provided by both paradigms. My message to the user community: the relevant issue is to evaluate which technology meets your requirements. It is unlikely that a single technology will meet all needs. My message to the Grid community: please do not see Cloud as a threat. Virtualization and Cloud are needed to overcome many of the technical barriers to wider Grid adoption. My message to the Cloud community: please try to take advantage of the research and development performed by the Grid community in the last decade.
Ignacio Martín Llorente