HPC, Grid and Cloud Computing in Cetraro

I am attending the INTERNATIONAL ADVANCED RESEARCH WORKSHOP ON HIGH PERFORMANCE COMPUTING AND GRIDS in Cetraro (Italy). This is the 9th edition of the workshop organized by Prof. Lucio Grandinetti. I have to say the venue of the workshop, the Grand Hotel San Michele, is just perfect. The panel of speakers includes representatives of the most relevant Grid and HPC research initiatives and technologies around the world. The abstracts of the presentations are available online at the workshop site.

Cloud Computing for on-Demand Resource Provisioning

This is the title of the talk that I gave at the Workshop. The aim of the presentation was to show the benefits of separating resource provisioning from job execution management in different deployment scenarios. Within an organization, incorporating a new virtualization layer under existing Cluster and HPC middleware stacks decouples the execution of the computing services from the physical infrastructure. The dynamic execution of worker nodes on virtual resources, supported by virtual machine managers such as the OpenNebula Virtual Infrastructure Engine, provides multiple benefits, such as cluster consolidation, cluster partitioning and heterogeneous workload execution. When the computing platform is part of a Grid Infrastructure, this approach additionally provides generic execution support, allowing Grid sites to adapt dynamically to changing VO demands and thus overcoming many of the obstacles to Grid adoption.

The previous scenario can be modified so that the computing services are executed on a remote virtual infrastructure. This is the resource provisioning paradigm implemented by some commercial and scientific infrastructure Cloud Computing solutions, such as Globus VWS or Amazon EC2, which provide remote interfaces for the control and monitoring of virtual resources. In this way a computing platform can scale out using resources provided on demand by a provider, supplementing local physical computing services to satisfy peak or unusual demands. Cloud interfaces can also support the federation of virtualization infrastructures, allowing virtual machine managers to access resources from remote resource providers or Cloud systems in order to meet fluctuating demands. The OpenNebula Virtual Infrastructure Engine is being enhanced to access on-demand resources from EC2 and Globus-based clouds. This scenario is being studied in the context of RESERVOIR (Resources and Services Virtualization without Barriers), an EU-funded initiative.
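The scale-out idea above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not how OpenNebula or any cloud interface actually decides: the names ClusterState and remote_vms_needed, the slots_per_vm parameter and the max_remote_vms cap are all hypothetical. It simply computes how many remote virtual machines would be needed to absorb the jobs that the local infrastructure cannot run.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Snapshot of a local cluster (all fields are illustrative assumptions)."""
    pending_jobs: int          # jobs waiting in the local queue
    local_free_slots: int      # idle job slots on local worker nodes
    slots_per_vm: int = 1      # hypothetical: job slots offered by one remote VM
    max_remote_vms: int = 10   # hypothetical cap on remote VMs (e.g. a budget limit)


def remote_vms_needed(state: ClusterState) -> int:
    """Return how many remote VMs to request so that peak demand is covered.

    Local slots are used first; only the overflow is sent to the remote
    provider, and the request never exceeds the configured cap.
    """
    overflow = max(0, state.pending_jobs - state.local_free_slots)
    wanted = math.ceil(overflow / state.slots_per_vm)
    return min(wanted, state.max_remote_vms)
```

For example, with 20 pending jobs, 8 free local slots and 4 slots per remote VM, the overflow of 12 jobs would call for 3 remote VMs; when local capacity is sufficient, nothing is requested.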

Download the slides

Towards a New Model for the Infrastructure Grid

This is the title of my contribution to the Panel “From Grids to Cloud Services”, chaired by Charlie Catlett, at the Workshop. The aim of the presentation was to open the discussion on the future of compute grid infrastructures: a move from infrastructures for sharing basic resource services to infrastructures for sharing hardware resources. A widely distributed virtual infrastructure, inspired by the federation of cloud systems as providers of virtualized resources (hardware) as a service, would not require end users to learn new interfaces or port their applications to the expected runtime environment. Resources would be shared at the resource level, so local job managers could scale out to partner or commercial clouds, transparently to end users. This new model provides additional benefits, such as support for any service and seamless integration with any service middleware stack, at the cost of the virtualization overhead in the execution of the jobs.

It was very interesting to share this position on cloud computing with other researchers from the Grid and HPC fields. So the question is: are the existing compute Grid Infrastructures going to evolve into Grids of Clouds? In other words, which model is better for end users and site administrators: sharing basic infrastructure services or sharing the physical infrastructure?

Download the slides

Ignacio Martín Llorente

3 thoughts on “HPC, Grid and Cloud Computing in Cetraro”

  1. Snehal Antani

    This sounds like a very interesting conference. I’d like to hear your thoughts on some of the concepts proposed in one of my technical articles on the subject of high performance grids, what I call “enterprise grids” (which are grids bounded by the “walls” of the data center), and how other technical domains like virtualization and data grids become important. Enterprise grids are far more limited in terms of available resources, and are far more demanding with regard to security, disaster recovery, and management.

    Though the article describes concepts of enterprise grids within the context of a technology product, the concepts themselves are still relevant beyond the commercial domain:

    http://www.ibm.com/developerworks/websphere/techjournal/0804_antani/0804_antani.html

  2. Snehal Antani

    So in response to a blog post that positions Grid and Cloud computing, I’ve expanded on my thoughts in this area. You can read the discussions at: http://www-128.ibm.com/developerworks/forums/thread.jspa?threadID=214794&tstart=0

    The post is as follows: I see “Grid Computing” as the coordinated execution of a complex task across a collection of resources. The burden of the “grid” is to provide control over that execution (initiating the execution, stopping the execution, aggregating results of the execution, reporting the status of the execution, and so on).

    The “grid” should be transparent to the end user. Take SETI@Home for example. The “User” is the researcher that will analyze the results of all of the computations. The SETI@Home platform manages partitioning the data into discrete chunks, dispatching & monitoring the results across the collection of CPU’s spread across the world, and aggregating the results to provide a single view to the “user”.

    Cloud Computing, on the other hand, provides the virtualized infrastructure upon which the Grid Endpoints will execute. So for example, the “cloud” would provide an “infinite” number of operating system images on which the SETI@Home software would execute. The cloud shouldn’t care about application-specific data, nor should it care about the business logic that is actually executing within a virtualized image. The cloud cares about allocating new images (synonymous to LPAR’s) for applications to run, keeping track of how much physical resources (actual CPU cycles for example) the virtual images consumed, cleaning up the virtual images upon completion, and billing the client for the amount of resources consumed. So with these definitions, going back to my example of SETI@Home, I would argue that this software has both a grid computing component and a cloud computing component (where the # of registered computers is part of a pool of hardware resources that already have the SETI@Home grid application containers installed and ready to go), but we should be sure to see 2 separate components and responsibilities: the decision to pick a physical machine to dispatch to, and the grid container that executes the scientific processing.
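    The division of responsibilities described above can be illustrated with a small Python sketch. It is not code from SETI@Home, Amazon EC2 or any IBM product; the class InfrastructureCloud, the function run_on_grid, the flat price per CPU-second and the "one CPU-second per data item" accounting are all hypothetical assumptions chosen to keep the example short. The cloud side only allocates images, meters usage, cleans up and bills; the grid side partitions the data, dispatches the work and aggregates the results.

```python
from dataclasses import dataclass, field


@dataclass
class InfrastructureCloud:
    """Hypothetical cloud: knows nothing about the application it hosts."""
    price_per_cpu_second: float
    _usage: dict = field(default_factory=dict, init=False)  # image id -> CPU-seconds
    _next_id: int = field(default=0, init=False)

    def allocate_image(self) -> int:
        """Allocate a new virtual image and start metering it."""
        image_id = self._next_id
        self._next_id += 1
        self._usage[image_id] = 0.0
        return image_id

    def record_usage(self, image_id: int, cpu_seconds: float) -> None:
        """Track the physical resources consumed by an image."""
        self._usage[image_id] += cpu_seconds

    def release_and_bill(self, image_id: int) -> float:
        """Clean up the image and return the charge for what it consumed."""
        return self._usage.pop(image_id) * self.price_per_cpu_second


def run_on_grid(cloud, data, chunk_size, work):
    """Hypothetical grid layer: partition, dispatch, aggregate.

    Returns the aggregated result and the total bill from the cloud.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    partial_results = []
    total_bill = 0.0
    for chunk in chunks:
        image = cloud.allocate_image()             # cloud responsibility
        partial_results.append(work(chunk))        # grid container runs the business logic
        cloud.record_usage(image, float(len(chunk)))  # assumed: 1 CPU-second per item
        total_bill += cloud.release_and_bill(image)   # cloud responsibility
    return sum(partial_results), total_bill
```

    Note that InfrastructureCloud never sees the data or the work function, which mirrors the point that the cloud should not care about application-specific data or business logic.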

    To summarize, grid applications and therefore the “Grid Computing” paradigm, which I consider an application architecture and containers for running the business logic, would execute on top of an “infrastructure cloud”, which appears as an infinite # of LPAR’s.

    BTW, we’ve had the ability to run ‘private clouds’ for 30-40 years (multi-tenancy via S/390 & MVS), and we do it all over the place today. The key difference is that today, w/ Amazon EC2 for example, we can dynamically create and then destroy complete ‘LPARS’ relatively cheaply; whereas in the mainframe and other big iron hardware, LPAR’s tend to be statically defined. In both cases the hardware is virtualized under the covers, some sort of VM/hypervisor contains the operating system image, some type of application server or container executes the business logic, and some type of workload manager assures workload priorities and provides the chargeback.

    I’ve alluded to some of this in my article on ‘enterprise grid and batch computing’: http://www-128.ibm.com/developerworks/websphere/techjournal/0804_antani/0804_antani.html.

    The Parallel Job Manager (described in that article) in WebSphere XD Compute Grid would essentially be the Grid Manager, whose job is to coordinate the execution of complex tasks across the cluster of resources (Grid Execution Environments). Today we don’t discuss the ability to dynamically create new LPAR’s (and therefore call ourselves a cloud computing infrastructure), but you can easily do this with a product like Tivoli Provisioning Manager. Basically, take the bottom image in my article: http://www-128.ibm.com/developerworks/websphere/techjournal/0804_antani/0804_antani.html#xdegc and connect Tivoli Provisioning Manager to the On-Demand Router (part of WebSphere Virtual Enterprise).
