Finding new RESERVOIRs of Resources: Europe’s Largest Grid Project Moves Closer to Cloud-style Computing

The integration of two clans of computation, ‘grid’ and ‘cloud’ computing, is moving closer through collaboration between the projects Enabling Grids for E-sciencE (EGEE) and  Resources and Services Virtualisation without Barriers (RESERVOIR).

The two teams will work together to explore how the institutes providing computing resources to EGEE could benefit from adopting a ‘private cloud’ model to provide resources. Private clouds allow organisations to easily manage their own hardware resources in-house. Using virtualisation technology they can alter the provided computing to suit the work at hand. This makes it easier for them to provide the necessary infrastructure for their users, even if these needs change rapidly over time.

This collaboration will identify how the combination of RESERVOIR’s management software and existing virtualisation technology could offer new ways for EGEE to maximise the use of the resources provided to its user communities. In the future this approach could help sites to increase their resources by using commercial cloud providers during peak loads.

“Throughout EGEE, our partners invest considerable funds in the purchase and management of computing clusters,” said Steven Newhouse, EGEE’s Technical Director. “This partnership with the RESERVOIR project will allow us to explore how their software could give EGEE’s resource centres greater flexibility in how they deliver their services to our worldwide grid infrastructure.”

EGEE currently provides resources to many scientific domains, each of which has different computational requirements and application environments. RESERVOIR offers EGEE sites the ability to easily meet the changing needs of their users, from scaling up services to meet peak loads and improving redundancy, to changing the resources provided to run particular applications. The RESERVOIR virtualisation manager builds on the open source project OpenNebula, which has been developed at the Distributed Systems Architecture Research Group at Universidad Complutense de Madrid. The group’s aim is to make management of cloud resources easier using virtual machine technology.

“This partnership between the largest European Grid project and the flagship European research initiative in cloud computing is a natural step given the many benefits of virtualisation on Grid computing,” said Ignacio M. Llorente, leader of the RESERVOIR Activity on VM Management and co-leader of OpenNebula. “This is only the beginning. I think that Grid and Cloud computing will coexist and cooperate at different levels in future e-Infrastructures.”

A short video demonstrating RESERVOIR technology can be seen at GridCast. The video shows the demo “Scaling out EGEE sites on Amazon EC2 with OpenNebula”, which won the best demo award at the 4th EGEE User Forum/OGF 25 and OGF Europe’s 2nd International Event.

Source: EGEE Newsletter Summer 2009

High Performance and Grid Computing in the Cloud

The HPCcloud discussion group has been created to address the growing interest in High Performance Computing and Grid Computing in the Cloud. The purpose of this group is to present experiences and scenarios from individuals, organizations and projects that illustrate how Cloud computing can enhance the different types of distributed and high performance computing infrastructures in science and engineering. The group covers the following aspects of the innovative potential, benefits and challenges of new Cloud technologies and services in High Performance Computing (HPC) and Grid Computing research and business:

  • Cultural, security, political and legal barriers to implementing Cloud provisioning models in HPC and Grid environments
  • Architectures for integration of Cloud technologies and services with HPC and Grid infrastructures
  • Standardization of interactions between HPC and Grid platforms and Cloud infrastructures
  • Limitations of existing Cloud services and technologies for the capability and capacity computing demands of the HPC and Grid communities in the execution of both tightly-coupled HPC and loosely-coupled HTC applications
  • HPC Clouds offering platforms with HPC devices and configurations, and Scientific Clouds offering specific services for the scientific and technical computing community
  • Impact of virtualization on the performance of memory, CPU and I/O intensive, and latency sensitive applications, and virtualization support for specialized communication transports
  • Service and infrastructure scalability and elasticity management for the efficient execution of virtualized HPC and Grid platforms
  • Challenges of porting HPC applications to the Cloud and new computing paradigms for HPC on Cloud

You are invited to use this group to promote your events related to HPC and/or Grid Computing in the Cloud.


Ignacio M. Llorente

Wiley Book on Cloud Computing: Final Call for Chapters

The deadline to submit chapter proposals for the book “Cloud Computing: Principles and Paradigms” is May 30th 2009. The call for chapters is available at http://www.manjrasoft.com/CloudBook/

If you require more details, you can contact the editors or the editorial advisory board members:

Primary Editor – Contact Person:

Dr. Rajkumar Buyya
CEO, Manjrasoft, Melbourne, Australia: http://www.manjrasoft.com
Director, Grid Computing and Distributed Systems Laboratory
Dept. of Computer Science and Software Engineering
The University of Melbourne, Australia
Email: raj@csse.unimelb.edu.au

Co-Editors:
Dr. James Broberg
Grid Computing and Distributed Systems Laboratory
Dept. of Computer Science and Software Engineering
The University of Melbourne, Australia
Email: brobergj@csse.unimelb.edu.au

Prof. Andrzej M. Goscinski
School of Information Technology
Deakin University, Geelong, Australia
Email:  ang@deakin.edu.au

Editorial Advisory Board

  • Dr. Geng Li, CISCO Systems, USA
  • Prof. Manish Parashar, Rutgers: The State Univ. of New Jersey, USA
  • Dr. Wolfgang Gentzsch,  Max-Planck-Gesellschaft, München, Germany
  • Prof. Omer Rana, Cardiff University, UK
  • Prof. Hai Jin, Huazhong University of Science and Technology, China
  • Dr. Simon See, Sun Microsystems, Singapore
  • Dr. Greg Pfister, Distinguished Engineer, IBM, USA (retd.)
  • Prof. Ignacio M. Llorente, Universidad Complutense de Madrid, Spain
  • Prof. Geoffrey Fox, Indiana University, USA
  • Dr. Walfredo Cirne, Google, USA

Ignacio M. Llorente

The Infrastructure Quadrant

The Cloud Computing movement is a melting pot of distributed technologies and paradigms that produces new terms at an incredibly fast pace. It is therefore often difficult for newcomers to figure out how to take advantage of these new technologies, or whether they fit into their current IT infrastructure at all.

So, let us try to classify the current infrastructure provisioning trends (the IaaS brand of Cloud Computing) using two simple parameters: where do you obtain the resources for your applications (locally or remotely), and how are those resources obtained (physical or virtualized):

  • Own site (Local - Physical): This is the classical provisioning scheme that we’ve been using for years, one service one machine; not much to say here.
  • Grid (Remote - Physical): the resources are obtained from a remote site for a specific service, e.g. batch job processing in scientific Grids, or web applications in a typical hosting scenario. In this case you get fixed configurations with limited control over the remote resources.
  • Private Cloud (Local - Virtual): the resources are obtained from your own infrastructure in the form of Virtual Machines. So you can obtain the classical benefits of virtualization (e.g. consolidation, isolation, easy replication of configurations…) but for your infrastructure as a whole and not just for one server.
  • The Cloud (Remote - Virtual): the resources are obtained from an external (cloud) provider. Unlike the Grid, and thanks to VMs, you have total control of the resources you are “buying”: you can install what you need. Usually the provider in this case is another company, but it could be a partner, in which case it is called a Community Cloud.
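The two parameters above can be written down as a small lookup table. The following Python sketch is only an illustration of the quadrant; the names `Location`, `Form`, `QUADRANT` and `classify` are made up for this example and do not come from any cloud toolkit.

```python
from enum import Enum


class Location(Enum):
    """Where the resources come from (first parameter of the quadrant)."""
    LOCAL = "local"
    REMOTE = "remote"


class Form(Enum):
    """How the resources are obtained (second parameter of the quadrant)."""
    PHYSICAL = "physical"
    VIRTUAL = "virtual"


# Illustrative table: one entry per cell of the quadrant described above.
QUADRANT = {
    (Location.LOCAL, Form.PHYSICAL): "Own site",
    (Location.REMOTE, Form.PHYSICAL): "Grid",
    (Location.LOCAL, Form.VIRTUAL): "Private Cloud",
    (Location.REMOTE, Form.VIRTUAL): "The Cloud",
}


def classify(location: Location, form: Form) -> str:
    """Return the name of the quadrant cell for a provisioning scheme."""
    return QUADRANT[(location, form)]


if __name__ == "__main__":
    for (location, form), name in QUADRANT.items():
        print(f"{location.value:>6} / {form.value:<8} -> {name}")
```

A scheme that mixes two cells, such as the provisioning examples reviewed next, is then simply a pair of lookups.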

Let’s briefly review three resource provisioning examples in use:

Classical IT Outsourcing (Own site + Grid). This is the well accepted provisioning scheme adopted by many companies. Some of the core services are hosted in the in-house infrastructure and others are moved to external hosting. Research centers usually use this model to store and analyze large amounts of data, such as those generated by the LHC, or to solve grand challenge applications.
Cloud Outsourcing (Own site + Cloud). Similar to the above, but you get VMs instead of pre-configured environments to support your service workload. In this case, the VMs can be configured to register with the local services (e.g. a clustered web server), so the capacity assigned to the service can grow with its demands.
The Hybrid Cloud (Private Cloud + Cloud). Nowadays the use of Virtual Machines is common practice, for example to easily set up development and testing environments. This model can be combined with a Cloud if some of the VMs are obtained from a remote provider, typically to satisfy peak demands.

With this quadrant you can probably better plan the resource provisioning strategy for your site, or understand what they are trying to sell you!

Ruben S. Montero

Interfaces for Private and Public Cloud Computing

An entire ecosystem is evolving around cloud computing. Interface standardization efforts, commercial products, cloud infrastructure and management services, virtual appliance providers and open-source solutions are filling niches in the cloud ecosystem. The role and position of a component or a service in the ecosystem are defined by its capabilities, the consumers of those capabilities and its relationship with other components and services.

This article presents public and private cloud computing from the perspective of their different application scope and interfaces.

Interfaces for Public Cloud Computing

Public or external clouds offer virtualized resources as a service, enabling the deployment of an entire IT infrastructure without the associated capital costs, paying only for the used capacity. Amazon EC2, ElasticHosts, GoGrid and FlexiScale are examples of commercial cloud providers of elastic capacity, offering a public interface for remote management of virtualized server instances within their proprietary infrastructure. With the growing popularity of these cloud offerings, an ecosystem of tools is emerging that can be used to transform an organization’s existing infrastructure into a public cloud. Technologies, such as Globus Nimbus or Eucalyptus, provide an open-source implementation of cloud-like public interfaces, and projects, such as RESERVOIR, are developing open-source toolkits for building any cloud architecture.

The standardization of a public cloud interface is the aim of the OGF Open Cloud Computing Interface Working Group. OCCI-WG is delivering an API specification for remote management of cloud computing infrastructure, allowing for the development of interoperable tools for common tasks on public clouds including deployment, autonomic scaling and monitoring. The main consumers of this API would be service management platforms, technologies for building hybrid clouds, or service providers. The working group keeps a complete list of existing cloud APIs and a list of references to studies comparing the APIs. The requirements for the new specification are being extracted from a collection of use cases contributed by the community. The working group is being supported by relevant companies and open-source initiatives in the cloud computing ecosystem.

Interoperability is not only about standardization of interfaces, but also about portability of virtual machines. The DMTF Open Virtualization Format (OVF) can be used as a means for customers of an IaaS provider to express their infrastructural needs. OVF was not designed with cloud computing in mind, so there are issues that need to be solved when it is applied to this environment, in particular regarding automatic elasticity, self-configuration and deployment constraints. In any case, standards for cloud interoperability (OCCI) and virtual machine portability (OVF) are imminent and many providers are planning to adopt them.

Interfaces for Private Cloud Computing

On the other hand, there is a growing interest in tools for leasing compute capacity from the local infrastructure. The aim of these deployments is not to expose to the world a cloud interface to sell capacity over the Internet, but to provide local users with a flexible and agile private infrastructure to run service workloads within the administrative domain. This private or enterprise cloud model is not new, since datacenter management has been around for a while. In fact, I would venture that future datacenters will look like private clouds. Platform VM Orchestrator, VMware vSphere, Citrix Cloud Center, and Red Hat Enterprise Virtualization Manager are commercial tools for the management of virtualized services in the datacenter, and so are aimed at building private clouds. The OpenNebula Virtual Infrastructure Engine (now part of Ubuntu) is an open-source alternative for private cloud computing, which also supports hybrid cloud deployments to supplement local infrastructure with computing capacity from an external cloud.

Private cloud interfaces should therefore allow the integration of the virtualized distributed infrastructure into the datacenter management stack, including user and administration support. A private cloud interface should provide rich enough semantics, far beyond those provided by public clouds, to ease this integration. Such an interface should provide additional functionality for virtualization, networking, image and physical resource configuration, management, monitoring and accounting that is not exposed by public cloud interfaces.

The standardization of a private cloud interface may be the aim of the new DMTF Cloud Computing Incubator, given that, according to its charter, one of its benefits is to enable the use of cloud computing within enterprises. The DMTF Open Cloud Standards Incubator Leadership Board currently includes most of the main providers and integrators of private cloud solutions. On the other hand, although conceived as a library to interface with different virtualization technologies, the libvirt virtualization API could also be used as an interface for private cloud computing. This is the approach represented by the libvirt implementation of OpenNebula. The implementation of libvirt on top of a virtual infrastructure manager provides an abstraction of a whole cluster of resources (each one with its hypervisor), so a whole cluster can be managed like any other libvirt node.

About Using Public Interfaces for Private Cloud Deployments

The usage of public cloud interfaces to access the local infrastructure would reduce the cost of learning a new interface when moving from a private to a public cloud, but at the expense of providing local users with limited functionality, losing the comfort and control of data center operations, and using, within the administration domain, communication protocols and security mechanisms originally created for remote management. Moreover, several local cloud technologies support cloudbursting to build hybrid clouds, combining local infrastructure with public cloud-based infrastructure and enabling highly scalable hosting environments.

That does not mean, of course, that you cannot expose a public interface on top of your private cloud solution, for example if you want to provide partners or external users with access to your infrastructure, or to sell your overcapacity. Obviously, a local cloud solution is the natural back-end for any public cloud.

Ignacio M. Llorente

New Standardization Working Group on Cloud Computing Interface

After the successful BoF session on Cloud Computing API that we organized at Open Grid Forum 25 to define the charter for a new Working Group to deliver a standard API for “IaaS” clouds, we are happy to announce that the Open Grid Forum (OGF) has officially launched the Open Cloud Computing Interface Working Group (OCCI-WG).

The OGF Open Cloud Computing Interface (OCCI) working group will deliver an API specification for remote management of cloud computing infrastructure, allowing for the development of interoperable tools for common tasks including deployment, autonomic scaling and monitoring. The scope of the specification will be all high level functionality required for the life-cycle management of virtual machines (or workloads) running on virtualization technologies (or containers) supporting service elasticity.

The new API for interfacing “IaaS” Cloud computing facilities will allow for:

  • Consumers to interact with cloud computing infrastructure on an ad-hoc basis (e.g. deploy, start, stop, restart)
  • Integrators to offer advanced management services
  • Aggregators to offer a single common interface to multiple providers
  • Providers to offer a standard interface that is compatible with available tools
  • Vendors of grids/clouds to offer standard interfaces for dynamically scalable service delivery in their products
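To make the idea of life-cycle management more concrete, the ad-hoc operations mentioned above (deploy, start, stop, restart) can be modelled as a tiny state machine. This Python sketch is purely illustrative: the state names and the allowed transitions are assumptions made for this example and are not taken from the OCCI specification, which had not been published at the time of writing.

```python
# Hypothetical states and transitions, chosen for illustration only.
# They are NOT taken from the OCCI specification.
TRANSITIONS = {
    ("undeployed", "deploy"): "stopped",
    ("stopped", "start"): "running",
    ("running", "stop"): "stopped",
    ("running", "restart"): "running",
}


def apply_operation(state: str, operation: str) -> str:
    """Return the state reached by applying one operation, or raise ValueError."""
    try:
        return TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(
            f"cannot {operation!r} a workload that is {state!r}"
        ) from None


def run(operations, state: str = "undeployed") -> str:
    """Apply a sequence of operations and return the final state."""
    for operation in operations:
        state = apply_operation(state, operation)
    return state


if __name__ == "__main__":
    print(run(["deploy", "start", "restart", "stop"]))
```

A standard interface would let every consumer, integrator or aggregator rely on one agreed set of operations of this kind, whatever provider sits behind it.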

The OCCI working group invites your participation. Subscribe to the mailing list, and join the teleconference.

OGF OCCI-WG is being coordinated by Thijs Metsch (Sun Microsystems), Ignacio M. Llorente (DSA-research/UCM and OpenNebula), Alexis Richardson (Rabbit Technologies and CohesiveFT), and Sam Johnston (Australian Online Solutions).

Ignacio M. Llorente

Building Private and Hybrid Clouds with Ubuntu 9.04

Ubuntu 9.04 (Jaunty Jackalope) has been released today, bringing highly interesting new features, especially in the Cloud Computing and Virtualization area. The new Ubuntu server distribution includes two complementary cloud tools, OpenNebula and Eucalyptus, thus providing the technology required to build the three types of Cloud architectures, namely private, hybrid and public clouds.

Eucalyptus can be used to transform an existing infrastructure into an IaaS public cloud, being compatible with Amazon’s EC2 interface. Eucalyptus is fully functional with respect to providing cloud-like interfaces and higher-level cloud functionality for security, contextualization and image management. OpenNebula, on the other hand, is a virtual infrastructure engine that enables the dynamic and scalable deployment and re-placement of groups of interconnected virtual machines within and across sites. OpenNebula can be primarily used as a virtualization tool to manage a distributed virtual infrastructure in the datacenter or cluster. This application is usually referred to as a private cloud. OpenNebula can also dynamically scale the local infrastructure using external clouds, thereby building hybrid clouds. OpenNebula provides dynamic “cloudbursting” to any cloud with Amazon EC2 interfaces, including Eucalyptus-based clouds.
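The idea behind cloudbursting can be sketched in a few lines of Python: fill the local infrastructure first, and send only the overflow to an external cloud. This is a minimal illustration under that single assumption; it is not OpenNebula's scheduler, and the function name `place_vms` and its parameters are hypothetical.

```python
def place_vms(requested: int, local_free: int) -> dict:
    """Split a request for VMs between local hosts and an external cloud.

    requested  -- number of VMs the service asks for
    local_free -- number of VMs the local infrastructure can still host
    Local capacity is used first; only the remainder is "burst" outside.
    """
    if requested < 0 or local_free < 0:
        raise ValueError("requested and local_free must be non-negative")
    local = min(requested, local_free)
    return {"local": local, "external": requested - local}


if __name__ == "__main__":
    # Normal load: everything fits locally (private cloud only).
    print(place_vms(requested=3, local_free=6))
    # Peak load: the overflow goes to the external provider (hybrid cloud).
    print(place_vms(requested=10, local_free=6))
```

A real placement policy would also weigh cost, data locality and SLA commitments, which this sketch deliberately ignores.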

OpenNebula is building an ecosystem of tools that extend its functionality, such as the Haizea lease management system, a libvirt implementation on top of OpenNebula, or a VM consolidation scheduler for GreenIT. The project provides support to host the development of new ecosystem projects.

Moreover, because OpenNebula is one of the technologies being enhanced in RESERVOIR, the flagship European research initiative in virtualized infrastructures and cloud computing, several new components will be available in a few months, complementing its functionality for service elasticity management, VM placement to meet SLA commitments, support for public cloud interfaces…

Ignacio M. Llorente

HotCloud ‘09 Call for Papers

The first edition of the Workshop on Hot Topics in Cloud Computing (HotCloud ‘09) will be co-located with the 2009 USENIX Annual Technical Conference (USENIX ‘09), which will take place June 14–19, 2009, in San Diego. The paper submission deadline is April 15th, 2009.

Ignacio Martín Llorente

GridWay Project Ideas at Google Summer of Code 2009

This year, our Research Group is glad to participate in the Google Summer of Code.

At this very moment, two project ideas inspired by the GridWay Metascheduler are being offered to students worldwide:

  • GridWay + Google Maps Mashup: GridWay users retrieve information about the Grid resources they have access to and their jobs using the commands gwhost and gwps. Even if the given information is complete, a Command Line Interface doesn’t provide the user the big picture of what’s going on. If a picture is worth a thousand words, a GoogleMap will be worth a million. The objective is to provide a wrapper to these commands and use the GoogleMaps API to represent geographically the Grid activity (More Info).
  • Develop a GUI for GridWay: GridWay provides its users with a complete and powerful set of commands for job submission and monitoring. However, not all users are made for a Command Line Interface, so GridWay is asking for a neat Graphical User Interface. With this GUI, users should be able to compose and manage their jobs. There isn’t a preferred technology for this implementation (GTK+, Qt, Java, …) so a reasoned choice would be accepted (More Info).

Student application period opens on March 23rd and closes on April 3rd. Check the timeline for more information about important dates!

José Luis Vázquez-Poletti

Alejandro Lorca

Presenting the new GridWay adapter for CREAM infrastructure

Previous entries stressed the importance of interoperability in Grid computing and the use of adapters to achieve it. To this end, GridWay provides a set of adapters based on Middleware Access Drivers (MADs), which allow the interoperation of three important infrastructures, namely EGEE (pre-WS gLite-based), TeraGrid and OSG (both Globus-based). The WS-based CREAM service provides a set of job management operations via the Web Service Description Language (WSDL). This new service is incompatible with the LCG pre-WS infrastructure. Therefore, a new GridWay adapter, based on MAD drivers, has to be developed to enable interoperability between CREAM and the rest of the existing infrastructures.

CREAM Service

The CREAM (Computing Resource Execution And Management) service is a lightweight and simple service that accepts job submission and management operations at the Computing Element (CE) level. These requests are described using the Job Description Language (JDL). JDL is a user-oriented, high-level language based on Condor classified advertisements (classads) for describing jobs to be submitted to the CREAM CE service.
In order to execute job operations on the CREAM infrastructure, the user proxy credential must be delegated. This credential is used when operations requiring security support have to be performed by the job. To delegate the proxy credential the user must execute the command glite-ce-delegate-proxy, in the case of new delegations, or glite-ce-proxy-renew, to renew an existing delegation.
CREAM provides a CLI and a C++ API to submit and manage CREAM jobs. We have used the C++ API to develop a new adapter that allows the submission, cancellation and polling of CREAM jobs using the GridWay metascheduler.

CREAM MAD for GridWay Metascheduler

The new GridWay execution adapter for CREAM (named gw_em_mad_cream) allows the execution of jobs on the EGEE infrastructure with CREAM-based CEs. Therefore, using a single GridWay instance with the appropriate adapters we can interoperate with LCG pre-WS and CREAM infrastructures. The figure below shows an example of GridWay interoperation between LCG pre-WS and CREAM.

GridWay provides a wrapper module that includes all the information needed to execute a job. Because the CREAM nodes do not share home directories, the GridWay wrapper is responsible for transferring the executable input and output files. These transfers are done via the GASS (Global Access to Secondary Storage) server, installed in the User Interface (UI) node, and the globus-url-copy command. However, we found some difficulties when transferring the wrapper input and output files using the CREAM JDL template. The CREAM Job Description Language specifies that stage-in must be done by the user before the job (in this case, the wrapper module) executes. Therefore, the CREAM MAD must include the globus-url-copy commands needed to transfer the wrapper input files via the GridFTP server installed in the CE (Computing Element) node. Stage-out, on the other hand, is done automatically by CREAM using a GridFTP server that must be installed in the destination node. However, the standard UI specification does not include the installation of a GridFTP server. To work around this restriction, the CREAM MAD executes an epilogue module, specified in the JDL, which transfers the wrapper output files via the GASS server installed in all standard UI nodes.

Finally, it is important to note that the current implementation of the CREAM MAD obtains the job status synchronously. GridWay is therefore responsible for checking the job status by sending a poll request to this MAD. However, CREAM provides a way to obtain the job status asynchronously using ICE (a gSOAP/C++ intermediate layer) and the CEMon service. Hence, we will use the CEMon service in the next CREAM adapter release to obtain the CREAM job status asynchronously via the GridWay metascheduler.
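The synchronous scheme amounts to a polling loop: the caller repeatedly asks for the job status until a final state is reached. The Python sketch below illustrates only that pattern. It does not call the CREAM API or the MAD protocol; the function `poll_until_done`, its parameters and the state names are hypothetical, and the status source is simulated.

```python
import time


def poll_until_done(get_status, interval=0.0, max_polls=100,
                    final_states=("DONE", "FAILED")):
    """Poll get_status() until it returns a final state.

    get_status   -- callable returning the current job state as a string
                    (stands in for a poll request; hypothetical)
    interval     -- seconds to wait between polls
    max_polls    -- give up after this many polls
    final_states -- illustrative state names, not CREAM's real ones
    Returns a (state, number_of_polls) tuple.
    """
    for attempt in range(1, max_polls + 1):
        status = get_status()
        if status in final_states:
            return status, attempt
        time.sleep(interval)
    raise TimeoutError(f"job not finished after {max_polls} polls")


if __name__ == "__main__":
    # Simulated job that needs three polls to finish.
    simulated = iter(["PENDING", "RUNNING", "DONE"])
    print(poll_until_done(lambda: next(simulated)))
```

With asynchronous notification the loop disappears: the service pushes each state change to the client, which saves the polls that return no new information.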

Simple CREAM job execution example

This section shows the behavior of the GridWay adapter for CREAM in the execution of simple jobs. First, we configure the GridWay metascheduler to use the CREAM adapter by including the following lines in the gwd.conf file:

###############################################
# Example MAD Configuration for CREAM testbed #
###############################################
IM_MAD = mds2_static:gw_im_mad_static:-l etc/im_examples/cream.hosts:dummy:cream
EM_MAD = cream:gw_em_mad_cream::jdl
TM_MAD = dummy:gw_tm_mad_dummy:-g
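As a reading aid, the three lines above follow a common shape: a key, an equals sign, and a list of colon-separated fields. The Python sketch below splits them accordingly so that the fields can be inspected side by side. It is not GridWay's own configuration parser and it deliberately assigns no meaning to the individual fields; the function name `parse_mads` is made up for this example.

```python
# The configuration lines are copied from the listing above.
CONFIG = """\
# Example MAD Configuration for CREAM testbed
IM_MAD = mds2_static:gw_im_mad_static:-l etc/im_examples/cream.hosts:dummy:cream
EM_MAD = cream:gw_em_mad_cream::jdl
TM_MAD = dummy:gw_tm_mad_dummy:-g
"""


def parse_mads(text):
    """Return a list of (key, fields) pairs, one per non-comment line.

    Each value is split on ':' only; empty fields are kept as ''.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        entries.append((key.strip(), value.strip().split(":")))
    return entries


if __name__ == "__main__":
    for key, fields in parse_mads(CONFIG):
        print(key, fields)
```

Note the empty third field in the EM_MAD line, which the split preserves as an empty string.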

Next, we execute the gwd command and list the execution hosts with the gwhost command. In the screen capture below we can see that the CREAM machine is included in the list of available resources.

Now, to demonstrate GridWay interoperability, we execute two jobs, one on the CREAM infrastructure and one on the LCG infrastructure, using a GridWay Job Template and the gwsubmit command. The GridWay Job Templates used for these executions are:

###############################################
# CREAM JOB: GridWay Job Template             #
###############################################

EXECUTABLE   = exp2exec.sh
ARGUMENTS    = hello test4.jdl
INPUT_FILES  = hello, test4.jdl
OUTPUT_FILES = exec.${JOB_ID}, txt.${JOB_ID}
STDOUT_FILE  = stdout.${JOB_ID}
STDERR_FILE  = stderr.${JOB_ID}
REQUIREMENTS = HOSTNAME = "cream-12.pd.infn.it"

###############################################
# LCG pre-WS JOB : GridWay Job Template       #
###############################################

EXECUTABLE   = exp2exec.sh
ARGUMENTS    = hello test4.jdl
INPUT_FILES  = hello, test4.jdl
OUTPUT_FILES = exec.${JOB_ID}, txt.${JOB_ID}
STDOUT_FILE  = stdout.${JOB_ID}
STDERR_FILE  = stderr.${JOB_ID}
REQUIREMENTS = HOSTNAME = "gridgate.cs.tcd.ie"

As we can see, the only difference between the two templates is the HOSTNAME field. In both cases we specify the same executable with the same arguments, input and output files. However, we want GridWay to execute each job on a different machine. Therefore, we only have to include the REQUIREMENTS attribute specifying the name of the machine where we want to execute the job. GridWay uses this attribute to execute the job with the suitable adapter in each case. In this way, we demonstrate that interoperability with GridWay is transparent from the user's point of view.
Finally, we can verify the status of two jobs using the command gwhistory.
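The claim that both templates differ in a single line can be checked mechanically. The Python sketch below renders the template text shown above for each host and lists the lines that differ. The `@HOSTNAME@` placeholder and the helper names are inventions of this example, not GridWay features; the sketch only generates text and does not submit anything.

```python
# Template body copied from the listings above; @HOSTNAME@ is a
# placeholder invented for this example, not GridWay syntax.
TEMPLATE = """\
EXECUTABLE   = exp2exec.sh
ARGUMENTS    = hello test4.jdl
INPUT_FILES  = hello, test4.jdl
OUTPUT_FILES = exec.${JOB_ID}, txt.${JOB_ID}
STDOUT_FILE  = stdout.${JOB_ID}
STDERR_FILE  = stderr.${JOB_ID}
REQUIREMENTS = HOSTNAME = "@HOSTNAME@"
"""


def render(hostname):
    """Return the job template text targeting the given host."""
    return TEMPLATE.replace("@HOSTNAME@", hostname)


def differing_lines(first, second):
    """Return the pairs of lines that differ between two templates."""
    pairs = zip(first.splitlines(), second.splitlines())
    return [(a, b) for a, b in pairs if a != b]


if __name__ == "__main__":
    cream_job = render("cream-12.pd.infn.it")
    lcg_job = render("gridgate.cs.tcd.ie")
    for a, b in differing_lines(cream_job, lcg_job):
        print(a)
        print(b)
```

Only the REQUIREMENTS line is printed for each template, which is the user-visible part of choosing between the two infrastructures.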