01-02-2013, 10:20 AM
Grid Information Services for Distributed Resource Sharing
Abstract
Grid technologies enable large-scale sharing of resources
within formal or informal consortia of individuals
and/or institutions: what are sometimes called virtual organizations.
In these settings, the discovery, characterization,
and monitoring of resources, services, and computations
are challenging problems due to the considerable diversity,
large numbers, dynamic behavior, and geographical
distribution of the entities in which a user might be interested.
Consequently, information services are a vital part
of any Grid software infrastructure, providing fundamental
mechanisms for discovery and monitoring, and hence for
planning and adapting application behavior.
We present here an information services architecture that
addresses performance, security, scalability, and robustness
requirements. Our architecture defines simple low-level enquiry
and registration protocols that make it easy to incorporate
individual entities into various information structures,
such as aggregate directories that support a variety of
different query languages and discovery strategies. These
protocols can also be combined with other Grid protocols
to construct additional higher-level services and capabilities
such as brokering, monitoring, fault detection, and
troubleshooting. Our architecture has been implemented as
MDS-2, which forms part of the Globus Grid toolkit and has
been widely deployed and applied.
Introduction
Grid computing technologies enable widespread sharing
and coordinated use of networked resources [15]. Sharing
relationships may be static and long-lived—e.g., among
the major resource centers of a company or university—
or highly dynamic: e.g., among the evolving membership
of a scientific collaboration [17]. In either case, the fact
that users typically have little or no knowledge of the resources
contributed by participants in the “virtual organization”
(VO) poses a significant obstacle to their use. For this
reason, information services designed to support the initial
discovery and ongoing monitoring of the existence and
characteristics of resources, services, computations, and
other entities are a vital part of a Grid system [13].
Such information services find uses in a variety of Grid
scenarios. The following examples illustrate but do not
exhaust the range of applications that rely on information
services, and the variety of information sources and information
access and management methods that are associated
with these applications [13, 30, 36, 11, 34].
Grid Information Service Requirements
The requirements of any Grid-based information system
are driven by basic properties of the Grid environment. Information
sources are necessarily distributed and individual
sources are subject to failure. The total number of information
providers can be large, and both the types of information
sources and the ways in which information is used
can be highly varied. We examine the impact of each of
these properties on information service requirements.
Distribution of Information Providers
One consequence of distribution is that we cannot in general
provide users with accurate information: any information
delivered to a user will necessarily be “old.” Since
all information to which a Grid information service provides
access is, at some timescale, dynamic, the state of
the system component on which information is being provided
may have changed, potentially rendering the information
invalid. Because of the local policy aspect of Grid
environments, it can be expensive if not impossible to delay
changes in distributed system state until the information
has been delivered and processed by remote requestors.
Thus, we require that information producers should explicitly
model the currency and confidence of their information,
for example via timestamps and time-to-live metadata.
This approach allows users and delivery components
to manage data in a manner that is appropriate for its degree
of dynamism. We also require that an information service
transport information as rapidly and efficiently as possible
from producer to consumer.
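The timestamp and time-to-live requirement above can be illustrated with a minimal Python sketch. The class name `TimedValue`, its fields, and the sample numbers are hypothetical and are not taken from MDS-2; the sketch also assumes that producer and consumer clocks are roughly synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedValue:
    """A reported value plus the metadata needed to judge its currency."""
    name: str
    value: float
    produced_at: float  # producer's timestamp, in seconds since the epoch
    ttl_seconds: float  # how long the producer expects the value to hold

    def age(self, now: float) -> float:
        """Seconds elapsed between production and the consumer's clock reading."""
        return now - self.produced_at

    def is_current(self, now: float) -> bool:
        """True while the value is still within its time-to-live."""
        return self.age(now) <= self.ttl_seconds


# Hypothetical example: a load average reported at t=1000 s, valid for 30 s.
load = TimedValue("cpu_load", 0.42, produced_at=1000.0, ttl_seconds=30.0)
fresh_at_1010 = load.is_current(now=1010.0)  # 10 s old, inside the TTL
fresh_at_1031 = load.is_current(now=1031.0)  # 31 s old, past the TTL
```

A consumer or cache would normally pass its own clock reading (for example `time.time()`) as `now` and discard or re-request any value that is no longer current.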
Grid Information Protocol
A user, or more frequently an aggregate directory or
other program, uses the Grid Information Protocol (GRIP) to
obtain information from an information provider about the
entity or entities on which the provider possesses information.
Because an information provider can possess information
on more than one entity, GRIP supports both discovery
and enquiry. Discovery is supported via a search capability.
For example, consider an information provider that maintains
information on a set of workstations. A broker might
then perform a search on that provider to obtain a set of results
that roughly match given criteria. From the set of
discovered resources, enquiry can be used to refine the set
of resources upon which a broker may schedule. Enquiry
corresponds to a direct lookup of information: the enquiry
supplies the resource name and the provider returns the resource
description. Subscription (i.e., a request that results
in the subsequent delivery of a sequence of updates) can be
an important enquiry mode, and should be supported.
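The two access modes described above, discovery by search and enquiry by name, can be sketched in Python as follows. This is only an in-memory illustration and not the GRIP wire protocol: the class `InformationProvider`, its methods `search` and `lookup`, and the workstation data are all hypothetical. Subscription is not modeled.

```python
from typing import Callable, Dict, List

Description = Dict[str, object]


class InformationProvider:
    """Holds descriptions for several named entities (hypothetical sketch)."""

    def __init__(self, entities: Dict[str, Description]) -> None:
        self._entities = dict(entities)

    def search(self, predicate: Callable[[Description], bool]) -> List[str]:
        """Discovery: return the names of entities whose description matches."""
        return sorted(n for n, d in self._entities.items() if predicate(d))

    def lookup(self, name: str) -> Description:
        """Enquiry: supply a resource name, get back its description."""
        return dict(self._entities[name])


# Hypothetical provider that maintains information on three workstations.
provider = InformationProvider({
    "ws1": {"cpus": 4, "load": 0.9},
    "ws2": {"cpus": 2, "load": 0.1},
    "ws3": {"cpus": 8, "load": 0.2},
})

# Step 1 (discovery): a coarse search for machines with at least 4 CPUs.
candidates = provider.search(lambda d: d["cpus"] >= 4)

# Step 2 (enquiry): look up each candidate by name and keep lightly loaded ones.
schedulable = [n for n in candidates if provider.lookup(n)["load"] < 0.5]
```

In this sketch the broker narrows the candidate set in two steps, which mirrors the search-then-refine pattern in the text.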
Alternative Directory Protocols
We pointed out above the benefits of using the GRIP data
model, query language, and protocol when constructing aggregate
directories. However, one can certainly define alternatives.
For example, it has been argued [12] that relational
data models and query languages can be useful in Grid settings,
due to their ability to support join operations. (E.g.,
“find me an idle computer that is connected to an idle network.”)
Directories that maintain relational representations
of associated resources and that support SQL or some other
relational query language can of course be constructed in
this framework. Or, we can construct directories that employ
the Condor matchmaking algorithm as a query evaluation
mechanism [23] (e.g., see [38]).
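The join query quoted above ("find me an idle computer that is connected to an idle network") can be made concrete with a small relational sketch. It uses Python's standard sqlite3 module; the table names, column names, and sample rows are assumptions made for illustration and do not come from any actual directory implementation.

```python
import sqlite3


def idle_computers_on_idle_networks(conn: sqlite3.Connection):
    """Join computers to their networks and keep pairs where both are idle."""
    query = """
        SELECT c.name, n.name
        FROM computer AS c
        JOIN network AS n ON c.network = n.name
        WHERE c.idle = 1 AND n.idle = 1
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and data held by a relational aggregate directory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE network  (name TEXT PRIMARY KEY, idle INTEGER);
    CREATE TABLE computer (name TEXT PRIMARY KEY, network TEXT, idle INTEGER);
    INSERT INTO network  VALUES ('netA', 0);
    INSERT INTO network  VALUES ('netB', 1);
    INSERT INTO computer VALUES ('c1', 'netA', 1);
    INSERT INTO computer VALUES ('c2', 'netB', 1);
    INSERT INTO computer VALUES ('c3', 'netA', 0);
""")

matches = idle_computers_on_idle_networks(conn)
```

Here only `c2` qualifies: `c1` is idle but sits on a busy network, and `c3` is itself busy.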
Security
Physical and virtual organizations typically define policies
controlling who can access information about their resources.
Any Grid information service must hence incorporate
security mechanisms so that it can comply with these
policies. Security issues arise with both the enquiry protocol
(GRIP) and the registration protocol (GRRP).
Our security approach is intended to support a wide
range of access control policies. We assume that an information
provider may specify, for each piece of information
that it maintains, the credentials that must be presented to
access that information. These credentials may be identity
credentials, in which case the access control policy is
essentially an access control list, or a capability issued by
some authority, in the case of policies based, for example,
on group membership [27]. GSI public-key security mechanisms
are used to verify credentials and to achieve mutual
authentication between information consumers and information
providers.
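The per-item access control model described above can be sketched as follows. The sketch assumes that credentials have already been verified (the role GSI plays in the text) and models only the policy check. The names `Credentials`, `Policy`, and `visible_attributes`, as well as the sample identities and capability string, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Credentials:
    """Already-verified credentials presented by an information consumer."""
    identity: str
    capabilities: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Policy:
    """Access policy for one piece of information."""
    allowed_identities: FrozenSet[str] = frozenset()  # access-control list
    required_capability: Optional[str] = None         # e.g. group membership

    def permits(self, creds: Credentials) -> bool:
        if creds.identity in self.allowed_identities:
            return True
        return (self.required_capability is not None
                and self.required_capability in creds.capabilities)


def visible_attributes(record: Dict[str, object],
                       policies: Dict[str, Policy],
                       creds: Credentials) -> Dict[str, object]:
    """Return only attributes whose policy admits creds; no policy means withheld."""
    return {k: v for k, v in record.items()
            if k in policies and policies[k].permits(creds)}


# Hypothetical record: hostname guarded by an ACL, load by a capability.
record = {"hostname": "ws1.example.org", "load": 0.2}
policies = {
    "hostname": Policy(allowed_identities=frozenset({"alice"})),
    "load": Policy(required_capability="member:example-vo"),
}
alice = Credentials("alice")
bob = Credentials("bob", frozenset({"member:example-vo"}))

alice_view = visible_attributes(record, policies, alice)
bob_view = visible_attributes(record, policies, bob)
```

Withholding attributes that have no policy is a design choice made for this sketch (deny by default); the text does not state what a provider does in that case.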
Conclusions and Future Work
We have described a Grid information service architecture
that defines simple data models and registration and enquiry
protocols for Grid entities, and supports the creation
of a wide assortment of specialized information services as
well as other high-level, information-intensive services.
Our implementation of this architecture, MDS-2, has
been widely deployed in a number of different configurations
as part of the Globus 1.1.3 software release. We are
currently working to incorporate subscription-based push
methods and more sophisticated access control methods.
We also plan to explore the construction of different and
more specialized types of aggregate directories, investigate
update versus freshness tradeoffs in directory implementation,
explore applications in different settings and domains,
develop flexible configuration tools to enable lightweight
VO formation, and extend our security models to incorporate
capabilities and delegation to enable more sophisticated
directory construction and caching of information provider
values.