
This document is also available in non-normative PDF version.
Copyright © 2005 DERI, All Rights Reserved. DERI liability, trademark, document use, and software licensing rules apply.
This deliverable will identify and describe the concept of Triple Space Computing [Fensel D., 2004] (TripleSpace) and determine its scope in WSMX [Oren et al., 2004]. It will concentrate specifically on facilitating asynchronous communication between geographically distributed WSMXs by providing a shared RDF space between them. To identify the scope of TripleSpace in WSMX, a brief study of WSMX will be carried out and the relationship between WSMX and TripleSpace will be investigated. Finally, interaction between WSMXs through TripleSpace will be presented.
1. Introduction
1.1 Purpose of this Document
1.2 Document Overview
2. Web Service Execution Environment (WSMX) Overview
3. Tuple Space Computing
4. Triple Space Computing
5. Publish Subscribe Paradigm
6. Integration of Triple-Space Computing Architecture with WSMX
6.1 A close look in the WSMX Architecture
6.2 Anatomy of a WSMX Component
6.3 TripleSpace in local WSMX Configuration
7. Interaction Between WSMXs Through TripleSpace
8. Conclusions and Future Work
References
Acknowledgment
Appendix A
Change-Log
Triple Space Computing (TripleSpace) is a simple yet powerful communication paradigm that inherits the communication model of pervasive communication and projects it into the context of Semantic Web Services [Fensel D., 2004]. It is based on the evolution and integration of other known technologies such as Tuple Space Computing [Gelernter, 1992], Persistent Message-based Architecture [Alonso et al., 1995], the Semantic Web and RDF Schema [Brickley and Guha, 2004]. It provides a persistent shared space through which participating applications can interact without exchanging messages directly. This is possible because applications may write, delete, and read triples in this global space.
Web Service Execution Environment (WSMX) enables the dynamic discovery, selection, mediation, invocation, and inter-operation of Semantic Web Services [Oren et al., 2004]. In its current state, WSMX supports synchronous communication with other WSMXs and other systems. When an integration process involves a large number of systems, more than one WSMX has to be engaged to facilitate it, and one WSMX may need to communicate with other WSMXs in due course. Such communication becomes costly when WSMXs are heavily loaded and their processing time is high. In such cases, TripleSpace provides a medium for asynchronous communication that minimizes the cost of interaction between WSMXs. One of the main benefits of asynchronous communication is that, unlike in synchronous communication, participants can come and go without hindering the communication process.
This document describes the possibility of using TripleSpace in WSMX to enable distributed asynchronous communication between WSMXs. It starts with the definitions of TripleSpace and WSMX and then identifies the possible relationships between them. One of the main aims of WSMX-TripleSpace is to provide a shared space for WSMX and, with it, a new means of asynchronous communication.
An overview of WSMX is presented in section two. As part of the background information, section three introduces tuple space computing and describes some previous implementations of this concept. Triple Space Computing is presented in section four. Section five describes the publish-subscribe paradigm, while section six highlights the possible integration of TripleSpace and the WSMX architecture to enable inter-WSMX interaction. Section seven describes the possibility of asynchronous communication between WSMXs through TripleSpace. Finally, section eight concludes the document and looks at future work.
The Web Services Execution Environment (WSMX) [Zaremba et al., 2004] is an execution environment for dynamic discovery, selection, mediation and invocation of semantic web services. WSMX builds on the Web Services Modelling Ontology (WSMO) [Roman et al., 2004] that describes various aspects related to semantic web services.
WSMO is based on four concepts: web services, ontologies, goals and mediators. Web services are units of functionality; every web service has exactly one capability, which describes logically what this web service can offer. Every web service has a number of interfaces, which specify how to communicate with it. Goals describe some state that a user may want to achieve. Ontologies are the formal specification of the knowledge domain used by both the web service to express its capability, and by the goal to express the desired world state. Mediators are used to solve different interoperability problems, like differences in ontologies used by a web service and a goal.
WSMX is developed as a reference implementation of an execution environment for web services. WSMX manages a repository of web services, ontologies and mediators. WSMX can achieve a user's goal by dynamically selecting a matching web service, mediating the data that needs to be communicated to this service and invoking it.
In this section we use the term architecture to introduce the abstract software components that make up WSMX. The WSMX Manager and the Execution Engine co-ordinate the activities of WSMX following the execution semantics defined in [Oren, 2004]. All data handled inside WSMX is represented internally as an event with a type and state. The WSMX Core manages the processing of all events passing them to other components in the logic layer as appropriate. This Core could become redundant if the events are stored in the WSMX triplespace once it is implemented.
The main purpose of Service Discovery is to match usable Semantic Web Services (SWS) with goals. Selection of the web service may also happen within WSMX, and a Selection component is used for that purpose. The Mediator component provides a means of transforming data based on concepts in one ontology into data based on concepts in another ontology. The mapping is based on rules defined between concepts in the source and target ontologies. If data mediation is required and the mediated data is in a non-XML format, the XML Converter can be invoked to translate the results of the Mediator into XML. This is necessary because web services are invoked via SOAP, and SOAP messages are formatted in XML.
The Compiler component parses WSML messages received from the WSMO Editor in the User Interface layer, validates the messages against WSMO and then stores the message elements in the WSMX repository. The elements compiled to WSMX are the metadata for web services, ontologies and mediators. Once any of these elements has been compiled to WSMX, it is available for use during the execution of goals sent to WSMX. The Message Parser parses the WSML messages containing goals sent to WSMX. The goal is parsed and stored persistently. The functionality of the Message Parser is similar to that of the Compiler, but there is a conceptual difference: the Message Parser operates on instances of goals, while the Compiler operates on the metadata for web services, ontologies, and mediators.
Adapters allow applications that cannot communicate directly with the interfaces provided by WSMX to communicate with WSMX. The role of the Choreography Engine is to mediate between the requester's and the provider's communication patterns. The Reasoner will provide reasoning services for the mapping process of mediation, as well as functionality for validating a possible composition of services and for determining whether a composed service in a process is executable in a given context.
The Resource Manager is responsible for the management of Repositories. These Repositories are used to store the definitions of goals, web services, ontologies and mediators within WSMX. The repositories can be either internal to WSMX or external as, for example, the API to the UDDI described in the WSMO Registry [Herzog et al., 2004]. The figure below shows the WSMX architecture and the position of each component within it.

Figure 2.1: WSMX-Architecture
WSMX is based on the conceptual model of an event-driven architecture. Components subscribe to a core component, the WSMX Manager, which generates events for the subscribed components. The listener of each of these components can create, update and consume events. Components interact through events that contain messages. These events are Java objects with an internal data structure and an interface specification. The execution semantics of WSMX, as described in [Oren et al., 2004], are implemented in the WSMX Manager. Consequently, the WSMX Manager is responsible for coordinating the overall operation of a WSMX system.
The event listener components accept and process events picked up by the events scanner from the event repository. In other words, the events scanner queries the event repository for "unlocked" events and, if it finds any, the broadcasting mechanism of the WSMX Manager core distributes this information to the listener of each WSMX component. The event repository is the persistence mechanism in which all events being processed by the system are stored.
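The scan-and-broadcast cycle described above can be sketched as follows. This is a minimal illustration rather than the WSMX implementation: the names `Event`, `Manager` and `scan_once` are hypothetical, and locking an event at dispatch time is our assumption about how re-broadcasting is avoided.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    type: str
    state: str
    locked: bool = False


@dataclass
class Manager:
    repository: list = field(default_factory=list)  # stands in for the event repository
    listeners: list = field(default_factory=list)   # one callable per component

    def scan_once(self):
        """Find unlocked events and broadcast each one to every listener."""
        unlocked = [e for e in self.repository if not e.locked]
        for event in unlocked:
            event.locked = True  # assumption: lock on dispatch to avoid re-broadcast
            for listener in self.listeners:
                listener(event)
        return len(unlocked)


seen_by_discovery = []
seen_by_mediator = []

manager = Manager()
manager.listeners.append(lambda e: seen_by_discovery.append(e.type))
# Each listener decides for itself which event types are relevant.
manager.listeners.append(
    lambda e: seen_by_mediator.append(e.type) if e.type == "mediate" else None
)
manager.repository += [Event("discover", "new"), Event("mediate", "new")]

first_pass = manager.scan_once()
second_pass = manager.scan_once()
```

In this sketch the second pass finds nothing to do, because both events were locked when they were first broadcast.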
The main goals for future revisions of the WSMX architecture are a design based on Service Oriented Architecture and a workflow engine that allows multiple execution semantics to be defined. This improved architecture will make it easy to plug and unplug components and to integrate components from different vendors and developers.
Existing publish/subscribe technologies like Tuple Space Computing [Gelernter, 1992], Shared Object Space, Persistent Message-based Architecture [Alonso et al., 1995] can serve as a fruitful basis for our endeavor. To be able to understand and to define TripleSpace, we first need to have a look at these underlying technologies. Therefore, we describe some of these appealing underlying technologies in the subsequent sub-sections.
Linda was developed in the field of parallel programming languages in the early 1980s by D. Gelernter [Gelernter, 1992] at Yale University. It provides a communication paradigm between parallel processes called the "generative communication model", which differs from other communication models developed in distributed computing (e.g., monitors, message passing and message queueing). In this paradigm, the results of a process, or the processes themselves, are added as messages in tuple-structured form to the computation environment, where they can be accessed as named, independent entities until some process chooses to receive them. The "Tuple Space" (TS) for process creation and co-ordination is a form of shared "associative memory" for tuples, where a tuple, the primary unit of communication in Linda, is an ordered sequence of typed values. The TS shared memory may be distributed or may be served from a single server to which remote processes connect.
Tuples are distinguished by tags (i.e., symbolic names) and by the types and values of their fields. There are two types of tuples: active tuples and passive tuples. Active tuples are basically processes or task-descriptions used for process creation. They contain functions and elements (i.e., formal parameters and actual parameters) that need to be evaluated by the Linda server in order to be placed in the TS; they require computation. Passive tuples, also called data tuples, are results of the computation, data values (i.e., actual parameters) that are stored in the TS and which are used for process co-ordination. By evaluation an active tuple becomes a passive tuple. During this process all entries in the active tuple are evaluated in order where each entry in the tuple is an expression. Once the entries have been evaluated, the "passive" result replaces the original expression in the tuple. When a process is complete, its tuple becomes passive data (i.e., actual values) stored in the TS. Therefore active tuples can contain a combination of formal and actual values while passive tuples can only contain actual values. A short overview of basic Linda operations can be found in Appendix A
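To make the distinction between active and passive tuples concrete, the following is a minimal single-process sketch of a tuple space. It is a toy under stated assumptions, not Linda itself: `None` plays the role of a formal (wildcard) field, matching is by value only rather than by type, `eval_` evaluates its fields synchronously instead of spawning a process, and only the non-blocking variants of read and take are shown (named `rdp` and `inp` here).

```python
class TupleSpace:
    """Toy, single-process tuple space; None in a template is a wildcard."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Store a passive (data) tuple."""
        self._tuples.append(tuple(fields))

    def eval_(self, *fields):
        """Evaluate an active tuple field by field, then store the passive result."""
        self.out(*(f() if callable(f) else f for f in fields))

    @staticmethod
    def _matches(template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup)
        )

    def rdp(self, *template):
        """Non-blocking read: return a matching tuple without removing it."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def inp(self, *template):
        """Non-blocking take: remove and return a matching tuple."""
        tup = self.rdp(*template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out("point", 3, 4)                    # passive tuple
space.eval_("sum", lambda: 3 + 4)           # active tuple, becomes ("sum", 7)
read_result = space.rdp("point", None, None)
taken = space.inp("sum", None)
taken_again = space.inp("sum", None)        # already removed by the first take
```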
TSpaces is a Java-based intermediary built upon the Linda TS coordination model with added middleware extensions developed by IBM at the Almaden Research Centre. It is a network communication buffer with database capabilities which enables communication between applications and devices in a network of heterogeneous computers and operating systems. The communication becomes asynchronous and anonymous. TSpaces incorporates database features, such as transactions, persistent data, flexible queries and XML support. In summary, it can be used for bringing network services to small (palm-top computers) and embedded systems [Lehman et al., 2001]. As TSpaces is implemented in Java (running on all platforms from JDK1.0), it inherits the ability both to run on virtually any platform and to download new data types and new functionality dynamically.
TSpaces provides a space where tuples are added and removed. In TSpaces there are two ways to define tuples:
The client-server communication is currently built on a separate socket connection per client. This socket is shared for all interactions between that client and any number of spaces on the server (a TSpaces server may contain many tuple spaces). It is moreover possible to have multiple transactions active simultaneously; all that is needed are multiple instances of the Tuple Space, all pointing to the same space. A short overview of basic TSpaces operations can be found in Appendix A.
JavaSpaces [Freeman et al., 1999] is a Java-based implementation of tuple spaces by Sun Microsystems, in which tuples are represented as serialized objects. The use of Java allows heterogeneous clients and servers to interoperate, regardless of their processor architectures and operating systems. JavaSpaces adds transactional semantics, tuple leasing and event notification. In JavaSpaces tuples are called entries. A process that reads or takes an entry can invoke the operations that are associated with the entry. A client operates on a JavaSpace to write new entries, look up existing entries, and remove entries from the space. This allows Java programs to store 'state'. In particular, when security restrictions mean that applets are not allowed to store state on client machines, the state may be stored on servers. JavaSpaces is based on a value-matching lookup routine for specified fields. A short overview of basic JavaSpaces operations can be found in Appendix A.
To be written
To be written
Triple Space Computing (TripleSpace) [Fensel, 2004] pursues the same goal for Semantic Web Services as the Web did for humans: to re-define and expand the current communication paradigm (Cf. Figure 4.1). TripleSpace is intended to become the communication infrastructure on which the Semantic Web and Semantic Web Services can be realized. As [Fensel, 2004] pointed out: "Triple Space may become the web for machines as the web based on HTML became the Web for humans".
Figure 4.1: Evolution of communication mechanisms for humans and machines
TripleSpace is based on the evolution and integration of several well-known technologies such as Tuple Space Computing [Gelernter, 1992], Shared Object Space (http://www.tecco.at/en/eProducts.html), Persistent Message-based Architecture [Alonso et al., 1995], the Semantic Web and in particular RDF Schema [Brickley and Guha, 2004]. It is derived from the communication model of Tuple Space Computing: instead of sending messages back and forth, a much simpler means of communication is provided, in which processes write, delete, and read tuples in a global persistent space.
Tuple Space communication is adopted in TripleSpace because of its inherent characteristics, as listed below.
This decoupling has obvious design advantages for defining reusable, distributed, heterogeneous, and quickly changing applications, as promised by Web services technology. Also, the complex APIs of current Web services technology boil down to read and write operations in a tuple space. It is worth noting that a service paradigm based on the tuple space paradigm also revisits the web paradigm: information is persistently written to a global place where other processes can smoothly access it without starting a cascade of message exchanges.
Figure 4.2: Three separate dimensions of cooperation [Angerer, 2002]
However, [Johanson and Fox, 2004] report some shortcomings of current tuple space models. They lack namespaces, semantics, unique identifiers and structure for describing the information content of tuples. The tuple space provides a flat and simple data model without nesting; therefore, tuples with the same number and order of fields but different semantics cannot be distinguished.
We propose a simple and promising solution to this problem: extending the tuple space into a triple space, where <subject, predicate, object> triples describe the content and semantics of information. The object of one triple can become the subject of another, which defines a graph structure and thereby captures structural information. Fortunately, with RDF [Klyne and Carroll, 2004] this space already exists on the web and provides a natural link from the space-based computing paradigm to the Semantic Web. Note that the tuple space paradigm does not make the Semantic Web unnecessary; rather, we envision current Semantic Web technologies and Tuple Space as the building blocks for a "Global Semantic Space" on the web. The global space can help overcome heterogeneity in communication and cooperation, but a Tuple Space simply brought to a global scale provides no answer to data and information heterogeneity. This aspect is what the Semantic Web is all about, and it is what we aim to combine fruitfully with the space-based paradigm.
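The idea of a space of <subject, predicate, object> triples that forms a graph can be illustrated with a small in-memory sketch. The class `TripleSpace`, its `write`, `read` and `delete` methods, and the `ex:` identifiers are hypothetical names chosen for illustration; a real triple space would add persistence, distribution and RDF-aware matching.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


class TripleSpace:
    """Toy in-memory triple space; None in a pattern matches anything."""

    def __init__(self):
        self._triples = set()

    def write(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def read(self, subject=None, predicate=None, obj=None):
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )

    def delete(self, subject=None, predicate=None, obj=None):
        matched = self.read(subject, predicate, obj)
        self._triples.difference_update(matched)
        return len(matched)


space = TripleSpace()
space.write("ex:wsmx1", "ex:offers", "ex:flightBooking")
space.write("ex:flightBooking", "ex:requires", "ex:creditCard")
space.write("ex:wsmx2", "ex:offers", "ex:hotelBooking")

offers = space.read(predicate="ex:offers")
# Follow the graph: the object of one triple is the subject of the next.
service = space.read(subject="ex:wsmx1")[0].object
requirements = [t.object for t in space.read(subject=service)]
deleted = space.delete(subject="ex:wsmx2")
```

Unlike the flat tuples discussed above, two triples with the same shape remain distinguishable here because the predicate names the relationship explicitly.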
To make this vision real, two core components are currently identified. The first is the technical infrastructure, which will include a distributed persistence mechanism for storing triples and should also offer the following desirable features: asynchronous communication, reliability, high availability, robustness and security. The second core component is the ontological infrastructure, which will provide a semantic specification of the domain in which business processes interact with each other.
As a relevant part of the Service Oriented Architecture (SOA), notification is expected to play an essential role in the development of asynchronous, loosely-coupled and dynamic systems, where entities receive messages based on their registered interest in certain occurrences or situations. There is a long tradition in the use of these kinds of technologies in the areas of distributed objects, message oriented middleware (MOM) and Peer to Peer systems. Recently, two new specifications, WS-Notification [Graham and Niblett, 2004] and WS-Eventing [Geller, 2004], bring again the publish-subscribe communication paradigm to the fore.
The publish-subscribe paradigm is an asynchronous, many-to-many communication model for distributed systems [Eugster et al., 2003]. The model defines two main roles for participants: the source, which generates notifications, and the sink, which expresses its interest in concrete event notifications or patterns of event notifications. Typically a source can act as a producer and as a publisher. Producers encode information into notification messages, while publishers make these notifications accessible. Similarly, a sink can act as a subscriber and as a consumer. Consumers express interest in concrete notifications and consume those notifications when they are published by a source. Subscribers are responsible for registering the consumers' interests.
In a loosely-coupled configuration (independent producers, publishers, subscribers and consumers), the publish-subscribe model decouples the processes involved in information exchange along four orthogonal dimensions (partially adapted from [Eugster et al., 2003]):
Figure 5.1: Space, reference, time and flow decoupling (To be added)
Publish-subscribe mechanisms can be classified into two major categories: topic-based and content-based. The topic-based system is the earlier variant of the publish-subscribe model, in which each notification is published with respect to one of a fixed set of topics, also referred to as groups, channels, or subjects. A publisher labels each message that producers generate with a particular topic. Similarly, a subscriber targets a consumer's subscriptions at a particular topic. Thus, consumers get all messages associated with that topic. In a content-based system, producers use a predefined message schema to create messages, and consumers submit their subscriptions as queries (or filters, or patterns) against the message schema. One advantage of a content-based system is that consumers have the flexibility to choose multiple filtering criteria instead of being limited to pre-defined topics. However, message schemas in content-based systems are subject to a trade-off between scalability and expressiveness: the more expressive the message schema, the more difficult it is to evaluate subscriptions against it.
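The difference between the two categories can be shown with two toy brokers. This is an illustrative sketch only: `TopicBroker` and `ContentBroker` are hypothetical names, messages are plain dictionaries, and a content-based subscription is represented by an arbitrary Python predicate rather than a query against a declared schema.

```python
class TopicBroker:
    def __init__(self):
        self._subs = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subs.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for callback in self._subs.get(topic, []):
            callback(message)


class ContentBroker:
    def __init__(self):
        self._subs = []  # (filter, callback) pairs

    def subscribe(self, matches, callback):
        self._subs.append((matches, callback))

    def publish(self, message):
        for matches, callback in self._subs:
            if matches(message):
                callback(message)


by_topic, by_content = [], []

topics = TopicBroker()
topics.subscribe("Weather/Warnings", by_topic.append)
topics.publish("Weather/Warnings", {"region": "Galway", "wind_kmh": 95})
topics.publish("Weather/Forecast", {"region": "Galway", "wind_kmh": 20})

content = ContentBroker()
content.subscribe(
    lambda m: m["region"] == "Galway" and m["wind_kmh"] > 80,
    by_content.append,
)
content.publish({"region": "Galway", "wind_kmh": 95})
content.publish({"region": "Galway", "wind_kmh": 20})
content.publish({"region": "Innsbruck", "wind_kmh": 120})
```

The content-based broker evaluates every filter on every message, which hints at the scalability cost of more expressive subscriptions.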
WS-Notification [Graham and Niblett, 2004] is part of the Web Service Resource Framework (WSRF) [Globus et al., 2004], a new proposal to extend the dominant Open Grid Service Infrastructure (OGSI) ([Foster et al., 2002], [Tuecke et al., 2003]) by integrating Web Services technologies. WS-Notification is composed of a set of specifications: WS-BaseNotification [Graham and Niblett, 2004a], WS-BrokeredNotification [Graham and Niblett, 2004b] and WS-Topics [Graham and Niblett, 2004c]. WS-BaseNotification defines the Web services interfaces and standardizes message exchanges for Notification-Producers and Notification-Consumers. WS-BrokeredNotification defines the Web services interface for the Notification-Broker. A Notification-Broker is an intermediary between Notification-Producers and Notification-Consumers. It is responsible for distributing notifications produced by Notification-Producers to interested Notification-Consumers based on their subscription specifications. WS-Topics defines a mechanism to organize and categorize subscriptions and notifications based on a hierarchical set of topics. WS-Notification currently also uses two related specifications from WSRF: WS-ResourceProperties [Graham, 2003] to describe data associated with resources, and WS-ResourceLifetime [Frey Graham, 2004] to manage lifetimes associated with subscriptions and publisher registrations (in WS-BrokeredNotification).
On the other hand, WS-Eventing [Geller, 2004] can be considered a subset of the WS-Notification specification and, more precisely, roughly equivalent to WS-BaseNotification. The two specifications differ in complexity, message definitions, delivery modes, subscription operations, Topic Space management and publishing. A detailed analysis of both proposals can be found in [Pallickara and Fox, 2004].
A fundamental problem of these two specifications, and of most publish/subscribe systems in general, is how to match the interests of consumers with the available notifications generated by producers. Neither simple strings such as "Weather/Warnings" nor complex XPath or SQL queries provide enough expressivity to perform a sophisticated matching of interests and data. However, [Li and Jiang, 2004] present a proposal for a Semantic Message Oriented Middleware based on DAML+OIL to overcome this limitation. We agree with Li and Jiang that a formal knowledge representation language for expressing sophisticated classifications and for executing automated inference can be the way to go in future implementations of publish-subscribe systems.
The Triple Space paradigm has the same limitation as tuple space for reader (consumer) processes. An application that wants to read a concrete triple or set of triples has to interrupt its main process flow or run a concurrent thread that periodically checks whether the data is available. JavaSpaces and TSpaces provide a simple notification mechanism to mitigate the problem. Thus the publish-subscribe model can complement Tuple Space with a sophisticated notification and subscription mechanism that allows proper asynchronous interaction on the consumer (reader) side. Figure 5.2 shows an example of a simple producer-consumer interaction between two processes. Process B is the consumer and searches for the data before the data is available in the space. Process A publishes the data in the space. On the left side, process B queries the space (and blocks its main flow) until the data is available. On the right side, process B is subscribed to the data; when the data becomes available, the publish-subscribe mechanism sends a notification to process B indicating that the data is available and can be collected.
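The two interaction styles can be contrasted in a small sketch. The class `NotifyingSpace` and the `ex:` identifiers are hypothetical; the sketch runs in a single thread, so the "polling" side is reduced to one unsuccessful read, and notifications are delivered as direct callbacks.

```python
class NotifyingSpace:
    """Toy space that stores triples and notifies matching subscribers on write."""

    def __init__(self):
        self._triples = []
        self._subscriptions = []  # (pattern, callback) pairs

    @staticmethod
    def _matches(pattern, triple):
        return all(p is None or p == v for p, v in zip(pattern, triple))

    def read(self, pattern):
        return [t for t in self._triples if self._matches(pattern, t)]

    def subscribe(self, pattern, callback):
        self._subscriptions.append((pattern, callback))

    def write(self, triple):
        self._triples.append(triple)
        for pattern, callback in self._subscriptions:
            if self._matches(pattern, triple):
                callback(triple)


space = NotifyingSpace()
pattern = ("ex:order42", "ex:status", None)

# Polling style: process B asks before the data exists and finds nothing.
polls_before_data = space.read(pattern)

# Subscription style: process B registers its interest and carries on.
notified = []
space.subscribe(pattern, notified.append)

# Process A publishes; the space notifies B, which can now collect the data.
space.write(("ex:order42", "ex:status", "ex:shipped"))
space.write(("ex:order43", "ex:status", "ex:pending"))
collected = space.read(pattern)
```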
Figure 5.2: Example of interaction in Triple Space and Publish-Subscribe models
On the other hand, the Triple Space paradigm provides a more direct way of communication for producers and consumers than publish-subscribe. When a consumer requires some concrete data, a single read operation is all that is needed. The same holds for producers: a single write operation is enough to publish information in the space. Furthermore, Triple Space can improve the publish-subscribe model by providing a shared persistent space in which to store and replicate subscriptions and notifications. With the RDF data model, the space can be more structured and the data better described. Scalability issues in publish-subscribe systems, caused by large amounts of data embedded in notifications and by event storms produced by numerous notifications, can be mitigated by effective use of the shared space paradigm.
Table 5.1 shows a first proposal of a coordination model API for Triple Space with proper extensions for handling publish-subscribe model. This coordination API is inspired by the combination of TSpace API, JavaSpace API and SIENA API [Carzaniga, 1998].
| Extended Triple Space API and description |
|---|
Table 5.1 Extended Triple Space API
The Triple Space Computing interaction model, together with proper extensions for handling the publish-subscribe model, offers a new communication paradigm for Semantic Web Services. Given that the WSMX architecture relies on the main principles of Service Oriented Architecture (SOA), the benefit of integrating Triple Space and the publish-subscribe model into WSMX is threefold. Firstly, WSMX can take advantage of the features of the TripleSpace and publish-subscribe models to achieve several main goals more easily:
Secondly, the presence of Triple Space as a part of the WSMX architecture allows components to interact through a semantic platform. Invocation and interoperation between components can be done by reading and writing semantically described data. Adding semantics to the WSMX architecture drives the evolution of the system from Service Oriented Architecture (SOA) towards Semantic Service Oriented Architecture (SSOA).
Finally, defining a real scenario (the interaction between the WSMX components) in which to study the applicability of Triple Space from a practical point of view will provide useful experience that should help develop the concept of Triple Space bottom-up.
In section 2, a brief description of the conceptual view of WSMX architecture was provided. In this section, we delve inside the WSMX core in order to understand how Triple Space could be integrated as a part of the WSMX architecture.
WSMX Core [Haselwanter, 2005] is a microkernel implementation based on the JMX specification. There are three layers currently defined in JMX (Cf. Figure 6.1) which are briefly described below:
Figure 6.1: JMX Levels [Haselwanter, 2005]
The WSMX microkernel is a management agent according to the JMX specification, and it is responsible for loading, configuring, executing and monitoring WSMX components. The current specification makes hot deployment possible, which avoids having to stop the system every time a new component is loaded. Remote consoles for administration purposes can interact with the kernel through WSDM (Web Services Distributed Management), HTTP (Hypertext Transfer Protocol) and RMI (Remote Method Invocation) adapters. An important element that is tightly integrated with the microkernel is the workflow engine, which is in charge of identifying and executing the appropriate set of operations (execution logic) to resolve a concrete request received by the system. Currently four execution semantics are defined:
Three main elements can be identified in a standard WSMX component [Haselwanter, 2005] (Cf. Figure 6.2):
The Java Service Wrapper encapsulates the functionalities of different components in three different modules: the reviver module, the proxy module and the transport module. The transport module implements the low level details of the communication. There will be a transport module for each communication infrastructure supported inside of WSMX (queues, tuple space, triple space, etc).
The reviver module manages the interaction between the transport layer and each of the threads that represent a running instance of a component. For each instance of every component, the reviver defines the set of relevant events that should be attended to.
The proxy module makes interaction between components possible and simulates asynchronous communication between them. The wrapper of each component contains as many proxies as there are components registered with the WSMX Core. Each of these proxies includes a description of the interface of the component that it represents. When a component needs to interact with another component, the related proxy provides the names of the methods that can be invoked and a description of the parameters that should be provided. The proxy stops the calling thread, packages the synchronous call and asks the transport module to put the information in an appropriate queue or in the triple space. Because several threads can be associated with a single component, the communication behavior of the component looks asynchronous although each thread only supports synchronous communication.
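The proxy mechanism can be approximated with standard threading primitives. The sketch below is an assumption-laden illustration, not the WSMX wrapper: `Proxy`, `serve` and `DiscoveryComponent` are hypothetical names, a `queue.Queue` stands in for the transport module (a queue or the triple space), and a `Future` carries the reply back to the blocked caller.

```python
import queue
import threading
from concurrent.futures import Future


class DiscoveryComponent:
    """Hypothetical component with one synchronous method."""

    def discover(self, goal):
        return f"service-for-{goal}"


class Proxy:
    """Packages a synchronous call and hands it to the transport."""

    def __init__(self, transport, methods):
        self._transport = transport
        self.methods = methods  # the interface description the proxy exposes

    def call(self, method, *args):
        if method not in self.methods:
            raise AttributeError(method)
        reply = Future()
        self._transport.put((method, args, reply))
        return reply.result(timeout=5)  # only the calling thread waits


def serve(component, transport):
    """Target side: take packaged calls from the transport and execute them."""
    while True:
        item = transport.get()
        if item is None:  # shutdown marker
            break
        method, args, reply = item
        try:
            reply.set_result(getattr(component, method)(*args))
        except Exception as exc:
            reply.set_exception(exc)


transport = queue.Queue()  # stands in for a queue or the triple space
server = threading.Thread(target=serve, args=(DiscoveryComponent(), transport))
server.start()

proxy = Proxy(transport, methods={"discover"})
results = []
callers = [
    threading.Thread(target=lambda g=g: results.append(proxy.call("discover", g)))
    for g in ("goalA", "goalB", "goalC")
]
for t in callers:
    t.start()
for t in callers:
    t.join()

transport.put(None)
server.join()
```

Each caller thread blocks on its own call, yet the component as a whole has three requests in flight, which is the sense in which the behavior looks asynchronous from outside.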
Figure 6.2: Anatomy of a WSMX Component [Haselwanter, 2005]
In a local WSMX configuration, all the components are executed on the same machine or at least in the same Local Area Network (LAN). Currently, the WSMX Core designers want to distinguish between the data flows related to the business logic (execution of components based on the requirements of a concrete operational semantics) and the data flows related to the management logic (monitoring the components, load balancing, instantiation of threads, etc.).
Our proposal takes this distinction into account and distributes the WSMX information flows over two main communication mechanisms (Cf. Figure 6.3): JMX notifications (management logic) and TripleSpace (business logic).
Figure 6.3: Triple Space in a local WSMX configuration (To be added)
A local configuration (same machine) of the WSMX Kernel and components will require only a simplified version of the Triple Space infrastructure (Cf. Figure 6.4). Sophisticated mechanisms for providing remote access (through HTTP, for instance), security and trust will not be necessary.
The Transport module of the wrapper of each component will access the Triple Space through simple APIs. These APIs will implement the operations described in section 5.1 . Triple Space infrastructure for WSMX components will be implemented around 6 main components: management module, publish-subscribe module, query module, data module, resource handler layer and security module. The management module will coordinate all the reading and writing requests received; will dispatch the request to the appropriate functional module (data module or query module); will monitor the appropriate execution of the rest of the elements of the system; and will periodically check the coherence of the information stored in the space. The monitoring activities will be collected and evaluated to determine which components are producing or getting more data. This information can be combined with the activity data that WSMX kernel is collecting for each of the components (threads, memory, etc) in their JVM. The result of this merging can provide a complete picture of the workload of the system, and can be the basement in which builds mechanisms for balancing the workload. The publish-subscribe module (motivated in section 5.1 ) is in charge to collect and store subscriptions and advertisements from consumers and producers. Anytime that a triple is published in the space, this module will check if related subscriptions are stored. In case that there are related subscriptions, publish-subscribe module will notify to the consumers of those subscriptions that there are triples available. Based on the management information collected by the management module, the publish-subscribe module can prioritize the order of notifications and deliver first to those components which have less workload. The query module will verify the correctness (syntax level) of the query received based on an standard query language (to be defined). 
The data module will execute all the operations related to the manipulation of data in the space (basically writing, modifying and deleting triples). The version module will track and store the changes performed by the data module, and will provide versioning support to identify different versions, to reconstruct a previous version of the space, etc.
It is important to note that query and data manipulation operations are not performed by the query and data modules directly. Instead, the resource handler will hide heterogeneity by providing a uniform view of different repositories (RDBMS, ODMS, memory, etc.). For instance, this module will transform a query from the query module into the concrete query language used to query a repository. The repositories should provide native data manipulation operations and query/reasoning mechanisms.
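The uniform view offered by the resource handler can be sketched as follows. This is only an illustration under stated assumptions: the names `Triple`, `TriplePattern`, `Repository`, `InMemoryRepository` and `SqlTranslator`, as well as the relational layout (a table `triples` with columns `s`, `p`, `o`), are hypothetical and not defined by this deliverable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; the deliverable does not define them.
record Triple(String subject, String predicate, String object) {}

// A repository-neutral query: null in any position acts as a wildcard.
record TriplePattern(String subject, String predicate, String object) {}

// The uniform view that hides the concrete repository from the other modules.
interface Repository {
    void store(Triple t);
    List<Triple> find(TriplePattern p);
}

// Simplest possible backend: triples held in a list in memory.
class InMemoryRepository implements Repository {
    private final List<Triple> triples = new ArrayList<>();

    public void store(Triple t) { triples.add(t); }

    public List<Triple> find(TriplePattern p) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            if (matches(p.subject(), t.subject())
                    && matches(p.predicate(), t.predicate())
                    && matches(p.object(), t.object())) {
                out.add(t);
            }
        }
        return out;
    }

    private static boolean matches(String wanted, String actual) {
        return wanted == null || wanted.equals(actual);
    }
}

// For a relational backend, the same pattern would be translated into the
// repository's own query language (assumed table: triples(s, p, o)).
class SqlTranslator {
    static String toSql(TriplePattern p) {
        List<String> conds = new ArrayList<>();
        if (p.subject() != null) conds.add("s = ?");
        if (p.predicate() != null) conds.add("p = ?");
        if (p.object() != null) conds.add("o = ?");
        String where = conds.isEmpty() ? "" : " WHERE " + String.join(" AND ", conds);
        return "SELECT s, p, o FROM triples" + where;
    }
}
```

The query and data modules would only ever see `Repository`; swapping the in-memory list for an RDBMS then changes the translation step, not the callers.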
Finally, the security module maintains an ontology with the access rights of each component to any of the elements stored in the space. For instance, when a query is executed, the results are filtered to ensure that only authorized data is delivered to a specific component.
Figure 6.4: Interaction of a local component with the triple space
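The notification behaviour of the publish-subscribe module described above can be sketched as follows. This is a minimal sketch under stated assumptions: subscriptions are matched on the predicate of a published triple only, the workload figure is a plain integer supplied by the caller (standing in for the data collected by the management module), and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical types for illustration only.
record Triple(String subject, String predicate, String object) {}

// A consumer's interest in triples with a given predicate, plus the
// consumer's current workload as reported by the management module.
record Subscription(String consumer, String predicate, int workload) {}

class PublishSubscribeModule {
    private final List<Subscription> subscriptions = new ArrayList<>();

    void subscribe(Subscription s) {
        subscriptions.add(s);
    }

    // Returns the consumers to notify about a published triple,
    // ordered so that the least-loaded consumer comes first.
    List<String> publish(Triple t) {
        return subscriptions.stream()
                .filter(s -> s.predicate().equals(t.predicate()))
                .sorted(Comparator.comparingInt(Subscription::workload))
                .map(Subscription::consumer)
                .toList();
    }
}
```

A real module would deliver the notifications rather than return the list; returning it here keeps the ordering rule easy to inspect.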
(To be added: a discussion of, and a proposal for, a distributed WSMX configuration.)
The communication model used in the current implementation of WSMX is synchronous. Synchronous communication is beneficial when immediate responses are required. Since WSMX deals with Web service discovery, mediation and invocation, immediate responses are usually not available; the reasons for such high response latency include network congestion, slow processing and third-party invocation. In such situations, synchronous communication is costly, as it forces the system (component) to remain idle until the response is available. To minimize the overhead imposed by synchronicity, TripleSpace can serve as a communication channel between WSMXs (Cf. Figure 7.1), thereby introducing asynchronicity between communicating parties. TripleSpace supports purely asynchronous communication, which improves performance as well as communication robustness. Enabling asynchronous communication between WSMXs brings them a step closer to their architectural goal, i.e., to support greater modularization, flexibility and decoupling between communicating WSMXs. Similarly, it enables WSMX to be highly distributed and easily accessible. Furthermore, being a third-party element, TripleSpace has the added advantage of helping to resolve any communication disputes that may arise.
Figure 7.1: Communication between WSMXs through TripleSpace
Before going into detail on how WSMXs can communicate through TripleSpace, a simple logical sequence of basic communication between a WSMX and a TripleSpace is described below.
In order to enable such communication, the following two basic operations are defined in the Triple Space API:
write: the write operation puts a set of triples into the TripleSpace. The signature of the write operation is the following:
write { <Triples>, <spaceAddress> }
The parameters Triples and spaceAddress specify the data to be stored and the TripleSpace, located at spaceAddress, in which to store it. The Triples are RDF triples and the spaceAddress is a valid URL. An example set of triples (written in Notation 3 [Berners-Lee, 1998] format) for a goal description for buying a book can be found in Appendix A.
read: the read operation gets a set of triples from the TripleSpace. The signature of the read operation is the following:
<Triples> read { <spaceAddress>, <query> }
The parameters spaceAddress and query specify the location of the TripleSpace from which to get the data and the data to be retrieved, respectively. An example query that can be executed in a read operation is listed in Appendix A. Both read and write operations are atomic in nature.
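The two operations can be sketched in Java as follows. This is a minimal in-memory sketch, not the Triple Space API itself: the names `TripleSpaceApi`, `Triple` and `Query` are hypothetical, a query is reduced to a single triple pattern with `null` as a wildcard (the actual query language is still to be defined), read is assumed to be non-destructive, and atomicity is approximated with `synchronized` methods.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only.
record Triple(String subject, String predicate, String object) {}

// A query reduced to one triple pattern; null matches anything.
record Query(String subject, String predicate, String object) {
    boolean matches(Triple t) {
        return (subject == null || subject.equals(t.subject()))
                && (predicate == null || predicate.equals(t.predicate()))
                && (object == null || object.equals(t.object()));
    }
}

class TripleSpaceApi {
    // Maps each spaceAddress (a URL string) to the triples stored there.
    private final Map<String, List<Triple>> spaces = new HashMap<>();

    // write { <Triples>, <spaceAddress> }
    synchronized void write(List<Triple> triples, String spaceAddress) {
        spaces.computeIfAbsent(spaceAddress, k -> new ArrayList<>()).addAll(triples);
    }

    // <Triples> read { <spaceAddress>, <query> }
    // Assumed non-destructive: matching triples stay in the space.
    synchronized List<Triple> read(String spaceAddress, Query query) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : spaces.getOrDefault(spaceAddress, List.of())) {
            if (query.matches(t)) {
                result.add(t);
            }
        }
        return result;
    }
}
```

For example, after writing the triple `wsml:buyBookGoal rdf:type wsml:goal` to some address, a read with `new Query(null, "rdf:type", null)` at the same address returns that triple, while a read at any other address returns an empty list.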
The information stored in TripleSpace is in the form of RDF triples. A TripleSpace may have multiple users, and a subset of these users may be interested in particular information stored in the TripleSpace that was produced by a specific user. Therefore, unlike in TS, not only the information but also the source (context) of this information is important in TripleSpace communication. In addition, context information may be necessary for tracking the source of information stored in the TripleSpace. Tracking the source is one of the fundamental requirements when the quality of the information is important [Guha et al., 2004]. Although each RDF triple can be uniquely identified through URIs, it is not possible to relate a triple to other factors that influence its existence, e.g., its source. Techniques like reification were introduced to embed context information in RDF triples. However, reification introduces a significant performance overhead: at least one extra triple is needed to provide context information for each triple. One possible solution to this problem would be to extend RDF triples to RDF quads. However, this is against the spirit of the RDF graph model.
In WSMX TripleSpace, the context information is captured by adding an extra node called the 'context node' (Cf. Figure 9), which can either be an empty node or accommodate a context id. The addition of the context node and the handling of context information are done within the application. Handling context information in the application eliminates the risk of breaking the RDF graph model. To enable the Triple Space Manager to handle context information, the write operation defined above is extended as follows.
write { <Triples>, <spaceAddress>, <contextInfo> }
However, the read operation may remain unchanged, as the context information, if needed, can be formulated in the query parameter.
Figure 9: Context Information with context node
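One possible reading of the context node can be sketched as follows. The sketch makes several assumptions that go beyond the text: each write creates one fresh blank node as the context node, every distinct subject of the written triples is linked to it, and the context id (when given) is attached to the node. The property names `ex:hasContext` and `ex:contextId` and the class `ContextAwareSpace` are hypothetical, and one instance stands for the space at a single spaceAddress.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration only.
record Triple(String subject, String predicate, String object) {}

class ContextAwareSpace {
    // Hypothetical property names, not part of any WSMX vocabulary.
    static final String HAS_CONTEXT = "ex:hasContext";
    static final String CONTEXT_ID = "ex:contextId";

    private final List<Triple> store = new ArrayList<>();
    private int nextNode = 0;

    // write { <Triples>, <spaceAddress>, <contextInfo> } for this space.
    // A null contextInfo leaves the context node empty.
    synchronized void write(List<Triple> triples, String contextInfo) {
        String node = "_:ctx" + (nextNode++);
        if (contextInfo != null) {
            store.add(new Triple(node, CONTEXT_ID, contextInfo));
        }
        Set<String> subjects = new LinkedHashSet<>();
        for (Triple t : triples) {
            store.add(t);
            subjects.add(t.subject());
        }
        // Context stays expressed as ordinary triples, so no quads are needed.
        for (String s : subjects) {
            store.add(new Triple(s, HAS_CONTEXT, node));
        }
    }

    // read is unchanged: context is queried like any other triple pattern
    // (null acts as a wildcard).
    synchronized List<Triple> read(String s, String p, String o) {
        List<Triple> hits = new ArrayList<>();
        for (Triple t : store) {
            if ((s == null || s.equals(t.subject()))
                    && (p == null || p.equals(t.predicate()))
                    && (o == null || o.equals(t.object()))) {
                hits.add(t);
            }
        }
        return hits;
    }
}
```

Under these assumptions, finding everything written by a given source takes two ordinary reads: one for the context node carrying the id, and one for the subjects linked to that node.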
In its current state, this document presents only some sections of what we intend to do in this deliverable. Future work will provide elaborated specification of the TripleSpace functionalities that will be used to support asynchronous communication between WSMXs.
[Alonso et. al., 1995], G. Alonso, D. Agrawal, A. El Abbadi, C. Mohan, R. Günthör, M. Kamath: Exotica/FMQM: A Persistent Message-Based Architecture, In the proceedings of IFIP Working Conference on Info Sys for Decentralized Organizations, Trondheim, August 1995; Also available as IBM Research Report RJ9912, IBM Almaden Research Center, November 1994.
[Angerer, 2002], B. Angerer: Space Based Computing: J2EE bekommt Konkurrenz aus dem eigenen Lager, Datacom, no 4, 2002.
[Berners-Lee, 1998], T. Berners-Lee: An RDF language for the Semantic Web: Notation 3 http://www.w3.org/DesignIssues/Notation3.html
[Brickley and Guha, 2004], D. Brickley and R.V. Guha (eds.): RDF Vocabulary Description Language 1.0: RDF Schema, W3C Recommendation, February 2004, http://www.w3c.org/TR/rdf-schema/
[Carzaniga, 1998], A. Carzaniga: Architectures for an Event Notification Service Scalable to Wide-area Networks. PhD Thesis. Politecnico di Milano. December, 1998.
[Eugster et al., 2003], P.Th. Eugster, P.A. Felber, R. Guerraoui, A.-M. Kermarrec: The Many Faces of Publish/Subscribe, ACM Computing Survey, 2003.
[Fensel D., 2004], D. Fensel: Triple-based Computing. DERI Research Report, 2004-05-31.
[Foster et. al., 2002], I. Foster, C. Kesselman, J. Nick, S. Tuecke (2002). The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002.
[Foster et al., 2001], I. Foster, C. Kesselman, and S. Tuecke: The anatomy of the grid, International Journal of Supercomputing Applications, 2001.
[Frey and Graham, 2004], J. Frey and S. Graham (editors). Web Services Resource Lifetime (WS-ResourceLifetime) Version 1.1. Technical Specification. 2004. http://www.ibm.com/developerworks/library/ws-resource/ws-resourcelifetime.pdf
[Gelernter, 1985], D. Gelernter, N. Carriero, S. Chang: Parallel Programming in Linda, Proceedings of the International Conference on Parallel Processing, 1985.
[Gelernter, 1992], D. Gelernter: Mirrorworlds, Oxford University Press, 1992.
[Geller, 2004], A. Geller (editor). Web Services Eventing (WS-Eventing). Technical Specification. August 2004. http://www-106.ibm.com/developerworks/webservices/library/specification/ws-eventing/
[Globus et. al., 2004], Globus Alliance, IBM and HP. The Web Services Resource Framework (WSRF). http://www.globus.org/wsrf/
[Graham, 2003], S. Graham (editor). Web Services Resource Properties (WS-ResourceProperties) Version 1.1. Technical Specification. 2003. http://www-106.ibm.com/developerworks/library/ws-resource/ws-resourceproperties.pdf
[Graham and Niblett, 2004], S. Graham and P.Niblett (editors). Web Services Notification (WS-Notification). Technical Specification. 2004. http://www-106.ibm.com/developerworks/library/specification/ws-notification/
[Graham and Niblett, 2004a], S. Graham and P.Niblett (editors). Web Services Base Notification (WS-BaseNotification). Technical Specification. 2004. http://www-106.ibm.com/developerworks/library/specification/ws-notification/
[Graham and Niblett, 2004b], S. Graham and P. Niblett (editors). Web Services Brokered Notification (WS-BrokeredNotification). Technical Specification. 2004. http://www-106.ibm.com/developerworks/library/specification/ws-notification/
[Graham and Niblett, 2004c], S. Graham and P. Niblett (editors). Web Services Topics (WS-Topics). Technical Specification. 2004. http://www-106.ibm.com/developerworks/library/specification/ws-notification/
[Haselwanter, 2005], T. Haselwanter: WSMX Core. Bachelor Thesis. University of Innsbruck.
[Herzog et al., 2004], R. Herzog, P. Zugmann, M. Stollberg, and D. Roman (ed.): WSMO Registry. WSMO Working Draft D10, http://www.wsmo.org/2004/d10/v0.1/
[Johanson and Fox, 2004], B. Johanson and A. Fox: Extending Tuplespaces for Coordination in Interactive Workspaces, Journal of Systems and Software, 69(3), January 2004:243-266.
[Klyne and Carroll, 2004], G. Klyne and J. J. Carroll (eds.): Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C Recommendation, February 2004, http://www.w3.org/TR/rdf-concepts/
[Lehman et al., 2001], T. J. Lehman, A. Cozzi, Y. Xiong, J. Gottschalk, V. Vasudevan, S. Landis, P. Davis, B. Khavar, and P. Bowman: Hitting the distributed computing sweet spot with TSpaces, Special Issue on Computer Networks, 35(2001):457-472.
[Li and Jiang, 2004], H. Li and G. Jiang. Semantic message oriented middleware for publish/subscribe networks. Proceedings of the SPIE, Volume 5403, pp. 124-133 (2004).
[Oren, 2004], E. Oren. WSMX Execution Semantics. WSMO Working Draft v0.1, http://www.wsmo.org/2005/d13/d13.2/v0.2/
[Oren et. al., 2004], E. Oren, M. Moran and M. Zaremba: Overview and Scope of WSMX, WSMO Working Draft v0.1, http://www.wsmo.org/2004/d13/d13.0/v0.1/
[Pallickara and Fox, 2004], S. Pallickara and G. Fox (2004). An Analysis of Notification Related Specifications for Web/Grid applications. Technical Report. October 30 2004
[Roman et al., 2004], D. Roman, H. Lausen, and U. Keller. Web Services Modeling Ontology Standard. WSMO Working Draft v02, http://www.wsmo.org/2004/d2/v0.2/
[Tuecke et. al., 2003], S. Tuecke, K. Czajkowski, I. Foster, J. Frey, S. Graham, C. Kesselman, T. Maguire, T. Sandholm, P. Vanderbilt, D. Snelling (2003). Open Grid Services Infrastructure (OGSI) Version 1.0. Global Grid Forum Draft Recommendation, 6/27/2003
[Zaremba et al, 2004], M. Zaremba, A. Haller, Maciej Zaremba, and M. Moran: WSMX - Infrastructure for Execution of Semantic Web Services, 2004
The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, and SWWS; by Science Foundation Ireland under the DERI-Lion project; and by the Austrian government under the CoOperate programme.
The authors would like to thank all the members of the WSMO working group for their advice and input to this document, and in particular Thomas Haselwanter.
Linda has four essential functions to access, modify and delete tuples in the TS, which may be implemented in any programming language to form a Linda dialect of that language. A short overview of these functions is given below.
Since TS is an "associative memory", tuples do not have a physical address. A tuple is identified and retrieved by matching its data fields against the elements of the requesting template. Matching is carried out according to the following criteria: the tuple to be retrieved must contain the same number of elements, the same types of elements in the same order, and, where the template specifies values, the same values as the elements contained in the template. For example, a tuple <"numbers", 2, 2.1> in the TS can be referenced by a template requesting a tuple whose first element is a string, whose second element is an integer and whose third element is a float, by using the following operation: rd(?s, ?i, ?f). Once the match has been found, s is assigned "numbers", i is assigned 2, f is assigned 2.1, and the tuple remains in the TS, because rd does not remove it. Tuples can be matched by using any combination of their ordered element values and/or types. [Christian, 1997] (todo reference)
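The matching criteria above can be sketched in Java as follows. This is an illustration, not a Linda implementation: the class `LindaMatcher` is hypothetical, a template field is either a `Class` (a typed placeholder such as `?s`) or a concrete value, and the float of the example is represented by Java's `Double`, since the literal 2.1 is a double in Java.

```java
import java.util.List;

class LindaMatcher {
    // A template matches a tuple when both have the same length and every
    // template field either names the type of the tuple field (a Class)
    // or equals the tuple field's value.
    static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            Object want = template[i];
            boolean ok = (want instanceof Class<?> type)
                    ? type.isInstance(tuple[i])
                    : want.equals(tuple[i]);
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // rd: returns the first matching tuple, or null if there is none.
    // The tuple is not removed from the space.
    static Object[] rd(List<Object[]> space, Object... template) {
        for (Object[] tuple : space) {
            if (matches(template, tuple)) {
                return tuple;
            }
        }
        return null;
    }
}
```

With the tuple `{"numbers", 2, 2.1}` in the space, `rd(space, String.class, Integer.class, Double.class)` plays the role of rd(?s, ?i, ?f), and a mixed template such as `rd(space, "numbers", 2, Double.class)` matches on values and types together.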
The actual parameters appearing in a tuple collectively constitute a structured name. For example, in(P, 2, j:Boolean) requests a tuple with the structured name "P, 2". Structured naming is in principle similar to a "select" operation in a relational database, and can make the TS content-addressable. Any component of a tuple, except the initial name-valued actual, may be a formal, as in this case: out(P, i:integer, FALSE). Actuals in templates match tuple actuals and formals. Formals in templates cannot match formals in tuples.
Linda differs from other distributed programming languages in the fact that it is a "coordination language" with the following characteristics:
These characteristics can benefit distributed applications on the Web.
The processes of a distributed application share a single namespace and a single TS. The internal representation of tuple names in one application is prefixed with program IDs to protect the set of tuples from unwanted reference, augmentation or deletion by another program. In order for an application to access names in the namespace of another application, names of tuples may be exported from one application to another for use in out() statements only. The kernel can enforce limitations on the uses to which these names can be put by trapping the in() and rd() statements that occur outside the application which has exported the namespace.
In addition to the well-known Linda implementation for the SBN network at Stony Brook, other implementations have provided solutions to Tuple Space related implementation issues. For example, in order to overcome the problem of having multiple concurrent readers of a set of tuples, Antony Rowstron developed a primitive for Linda called "copy-collect", which makes use of multiple tuple spaces. He also developed Bonita in order to overcome the problems of high latency in TS-based coordination languages and to provide the ability to request multiple tuples and act on them as they arrive. With this solution, an in() operation is decomposed into a request for a tuple and a receive of that tuple. The added primitives block until a result tuple is available.
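The decomposition of in() into a request step and a receive step can be sketched as follows. This is not Bonita's actual interface: the class `SplitInSpace` and the methods `request`, `out` and `receive` are hypothetical, tuples are plain lists, and matching is reduced to equality on the first field.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class SplitInSpace {
    private record Pending(Object key, CompletableFuture<List<Object>> result) {}

    private final List<List<Object>> tuples = new ArrayList<>();
    private final List<Pending> waiting = new ArrayList<>();

    // Step 1: ask for a tuple whose first field equals key. Returns at once
    // with a handle, so the caller can issue further requests meanwhile.
    synchronized CompletableFuture<List<Object>> request(Object key) {
        Iterator<List<Object>> it = tuples.iterator();
        while (it.hasNext()) {
            List<Object> t = it.next();
            if (t.get(0).equals(key)) {
                it.remove(); // destructive, as with in()
                return CompletableFuture.completedFuture(t);
            }
        }
        CompletableFuture<List<Object>> handle = new CompletableFuture<>();
        waiting.add(new Pending(key, handle));
        return handle;
    }

    // out: hand the tuple to the oldest matching request, or store it.
    synchronized void out(List<Object> tuple) {
        Iterator<Pending> it = waiting.iterator();
        while (it.hasNext()) {
            Pending p = it.next();
            if (tuple.get(0).equals(p.key())) {
                it.remove();
                p.result().complete(tuple);
                return;
            }
        }
        tuples.add(tuple);
    }

    // Step 2: block until the requested tuple has arrived.
    static List<Object> receive(CompletableFuture<List<Object>> handle) {
        return handle.join();
    }
}
```

Because `request` never blocks, a process can issue several requests in a row and then call `receive` on each handle in whatever order suits it, which is where the latency saving comes from.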
Tuples can be accessed and modified using a simple Java API. The TSpaces basic operations set [Lehman et al.,2001] used to access, modify and delete the tuples in the TS is given below:
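Here is an example of how easy it is to create a TS and write a tuple into it.
// Connect to the space named "Example1" on the local TSpaces server.
String host = "localhost";
TupleSpace ts = new TupleSpace("Example1",host);
// Build a tuple with two fields: a key and its data.
Field f1 = new Field("Key1");
Field f2 = new Field("Data1");
Tuple t1 = new Tuple();
t1.add(f1);
t1.add(f2);
// Write the tuple into the space.
ts.write(t1);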
TSpaces extends the basic Linda TS framework with relational data management, event notification, access control features and the ability to download both new data types and new semantic functionality. Any client application providing a service to the user can be added to TSpaces (e.g., e-mail service, printing service, pager service and so on). System upgrades can be performed while the TSpaces server is running, which reduces costly downtime.
Further development of TSpaces stopped after it was officially declared a success in 2001. TSpaces is currently used as a communication paradigm for component interaction and management to explore Grid Computing in the IBM OptimalGrid project. [Foster et al., 2001]
A short overview of the basic JavaSpaces operations to access and modify tuples is given below:
All the operations that were mentioned before are performed in a transactionally secure manner using the two-phase commit model [JavaSpace, 2003]. JavaSpaces services can provide a reliable distributed storage system for the entries.
JavaSpaces also uses features of the Jini [Arnold et al., 1999] network technology, such as leases, transactions and events.
Jini is an open architecture based on the idea of federating groups of users and the resources required by those users [Jini, 1999]. Using Jini technology, one can build adaptive networks that many devices can join in a scalable way.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix wsml: <http://www.wsmo.org/wsml#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
wsml:buyBookGoal rdf:type wsml:goal .
wsml:buyBookGoal wsml:nfp _:p1 .
_:p1 dc:title "Purchase Order Book Ontology"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p2 .
_:p2 dc:creator "DERI International"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p3 .
_:p3 dc:description "Purchase Order of Book"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p4 .
_:p4 dc:publisher "DERI International"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p5 .
_:p5 dc:contributor "Brahmananda Sapkota"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p6 .
_:p6 dc:date "2005-05-10"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p7 .
_:p7 dc:type "domain ontology"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p8 .
_:p8 dc:format "text"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p9 .
_:p9 dc:language "en-us"^^xsd:string .
wsml:buyBookGoal wsml:nfp _:p10 .
_:p10 dc:rights <http://www.deri.org/privacy.html> .
wsml:buyBookGoal wsml:nfp _:p11 .
_:p11 dc:version "1.3"^^xsd:string .
wsml:buyBookGoal wsml:usedMediators _:m1 .
_:m1 wsml:useMediator <http://www.wsmo.org/med1> .
wsml:buyBookGoal wsml:usedMediators _:m2 .
_:m2 wsml:useMediator <http://www.wsmo.org/med2> .
wsml:buyBookGoal wsml:usedMediators _:m3 .
_:m3 wsml:useMediator <http://www.wsmo.org/med3> .
wsml:buyBookGoal wsml:usedMediators _:m4 .
_:m4 wsml:useMediator <http://www.wsmo.org/med4> .
wsml:buyBookGoal wsml:hasPostcondition _:x .
_:x wsml:hasAxiom <http://www.wsmo.org/goals/buyBookGoal#address> .
_:x wsml:hasLogicalExpression "full_address:address[address_number->number:address_number[name->12], street->street_name:street[name->Street 1], city->city:location[name->Galway]]"^^xsd:string .
wsml:buyBookGoal wsml:hasPostcondition _:y .
_:y wsml:hasAxiom <http://www.wsmo.org/goals/buyBookGoal#product> .
_:y wsml:hasLogicalExpression "product_data:Book[Book->Book_name:Book[name->Book, Harry Potter#3], price->price:price[name->Cheap]]"^^xsd:string .
wsml:buyBookGoal wsml:hasPostcondition _:z .
_:z wsml:hasAxiom <http://www.wsmo.org/goals/buyBookGoal#test3> .
_:z wsml:hasLogicalExpression "input is input"^^xsd:string .
<> ql where {
wsml:buyBookGoal rdf:type wsml:goal .
?x ?p ?o .
} .
This query retrieves all wsml:buyBookGoal triples that are of type wsml:goal.
| Date | Version | Author | Description |
|---|---|---|---|
| 2005/05/16 | 0.1 | Francisco | Added comments of Thomas Haselwanter on section 6 |
| 2005/05/16 | 0.1 | Francisco | rewrote section 5 and added comments of last review on section 6 |
| 2005/05/16 | 0.1 | Francisco | Added changelog |
| 2005/05/16 | 0.1 | B. Sapkota | Section 7: Added handling context information |
| 2005/05/08 | 0.1 | B. Sapkota | Section 7: Example moved to Appendix A |