
D10.1v0.1 Focused Crawler for Web Service Discovery

WSMO Working Draft 13 April 2005

This version:
http://www.wsmo.org/TR/d10.1/v0.1/20050413
Latest version:
http://www.wsmo.org/TR/d10.1/v0.1/
Previous version:
http://www.wsmo.org/TR/d10.1/v0.1/20050323
Editor:
David Aiken
Authors:
David Aiken

Table of Contents

1. Introduction
1.1 Problem Definition and Scope
1.2 Purpose and Overview of this Document
2. Related Approaches
2.1 Keyword-based Discovery
2.2 Discovery based on simple semantic descriptions of Services
2.3 Discovery based on rich semantic descriptions of Services
3. WSML and Registries
3.1 WSML and UDDI
3.2 WSML and ebXML
4. Approach to Web Service Discovery
4.1 Terminology and Concepts
4.2 Mechanism of Web Service Discovery
5. Architecture
5.1 Focused Crawler within WSMX
5.2 Components
5.3 Requirements on other components within WSMX
6. Implementation
6.1 Google SOAP API
6.2 YARS
6.3 Simulation
6.3.1 A semi-automated approach
6.3.2 GUI
6.4 Evaluation
7. Conclusions and Further Directions
References
Acknowledgements
Changelog

1. Introduction

Semantic Web Services are emerging as a promising technology for the effective automation of service discovery, combination, and management [Fensel et al, 2002]. They aim to leverage two major trends in Web technologies, namely Web Services and the Semantic Web. While Web Services technologies have increased the potential of the Web infrastructure by providing programmatic access to information and services, they are hindered by the lack of rich, machine-processable abstractions for describing service properties, capabilities, and behaviour. As a result, very little automation support can be provided for the effective discovery, combination, and management of services. Instead, current service discovery relies on simple keyword matching against natural language descriptions. Automation support is considered the cornerstone of effective and efficient access to services in large, heterogeneous, and dynamic environments.

1.1 Problem Definition and Scope

Semantic Web Services are paving the way for the automation and dynamic composition of web service technologies, thus reducing the effort required to integrate applications, businesses, and customers. A pivotal concept that enables the location of web services is Discovery: "The act of locating a machine-processable description of a Web service-related resource that may have been previously unknown and that meets certain functional criteria. It involves matching a set of functional and other criteria with a set of resource descriptions. The goal is to find an appropriate Web service-related resource" [Web Services Glossary].

WSMO discovery [Keller et al, 2004] defines a conceptual model for locating services that (totally or partially) fulfill a requester's goal. This conceptual model describes three major steps:

Goal Discovery is about abstracting a concrete user goal to a pre-defined, reusable, and formalized goal. The user has to describe, with different levels of accuracy, their requirements and desires. This allows a previously very individual goal to become more generic and thus easier to match to a corresponding service.

Web service Discovery is based on matching abstracted descriptions of formalized goals with semantic annotations of formalized web services, and then selecting the web services that can potentially be used to obtain the desired service.

Service Discovery uses the web services matched in the previous step to access the actual services that lie behind the web service interfaces. A strong mediation step is required to meet the specific needs of a Web service's choreography. A choreography defines the sequence and conditions under which multiple cooperating independent agents exchange messages in order to perform a task to achieve a goal state [Web Services Glossary].

This document is concerned only with the second step, Web service Discovery. The fundamental requirement of the Focused Crawler is to locate potential web services and return a list of their URIs. How these URIs are then used is outside the scope of this document.
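The input and output of this step can be illustrated with a minimal sketch. It is not the Focused Crawler's matching mechanism, which this draft does not yet specify: plain set overlap between capability keywords stands in for semantic matching, and the names `ServiceDescription` and `discover`, together with the example.org URIs, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical stand-in for a semantically annotated web service."""
    uri: str
    capabilities: frozenset


def discover(goal_capabilities, descriptions):
    """Return the URIs of services that totally or partially match the goal.

    Assumption: matching is plain set overlap on capability keywords,
    used here only as a placeholder for real semantic matching.
    """
    goal = frozenset(goal_capabilities)
    scored = []
    for description in descriptions:
        overlap = len(goal & description.capabilities)
        if overlap > 0:
            scored.append((overlap, description.uri))
    # Larger overlap first; ties are broken by URI for a repeatable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uri for _, uri in scored]


services = [
    ServiceDescription("http://example.org/flights", frozenset({"book", "flight"})),
    ServiceDescription("http://example.org/hotels", frozenset({"book", "hotel"})),
    ServiceDescription("http://example.org/weather", frozenset({"forecast"})),
]

uris = discover({"book", "flight"}, services)
print(uris)
# ['http://example.org/flights', 'http://example.org/hotels']
```

The total match is listed before the partial one, and the service with no overlap is omitted. The caller receives only URIs, in line with the scope stated above.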

1.2 Purpose and Overview of this Document

The purpose of this document is to examine the potential for actively searching the network for semantically annotated web services, whether in WSDL descriptions or in UDDI/ebXML registries, rather than taking the in-house approach of querying WSMX relational databases populated with web services that have been registered with WSMX. The latter approach is currently being investigated in D10 WSMX Discovery, D5.1 WSMO Web Service Discovery, and D5.2 WSMO Discovery Engine.

Section 2 briefly introduces related approaches to web service discovery. Section 3 provides some insight into how registries, such as UDDI and ebXML, could be enriched with WSMO ontologies for efficient web service discovery. Section 4 discusses the mechanism that the Focused Crawler uses to discover web services, along with the associated terminology and concepts. Sections 5 and 6 cover Architecture and Implementation, respectively.

2. Related Approaches

2.1 Keyword-based Discovery

2.2 Discovery based on simple semantic descriptions of Services

2.3 Discovery based on rich semantic descriptions of Services

3. WSML and Registries

3.1 WSML and UDDI

3.2 WSML and ebXML

4. Approach to Web Service Discovery

4.1 Terminology and Concepts

4.2 Mechanism of Web Service Discovery

5. Architecture

5.1 Focused Crawler within WSMX

5.2 Components

5.3 Requirements on other components within WSMX

6. Implementation

6.1 Google SOAP API

6.2 YARS

6.3 Simulation

6.3.1 A semi-automated approach

6.3.2 GUI

6.4 Evaluation

7. Conclusions and Further Directions

References

[Web Services Glossary] http://www.w3.org/TR/ws-gloss/#defs.

[Fensel et al, 2002] Fensel, D., Bussler, C., Ding, Y., Omelayenko, B.: The Web Service Modeling Framework WSMF. Electronic Commerce Research and Applications 1 (2002).

[Keller et al, 2004] Keller, U., Lara, R., and Polleres, A., editors (2004). WSMO Discovery. WSMO Working Draft D5.1v0.1. Available from http://www.wsmo.org/2004/d5/d5.1/v0.1/.

Acknowledgements

The work is funded by the European Commission under the projects DIP, Knowledge Web, Ontoweb, SEKT, and SWWS; by Science Foundation Ireland under the DERI-Lion project; and by the Austrian government under the CoOperate program.

The editors would like to thank all the members of the WSMO working group for their advice and input to this document.

Changelog

The following major updates have been made since the previous version of this deliverable:

  • Chapter 1 content added.

