Service descriptions play a crucial role in several tasks in service-oriented architecture (SOA), such as service discovery, service selection, and service composition. However, it has been observed that service providers, the main source of information about web services, typically release poor service descriptions, i.e., descriptions that are mostly technically oriented. To address this lack of rich service descriptions, several approaches have been proposed that enrich poor descriptions with additional information from other sources, e.g., community annotations or domain experts. In our research, we introduce a novel approach that gathers and generates additional information about web services used in collaborative application domains from multiple sources and integrates it into unified, rich service descriptions. These sources are the websites of service providers, invocation analysis, and the business processes of service consumers.
We have developed a focused crawler that automatically collects public web services from the websites of their providers and registers them in our public service registry, Depot. In addition to web services, textual annotations are extracted from the same crawled web pages and attached to their corresponding web services. This information represents the technical perspective of web services.
With the increasing number and complexity of web services, instance-level information about them has become more important. Such instance-level information, e.g., example input values and formats, is usually not incorporated in service descriptions. We have introduced a set of methods that analyze invocations of data web services and extract useful instance-level information from them, e.g., tags and valid values for inputs.
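To illustrate the idea of deriving valid input values from invocations, the following is a minimal sketch, not the method used in this work. It assumes a hypothetical invocation log in which each entry records the input parameters and whether the call succeeded; the names `summarize_invocations`, `inputs`, and `success` are illustrative assumptions.

```python
from collections import Counter, defaultdict


def summarize_invocations(invocations, top_n=3):
    """Return the most frequent input values per parameter from successful calls.

    Each invocation is assumed to be a dict with the hypothetical keys
    "inputs" (parameter name -> value) and "success" (bool).
    """
    values = defaultdict(Counter)
    for call in invocations:
        if not call["success"]:
            continue  # only inputs of successful calls are kept as examples
        for param, value in call["inputs"].items():
            values[param][value] += 1
    return {
        param: [value for value, _ in counter.most_common(top_n)]
        for param, counter in values.items()
    }


# Hypothetical log of a postal-code lookup service.
log = [
    {"inputs": {"country": "DE", "zip": "14482"}, "success": True},
    {"inputs": {"country": "DE", "zip": "10115"}, "success": True},
    {"inputs": {"country": "XX", "zip": "00000"}, "success": False},
    {"inputs": {"country": "US", "zip": "10001"}, "success": True},
]
examples = summarize_invocations(log)
```

In this sketch, values from the failed call are discarded, so only inputs that the service accepted end up as example values in the description.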
Traditionally, service consumers provide ratings and feedback about the web services they use in their business processes. Although this information is useful, additional valuable information can be generated from the business processes themselves, e.g., the context in which a service is used and the tasks that consume it. To extract this knowledge about web services from their consuming business processes, we integrate Depot with the open-source online modeling framework Oryx. The information extracted from the business processes of service consumers represents the business perspective of the used web services.
The information we gather and generate from the aforementioned sources is used to enrich poor service descriptions. These enriched service descriptions can enhance several tasks, e.g., service discovery, selection, and recommendation. For instance, service discovery can be enhanced by enabling service exploration through automatic categorization and tagging of web services. Moreover, we provide multi-dimensional service exploration by combining the category, provider, and tags of web services, and we enable additional refinement through keyword-based full-text search.
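The multi-dimensional exploration described above can be pictured as a conjunction of filters over category, provider, and tags, refined by a keyword match. The following is a minimal sketch under that assumption; the `Service` record, the `explore` function, and the sample catalog are hypothetical and do not reflect Depot's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical enriched service description."""
    name: str
    category: str
    provider: str
    tags: set = field(default_factory=set)
    description: str = ""


def explore(services, category=None, provider=None, tags=(), keyword=None):
    """Filter services by every given dimension, then by a keyword."""
    required = set(tags)
    result = []
    for s in services:
        if category is not None and s.category != category:
            continue
        if provider is not None and s.provider != provider:
            continue
        if not required <= s.tags:
            continue  # every requested tag must be present
        text = f"{s.name} {s.description}".lower()
        if keyword is not None and keyword.lower() not in text:
            continue  # naive substring match stands in for full-text search
        result.append(s)
    return result


# Hypothetical catalog entries.
catalog = [
    Service("GeoCoder", "geo", "ExampleCorp", {"address", "lookup"},
            "Resolves postal addresses to coordinates."),
    Service("WeatherNow", "weather", "ExampleCorp", {"forecast"},
            "Current weather by city."),
    Service("MapTiles", "geo", "OtherInc", {"maps"},
            "Serves map tiles for coordinates."),
]
hits = explore(catalog, category="geo", keyword="coordinates")
narrowed = explore(catalog, category="geo", provider="ExampleCorp", tags=["address"])
```

Each dimension narrows the result set independently, so a user can start from any one of them and refine with the others; a real registry would likely replace the substring check with an indexed full-text search.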