Technical components & services
Technical services are the elements of the platform that glue together the components making up a WebLab application.
The service bus handles the core messaging functionality that enables communication at the service layer. It is mainly composed of a service registry, a normalised router, service engines and binding components. A service bus provides a service abstraction layer: instead of directly calling web services or other components, as shown in the illustration on the left, we declare them to the bus as logical named endpoints. With this approach, components are integrated through logical endpoints rather than through hard links between them, as shown in the illustration on the right.
The service registry registers services together with their metadata, mainly the service interface, the service name and a unique identifier. The service bus can then search for the right service using any one of these metadata. The metadata are expressed using the WSDL standard, so any endpoint deployed on the bus has an abstract WSDL description, even if the endpoint refers to a component that is not a web service.
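The lookup behaviour described above can be illustrated with a minimal in-memory sketch. It is not the bus's actual registry: the `ServiceEntry` and `ServiceRegistry` names, their fields and the example service names are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServiceEntry:
    """Metadata kept for one logical endpoint (hypothetical structure)."""
    service_id: str  # unique identifier
    name: str        # logical service name
    interface: str   # service interface, e.g. the name of a WSDL port type
    wsdl: str        # abstract WSDL description (here simply a location)


class ServiceRegistry:
    """Hypothetical registry: stores entries and finds them by any metadata."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, name: str, interface: str, wsdl: str) -> ServiceEntry:
        entry = ServiceEntry(str(uuid.uuid4()), name, interface, wsdl)
        self._entries[entry.service_id] = entry
        return entry

    def find(self, *, service_id: Optional[str] = None,
             name: Optional[str] = None,
             interface: Optional[str] = None) -> List[ServiceEntry]:
        """Return the entries that match every criterion that was given."""
        return [e for e in self._entries.values()
                if (service_id is None or e.service_id == service_id)
                and (name is None or e.name == name)
                and (interface is None or e.interface == interface)]
```

A caller would register an endpoint once, for example `registry.register("legacy-ocr", "Analyser", "legacy-ocr.wsdl")`, and later retrieve it with `registry.find(name="legacy-ocr")` without knowing where or how the service is actually hosted.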
The normalised router is also provided natively and is used by both the service bus and the service registry. The router locates and invokes abstract endpoints, and dynamically resolves the links between abstract services and concrete services.
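As a rough illustration of dynamic resolution, the sketch below maps abstract endpoint names to concrete callables and resolves the link at invocation time. The `NormalisedRouter` class, its methods and the string-based message type are assumptions, not the API of any particular bus.

```python
from typing import Callable, Dict

# A concrete service: takes a normalised (XML) message and returns one.
Handler = Callable[[str], str]


class UnknownEndpointError(LookupError):
    """Raised when no concrete service is linked to an abstract endpoint."""


class NormalisedRouter:
    """Hypothetical router linking abstract endpoint names to services."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Handler] = {}

    def bind(self, endpoint: str, handler: Handler) -> None:
        """Link (or re-link) an abstract endpoint to a concrete service."""
        self._bindings[endpoint] = handler

    def invoke(self, endpoint: str, message: str) -> str:
        # The link is resolved on every call, so re-binding an endpoint
        # takes effect immediately and callers never need to change.
        try:
            handler = self._bindings[endpoint]
        except KeyError:
            raise UnknownEndpointError(endpoint) from None
        return handler(message)
```

Because callers only ever name the abstract endpoint, the concrete service behind it can be replaced with a single `bind` call.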
Service engines manipulate messages and abstract endpoints. For instance, a component that transforms one message into another using mapping functions is implemented as a service engine. Orchestrators are another example of service engines: they manipulate messages, call abstract services and aggregate them into a new abstract endpoint.
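A transformation engine of the kind mentioned above might look like the following sketch, which rewrites the child elements of an XML message according to a table of mapping functions. The `transform` function, the mapping format and the element names in the usage note are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Tuple

# Hypothetical mapping table:
#   source element name -> (target element name, value conversion function)
Mapping = Dict[str, Tuple[str, Callable[[str], str]]]


def transform(message: str, root_tag: str, mapping: Mapping) -> str:
    """Build a new message by applying the mapping to each child element."""
    source = ET.fromstring(message)
    target = ET.Element(root_tag)
    for child in source:
        rule = mapping.get(child.tag)
        if rule is None:
            continue  # elements without a mapping rule are dropped
        new_tag, convert = rule
        ET.SubElement(target, new_tag).text = convert(child.text or "")
    return ET.tostring(target, encoding="unicode")
```

With the mapping `{"titre": ("title", str.strip), "langue": ("lang", str.lower)}`, a `<notice>` message carrying `<titre>` and `<langue>` elements becomes a `<record>` message carrying `<title>` and `<lang>` elements.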
Binding components manage connections to the external world. A binding component handles the connectivity between abstract endpoints and external services using the appropriate protocol. With this approach, every integrated component is exposed in the bus registry through a unified description, WSDL: the binding component maps external components to WSDL-compatible endpoints.
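The role of a binding component can be sketched as an adapter that translates between XML messages on the bus side and a foreign protocol on the external side. Everything below is hypothetical: the pipe-delimited line protocol, the `legacy_lookup` stand-in for an external system and the `LineProtocolBinding` class are invented for illustration, and the WSDL description itself is left out.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def legacy_lookup(line: str) -> str:
    """Stand-in for an external system speaking a pipe-delimited protocol."""
    command, value = line.split("|", 1)
    if command != "LEN":
        return "ERR|unknown command"
    return f"OK|{len(value)}"


class LineProtocolBinding:
    """Exposes the external system as an endpoint exchanging XML messages."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        # In a real deployment the transport would wrap a socket, a queue, etc.
        self._transport = transport

    def __call__(self, message: str) -> str:
        request = ET.fromstring(message)
        # XML message -> external protocol
        reply = self._transport(f"{request.get('op')}|{request.text or ''}")
        status, payload = reply.split("|", 1)
        # external protocol -> XML message
        response = ET.Element("response", status=status)
        response.text = payload
        return ET.tostring(response, encoding="unicode")
```

From the bus's point of view, `LineProtocolBinding(legacy_lookup)` is simply a callable that accepts and returns XML; the line protocol never leaks past the binding.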
We have already presented service engines, taking the orchestrator as an example. In this section, we discuss its integration and usage inside the bus. An orchestrator is a piece of software that calls services following a structural definition. Orchestration has already been the subject of much work, mainly focused on defining a standard to describe the structure of an orchestration. With the emergence and massive adoption of the Web service stack (WSDL / SOAP / HTTP), a new standard, WS-BPEL (Web Services Business Process Execution Language), was proposed. This standard uses XML to describe orchestration processes and defines structural elements such as variables, loops, conditional branches, choices, etc. Software vendors quickly provided graphical user interfaces to generate WS-BPEL XML. Other standards, such as XPDL or BPMN, can also be used to orchestrate processes, but WS-BPEL remains the most widely adopted one.
However, WS-BPEL relies directly on WSDL definitions, so on its own it cannot integrate legacy systems that have no WSDL description into a WS-BPEL process. To tackle this issue, we can combine a service bus with an orchestrator. As mentioned above, a service bus integrates all components by linking them to a logical registry using WSDL. In other words, thanks to the binding components, all integrated systems are defined using WSDL inside the bus. An orchestrator integrated into the bus as a service engine can therefore call each declared logical service directly. Moreover, since each endpoint is declared using WSDL, a BPEL engine can call each integrated endpoint as if it were a web service. The resulting BPEL process is in turn exposed as a logical endpoint and provides a WSDL description, so it can be invoked like any other component.
BPEL can sometimes be overly complicated, for example when an architect needs to test a composition process quickly and does not want to use the full set of BPEL features. In this case, using a service bus simplifies the writing of this type of composition. Instead of trying to call every component directly, with all their different protocols and programming languages, simply instantiate a new engine inside the bus; it has access to all the needed functionality, such as the service registry and the normalised router. With this approach, you can efficiently write complex processes in your preferred language, such as Java or even Groovy, without having to work out how to talk to a legacy system that uses an old protocol. Simply manipulate XML messages and invoke logical services.
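The idea of writing a composition as plain code can be sketched as follows. The text mentions Java or Groovy; Python is used here only to keep the example short. The `invoke` function stands in for the bus's routing facility, and the service names and the two toy services are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Sequence

Service = Callable[[str], str]  # XML message in, XML message out


def make_process(invoke: Callable[[str, str], str],
                 steps: Sequence[str]) -> Service:
    """Chain logical services: each step receives the previous output."""
    def process(message: str) -> str:
        for endpoint in steps:
            message = invoke(endpoint, message)
        return message
    return process


# Two toy services standing in for components already declared on the bus.
def add_language(message: str) -> str:
    doc = ET.fromstring(message)
    doc.set("lang", "en")
    return ET.tostring(doc, encoding="unicode")


def count_words(message: str) -> str:
    doc = ET.fromstring(message)
    doc.set("words", str(len((doc.text or "").split())))
    return ET.tostring(doc, encoding="unicode")


services: Dict[str, Service] = {
    "language-detector": add_language,
    "word-counter": count_words,
}


def invoke(endpoint: str, message: str) -> str:
    """Stand-in for the bus: resolve a logical name and call the service."""
    return services[endpoint](message)


# The composite process is itself declared as a logical endpoint.
services["analysis-chain"] = make_process(
    invoke, ["language-detector", "word-counter"])
```

The process only names logical endpoints and passes XML along, so replacing a step with a legacy system behind a binding component would not change this code.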
Documents and data repositories
Before being analysed, documents collected from various sources are in most cases stored as "raw" content whose format is not known in advance: standard document formats (.html, .doc, .odt, .txt ...), images (.jpeg, .png, .gif ...), videos (.avi, .flv, .mpeg ...) and more. The goal of the processing chain is to normalise and structure the information contained in such heterogeneous documents. However, before high-level semantic descriptions can be reached, the raw content needs to be manipulated and exchanged between the various processing services.
Depending on the application, raw content can be exchanged in several ways. Based on the experience of multiple projects, in which we tried different possibilities (from SOAP-encoded byte arrays to FTP exchanges), we currently suggest two implementations:
- File URI scheme: the simplest solution relies on a standard shared network file system that stores all content. The XML description of a resource then holds a link to this content in the form of a "file://" URL. The main advantages of this solution are its efficiency and its simple implementation. However, it introduces a major drawback: a file system must be shared among the multiple servers that may host the components and services of the platform. Moreover, this solution is more complex to implement in a remotely distributed environment.
- WebDAV protocol: the other recommended implementation uses the WebDAV protocol. A common WebDAV server shared across the platform must be deployed, but it allows any component to save and read content from a remote location. Because WebDAV is based on HTTP, it requires no extra network set-up, and access rights can be managed.
The platform suggests using a standard implementation such as Jackrabbit. An easy-to-use and easy-to-integrate Java client component, compatible with both implementations, is provided (see ContentManager). However, it is up to the application designer to select the best implementation.
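The two exchange mechanisms can be sketched with the Python standard library alone. This is not the ContentManager API: the function names, the shared folder and the server URL are assumptions. The WebDAV helper only builds the HTTP PUT request; passing it to `urllib.request.urlopen` would send it to an actual server.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import Request, url2pathname


def save_to_shared_folder(shared_dir: Path, name: str, content: bytes) -> str:
    """Store raw content on the shared file system; return its file:// URI."""
    path = (shared_dir / name).resolve()  # as_uri() needs an absolute path
    path.write_bytes(content)
    return path.as_uri()


def read_from_file_uri(uri: str) -> bytes:
    """Read raw content back from a file:// URI found in a resource."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URI: {uri}")
    return Path(url2pathname(parsed.path)).read_bytes()


def webdav_put_request(base_url: str, name: str, content: bytes) -> Request:
    """Build, without sending, the HTTP PUT that uploads content via WebDAV."""
    return Request(f"{base_url.rstrip('/')}/{name}", data=content, method="PUT")
```

In both cases the XML description of the resource would carry only the returned URI or URL, while the raw bytes travel outside the service messages.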
WebLab documents are described in XML, which is the natural format of Web service messages. Such descriptions can be stored in any format, and the platform proposes a dedicated interface for this purpose, called the ResourceContainer, which offers simple but efficient file-system-based storage.
Metadata and semantic data
Knowledge and other data manipulated in the platform are expressed in RDF/XML. To store such information, knowledge repositories implement the Indexer and Searcher interfaces so that they can handle SPARQL queries.
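To make the indexing and searching roles concrete, the sketch below loads a very restricted subset of RDF/XML (flat `rdf:Description` elements with an `rdf:about` attribute) into an in-memory list of triples and answers simple triple patterns. It is only an illustration: the `TripleStore` class and its methods are assumptions rather than the platform's Indexer and Searcher signatures, and plain pattern matching stands in for a real SPARQL engine.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Hypothetical repository playing both the indexing and searching roles."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def index(self, rdf_xml: str) -> None:
        """Add the statements of flat rdf:Description elements to the store."""
        root = ET.fromstring(rdf_xml)
        for description in root.iter(f"{{{RDF}}}Description"):
            subject = description.get(f"{{{RDF}}}about")
            if subject is None:
                continue  # blank nodes are ignored in this sketch
            for prop in description:
                # ElementTree writes qualified names as "{namespace}local";
                # the property URI is the namespace followed by the local name.
                predicate = prop.tag.replace("{", "").replace("}", "")
                obj = prop.get(f"{{{RDF}}}resource") or (prop.text or "")
                self._triples.append((subject, predicate, obj))

    def search(self, s: Optional[str] = None, p: Optional[str] = None,
               o: Optional[str] = None) -> List[Triple]:
        """Return the triples matching the pattern; None is a wildcard."""
        return [t for t in self._triples
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]
```

A call such as `store.search(p="http://purl.org/dc/elements/1.1/title")` plays the role that the SPARQL query `SELECT ?s ?o WHERE { ?s <http://purl.org/dc/elements/1.1/title> ?o }` would play against a real knowledge repository.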