WebLab 1.2.5/Architecture

This page describes the overall architecture of the WebLab platform. It gives a basic understanding of the selected technical infrastructure and of the organisation of the major components.

WebLab architecture

Layered model of WebLab distribution.

A Layered model of application composition

The WebLab platform is structured in 3 major layers:

  • WebLab Core: An open source technical baseline (free to use in any commercial application) acting as a runtime environment for unstructured information processing services. It has been developed by the Advanced Information Processing Department of Airbus Defence and Space (formerly Cassidian, formerly EADS Defence and Security) and its partners across several projects.
  • WebLab Services: A set of multimedia processing services and GUI components, either open source or commercial, built upon the WebLab model and using standardised interfaces (either service interfaces or portlet models) to realise specific functions.
  • WebLab Applications: A set of business-specific applications, either open source or commercial, built on top of WebLab using a selected set of services.

Architecture guidelines

The WebLab platform relies on a "Service Oriented Architecture" (SOA) as the core paradigm for the design and integration of components. The high-level functions offered to users through applications are achieved by putting services together and calling them in the right sequence.

As a consequence, service definition and design are key to the platform. Each component integrated in the platform shall implement one or several service interfaces, described as service level agreements in WSDL. Components offer their processing capabilities to the platform, and the orchestrator calls them in order to run the business processes, or workflows, that deliver the high-level functions offered to users. Components are fully autonomous and have no knowledge of the other services deployed and consumed by the platform.

The granularity of the services should be one of the main concerns during the design and development of a WebLab component. In order to provide a flexible architecture, the service design should respect the following features:

  • Loosely-coupled: Services should be as autonomous as possible and dependencies between components should be avoided. For this reason, point-to-point communication between services should not be possible and calls to services should always go through the service bus (see hereafter).
  • Coarse-grained: A service should provide a coherent set of functions and should hide implementation complexity.
  • Standardised interfaces: A service should implement one or more standardised interfaces from the service taxonomy, which themselves reuse the generic service interfaces. These interfaces give access to the methods provided by the service.
  • Integrable: A service can be easily integrated in a global application and be used in a chain with other services.
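
The properties above can be illustrated with a minimal Java sketch. It is not the WebLab API: the `Analyser` interface, the `Document` class and both components are hypothetical names chosen for illustration, and the real interfaces are described in WSDL. It only shows two coarse-grained components hidden behind one shared interface, so that a caller can chain them without knowing their implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class StandardInterfaceSketch {

    /** Hypothetical resource passed between services: text plus annotations. */
    static final class Document {
        final String text;
        final Map<String, String> annotations = new LinkedHashMap<>();

        Document(String text) {
            this.text = text;
        }
    }

    /** Hypothetical standardised interface shared by all components. */
    interface Analyser {
        Document process(Document document);
    }

    /** Coarse-grained component: one operation hides the internal steps. */
    static final class WordCounter implements Analyser {
        @Override
        public Document process(Document document) {
            String trimmed = document.text.trim();
            int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
            document.annotations.put("words", Integer.toString(words));
            return document;
        }
    }

    /** Deliberately naive language guess, only here to have a second component. */
    static final class LanguageTagger implements Analyser {
        @Override
        public Document process(Document document) {
            String lower = " " + document.text.toLowerCase(Locale.ROOT) + " ";
            document.annotations.put("lang", lower.contains(" the ") ? "en" : "unknown");
            return document;
        }
    }

    public static void main(String[] args) {
        // The caller only depends on the Analyser interface, not on the components.
        List<Analyser> chain = List.of(new WordCounter(), new LanguageTagger());
        Document document = new Document("Read the manual first");
        for (Analyser analyser : chain) {
            document = analyser.process(document);
        }
        System.out.println(document.annotations);
    }
}
```

Because neither component refers to the other, either one can be replaced or removed from the chain without touching the rest.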

Every service is provided by a "producer" (providing a processing capability) to a "consumer" (requesting a process). The interaction between producer and consumer is carried out by a service bus that is responsible for mediation and communication between the services. The bus contains a service directory in which producers publish their service offers. This directory makes it possible to build complex processing chains (or workflows) by selecting the right service at each step. Such a high-level process is then itself exposed as a service by the service bus. Requests to any service or process go through the service bus from the consumer to the producer, either synchronously or asynchronously.
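
The producer, consumer and directory roles can be sketched in a few lines of Java. This is an in-memory illustration under simplifying assumptions, not the WebLab bus: the `ServiceBus` class, its method names and the service names are all hypothetical, and services are reduced to functions from string to string. It shows publication in a directory, synchronous and asynchronous calls, and a chain that is published back as a service.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class ServiceBusSketch {

    /** Hypothetical in-memory bus: producers publish, consumers call by name. */
    static final class ServiceBus {
        private final Map<String, Function<String, String>> directory = new ConcurrentHashMap<>();

        /** A producer registers its offer in the directory. */
        void publish(String name, Function<String, String> producer) {
            directory.put(name, producer);
        }

        /** Synchronous request: the consumer never holds a reference to the producer. */
        String call(String name, String request) {
            Function<String, String> producer = directory.get(name);
            if (producer == null) {
                throw new IllegalArgumentException("No service published as: " + name);
            }
            return producer.apply(request);
        }

        /** Asynchronous request routed through the same directory. */
        CompletableFuture<String> callAsync(String name, String request) {
            return CompletableFuture.supplyAsync(() -> call(name, request));
        }

        /** Builds a chain of published services and publishes it as a service. */
        void publishChain(String chainName, String... steps) {
            publish(chainName, request -> {
                String current = request;
                for (String step : steps) {
                    current = call(step, current);
                }
                return current;
            });
        }
    }

    public static void main(String[] args) {
        ServiceBus bus = new ServiceBus();
        bus.publish("normaliser", text -> text.trim().toLowerCase(Locale.ROOT));
        bus.publish("tokeniser", text -> String.join("|", text.split("\\s+")));
        bus.publishChain("text-pipeline", "normaliser", "tokeniser");

        System.out.println(bus.call("text-pipeline", "  Hello WebLab World "));
        System.out.println(bus.callAsync("normaliser", " ABC ").join());
    }
}
```

The chain resolves each step by name at call time, so a producer can be replaced in the directory without rebuilding the chain.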

In the architecture, we will consider:

  • Business services that provide business functions (such as video segmentation, text clustering),
  • Technical services that are part of the baseline provided (such as security, data access layer, etc.),
  • Graphical User Interface (GUI) components that interact with users on one side and with the service bus on the other side to request processes or data.

A WebLab application (and thus the implementation of its use cases) is built by combining business services and technical services within the same process by means of orchestration tools.

Description of technical layers

The following figure provides a full overview of the WebLab technical architecture. It can be seen as a multi-layer architecture.

Overview of WebLab technical architecture.

From the top down, the architecture consists of the following layers:

  • An access layer providing to different users:
    • graphical user interfaces included in a web portal, for "final" users. These enable users to access services and functionalities in a usable way through a simple web browser: processing chains are activated by triggering interface functionalities. This can include multiple user roles and different functionalities to manage and administrate the application.
    • an administration and monitoring application for "admin" users. It makes it possible to control the execution of the processing chains and to check that they run correctly (service availability, absence of errors, etc.).
    • a business workflow design application for "architect" users and developers. This last part includes multiple tools, such as a workflow editor for defining service chains and tools to support deployment and configuration. These advanced tools are reserved for the technical staff in charge of setting up the global chains for the application.
  • A process layer that is responsible for acting as an intermediary between services in order to perform a business process. This orchestration layer embeds an execution engine that is able to run processes described in various languages (previously BPEL or pure Java, now Camel EIP).
  • A service bus layer in charge of the communication and distribution of messages between services according to the requests expressed by the orchestrator. It abstracts service location and implementation as well as the communication protocol. Every service access by the orchestrator and/or an upper client uses standardised service interfaces (expressed in WSDL), whereas the service actually called may be implemented by multiple services (or by a composition of services) located anywhere and possibly duplicated for automatic load balancing.
  • A service layer composed of the service interfaces, which rely on the set of standardised interfaces.
  • A component layer that includes the actual components hidden behind the standardised service interfaces and that implements the exposed methods. The components developed for the platform are located on this layer, and this is where the document processing is "really" done.
  • A data layer composed of the original data sources and of the multiple repositories used by the platform to store the original content of documents, their (XML) description, the multiple indexes and the knowledge bases.
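
The service bus layer above mentions services that are duplicated for automatic load balancing. The following Java sketch illustrates that idea under stated assumptions: the `RoundRobinEndpoint` class and the replica names are hypothetical, and round-robin is only one possible policy, not necessarily the one used by WebLab. Callers see a single endpoint while requests are spread over interchangeable replicas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class LoadBalancingSketch {

    /** Hypothetical endpoint hiding several replicas of the same service. */
    static final class RoundRobinEndpoint implements Function<String, String> {
        private final List<Function<String, String>> replicas;
        private final AtomicInteger next = new AtomicInteger();

        RoundRobinEndpoint(List<Function<String, String>> replicas) {
            if (replicas.isEmpty()) {
                throw new IllegalArgumentException("At least one replica is required");
            }
            // Defensive copy so later changes to the caller's list have no effect.
            this.replicas = new ArrayList<>(replicas);
        }

        @Override
        public String apply(String request) {
            // floorMod keeps the index valid even after the counter overflows.
            int index = Math.floorMod(next.getAndIncrement(), replicas.size());
            return replicas.get(index).apply(request);
        }
    }

    public static void main(String[] args) {
        Function<String, String> nodeA = request -> "node-a:" + request;
        Function<String, String> nodeB = request -> "node-b:" + request;

        List<Function<String, String>> replicas = new ArrayList<>();
        replicas.add(nodeA);
        replicas.add(nodeB);

        Function<String, String> endpoint = new RoundRobinEndpoint(replicas);
        for (int i = 0; i < 4; i++) {
            System.out.println(endpoint.apply("doc-" + i));
        }
    }
}
```

Since the endpoint has the same type as a single service, it could be published on a bus in place of one replica without any change on the consumer side.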

On both sides of the architecture illustration, we have two transverse functions which may be implemented in multiple ways in the platform:

  • Security and quality of service, which covers the management of user and server authentication and access rights, while providing ways to assess the quality of the services provided.
  • Monitoring and supervision, which makes it possible to inspect the platform state, including the status of services and components.

Apart from these layers, the platform also includes various technical components and services that are part of the baseline and support the basic functions of an application (such as messaging, data access layer, orchestration, security, etc.).

Further Reading

Data exchange model

Documents, and more generally resources, manipulated within a WebLab application are defined as objects that are linked to the end user's interests and identified by a URI. The resource concept includes all types of entities that may be processed by several services, as well as any kind of object that could be useful to a service in a specific task: documents, segments of documents, queries, etc. All elements of the model are described in the WebLab data exchange model page.
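
As a rough illustration of resources identified by a URI, here is a small Java sketch. It is not the WebLab data exchange model: the class names, the `weblab://example/...` URIs and the `#segment-n` naming scheme are assumptions made for this example only. It shows a document and its segments as resources that each carry their own URI.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class ResourceModelSketch {

    /** Anything a service may process or exchange, identified by a URI. */
    static class Resource {
        final URI uri;

        Resource(URI uri) {
            this.uri = uri;
        }
    }

    /** A span of a document's text, itself addressable as a resource. */
    static final class Segment extends Resource {
        final int start;
        final int end;

        Segment(URI uri, int start, int end) {
            super(uri);
            this.start = start;
            this.end = end;
        }
    }

    static final class Document extends Resource {
        final String text;
        final List<Segment> segments = new ArrayList<>();

        Document(URI uri, String text) {
            super(uri);
            this.text = text;
        }

        /** Creates a segment whose URI is derived from the document URI (assumed scheme). */
        Segment addSegment(int start, int end) {
            if (start < 0 || end > text.length() || start > end) {
                throw new IllegalArgumentException("Invalid segment bounds: " + start + ".." + end);
            }
            URI segmentUri = URI.create(uri + "#segment-" + segments.size());
            Segment segment = new Segment(segmentUri, start, end);
            segments.add(segment);
            return segment;
        }

        String textOf(Segment segment) {
            return text.substring(segment.start, segment.end);
        }
    }

    public static void main(String[] args) {
        Document document = new Document(URI.create("weblab://example/doc/1"), "Hello WebLab");
        Segment segment = document.addSegment(0, 5);
        System.out.println(segment.uri + " -> " + document.textOf(segment));
    }
}
```

Giving each segment its own URI lets a later service refer to a part of a document without receiving the whole document again.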

Services interfaces

The services integrated in the platform need to refer to a common scheme and to be as normalised as possible in order to ease the creation and enhancement of processing chains. Using a common data exchange model is a first step, but the expression of service methods must also be rationalised and the methods grouped into functional interfaces. The service interfaces, including the generic interfaces, the service taxonomy and the related standardised services, are described in the WebLab services interfaces page.

Graphical user interface

GUIs in WebLab are built using portlets, as defined by JSR-286, the Java Portlet API 2.0 (http://www.jcp.org/en/jsr/detail?id=286). The recommended portal is Liferay (http://www.liferay.com/), version 6.0. More information is available on the WebLab user interfaces page.

Technical components & services

Technical services are all the elements of the platform that help to glue together the multiple elements that compose a WebLab application. They are described in a dedicated page: Technical components & services.

TODO: This part has to be completed. This is outdated