WebLab EGC2012 Application

This version is a simple package of the WebLab application that was demonstrated at the EGC 2012 conference.

This example provides the following features:

  • document/webpage search and indexing
  • document/webpage structure and format support
  • document/webpage management through collection tagging

(see the Remarks section for feature support depending on your operating system)

To download this application, see EGC Application.

Quick start

  1. Download the archive containing the EGC application from EGC Application download.
  2. Unzip EGC_Application-1.1.zip.
  3. Go to the weblab-egc directory.
  4. Launch run.sh or run.bat from a console.

Your web browser will open at http://localhost:8080. After about 2 minutes (depending on your configuration), you should see the first page (first screenshot below).
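Because startup takes a couple of minutes, it can be convenient to poll the address until something answers. The following is a minimal sketch, not part of the application: the function names, the 300-second timeout and the 5-second interval are illustrative assumptions, and an HTTP answer only shows that the server is listening, not that the portal has finished loading.

```python
import time
import urllib.error
import urllib.request


def is_up(url):
    """Return True if the server at `url` answers an HTTP request at all."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: probably not started yet.
        return False


def wait_until_up(url, timeout=300.0, interval=5.0, probe=is_up, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `timeout` seconds were spent waiting.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running server.
    """
    waited = 0.0
    while True:
        if probe(url):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Calling wait_until_up("http://localhost:8080") after launching the script returns True once the address responds, or False if nothing answers within the timeout.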

Install

To install this application, unzip the archive and you are done!

On a Linux distribution, you can also add structure support, either with the following script or manually.

Script

Run this script: egc_tools_install.sh


Manual

To activate structure processing on documents, you have to install the following tools on your system (this feature can only be activated on Linux):

  • libreoffice/openoffice
  • pdftohtml
  • wkhtmltopdf

Refer to each tool's own documentation for detailed installation instructions.
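After a manual install, a quick way to check that the tools named in the Remarks section (libreoffice/openoffice, pdftohtml and wkhtmltopdf) are reachable is to look them up on the PATH. This is a rough sketch only: the executable names below, in particular soffice, libreoffice and openoffice, are assumptions that vary between distributions, and the application itself may locate the tools differently.

```python
import shutil

# Tool name -> candidate executable names.
# These executable names are assumptions; adjust them for your distribution.
REQUIRED_TOOLS = {
    "libreoffice/openoffice": ["soffice", "libreoffice", "openoffice"],
    "pdftohtml": ["pdftohtml"],
    "wkhtmltopdf": ["wkhtmltopdf"],
}


def missing_tools(which=shutil.which):
    """Return the tools for which no candidate executable is found on the PATH."""
    return [
        name
        for name, candidates in REQUIRED_TOOLS.items()
        if not any(which(exe) for exe in candidates)
    ]
```

An empty list from missing_tools() suggests all three tools are installed. Any name it returns is a tool still to install before structure processing can be activated.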

Launching

Go to the new weblab-egc directory and launch the script:

On Windows: > run.bat

On Linux: $ ./run.sh
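The choice between the two scripts can also be automated. The sketch below is one possible wrapper, assuming the archive was unzipped into a weblab-egc directory under the current directory and that run.sh is executable. The helper names launch_command and launch are hypothetical and not part of the application.

```python
import platform
import subprocess


def launch_command(system):
    """Return the command that starts the application for the given OS name."""
    if system == "Windows":
        # Batch files are run through the Windows command interpreter.
        return ["cmd", "/c", "run.bat"]
    return ["./run.sh"]


def launch(app_dir="weblab-egc"):
    """Start the launch script from inside the application directory.

    `app_dir` is an assumed location; pass the real path if it differs.
    """
    return subprocess.Popen(launch_command(platform.system()), cwd=app_dir)
```

launch() returns the subprocess.Popen handle, so the caller can wait on the process or terminate it later.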


Description

You can upload any document from the first page by dragging and dropping your files onto the specified area (the dark blue area on the first page).

The document will then be processed. Once processing is done, you will automatically be redirected to the document view (third screenshot).

To search for an element, click on the Result View tab. Write your query in the Document Search portlet. Results will be shown in the Results portlet.

Screenshots

  • Main first page:

egc-main.png

  • Search and results page:

egc-search.png

  • Annotated document page:

egc-docview.png


Remarks

  • Structure processing only works on Linux for the moment. You need to install libreoffice/openoffice, pdftohtml and wkhtmltopdf to activate this feature.
  • Webpage support is only available on Linux.

Technical details

This application uses the following services:

  • pdf-generator
  • pdf-normaliser
  • tika-normaliser
  • ngramj-language-extraction
  • gate-extraction
  • simple-file-repository
  • solr-indexer

This application uses the following portlets:

  • collection-viewer
  • custom-launch-viewer
  • results-viewer
  • annotated-document-viewer


Sources are available in our SVN repository, in the following branch:

svn://svn.forge.objectweb.org/svnroot/weblab/branches/WebLabApplications/EGC2012


External tools used: