Structure Normaliser Service/1.0
From WebLab Wiki
| Details | |
|---|---|
| Service Interfaces | Analyser |
| Exchange model | WebLab 1.2.2 |
| Versions | 1.0 |
| Licence | LGPL 2.1 |
| Supported OS | Linux |
| Binary | pdf-normalizer-service-1.0-SNAPSHOT.war |
| Sources | pdf-normalizer-service-1.0-SNAPSHOT-sources.jar |
| Javadoc | pdf-normalizer-service-1.0-SNAPSHOT-javadoc.jar |
| SVN | pdf-normalizer |
| Maven Artifact | <groupId>org.ow2.weblab.webservices</groupId> <artifactId>pdf-normalizer-service</artifactId> <version>1.0-SNAPSHOT</version> |
| Release Note | |
This service retrieves the dc:source annotation from a resource and tries to convert the referenced document to PDF. It supports documents on the local file system as well as documents fetched from the web over HTTP or FTP. Its goal is to normalise any multimedia document into WebLab resources without losing the structure information.
The service creates WebLab MediaUnits and refers to the structure ontology to annotate position, format and style as WebLab annotations.
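The source handling described above (file system, HTTP or FTP) can be sketched as a small classification step on the dc:source value. This is only an illustration, not the service's actual code: the class `DcSourceClassifier`, its `Kind` enum and the rule that a value without a scheme is treated as a local path are all assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper: decides how a dc:source value would have to be fetched. */
final class DcSourceClassifier {

    /** The source kinds named in the service description, plus a fallback. */
    enum Kind { LOCAL_FILE, HTTP, FTP, UNSUPPORTED }

    static Kind classify(String dcSource) {
        if (dcSource == null || dcSource.trim().isEmpty()) {
            return Kind.UNSUPPORTED;
        }
        final URI uri;
        try {
            uri = new URI(dcSource.trim());
        } catch (URISyntaxException e) {
            // Values that are not valid URIs (for example unescaped spaces) are rejected here.
            return Kind.UNSUPPORTED;
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            // Assumption: a value without a scheme is a plain file-system path.
            return Kind.LOCAL_FILE;
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "file":
                return Kind.LOCAL_FILE;
            case "http":
                return Kind.HTTP;
            case "ftp":
                return Kind.FTP;
            default:
                return Kind.UNSUPPORTED;
        }
    }

    public static void main(String[] args) {
        // The paths and hosts below are made up for the example.
        System.out.println(classify("/data/docs/report.pdf"));
        System.out.println(classify("http://example.org/report.pdf"));
        System.out.println(classify("ftp://example.org/report.pdf"));
    }
}
```

A real implementation would read the dc:source value from the resource's annotations through the WebLab helpers before a step like this; that part is left out because it depends on the WebLab API.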
Installation
You can install pdftohtml with the following script: egc_tools_install.sh
At the following prompt:
    $ ./egc_tools_install.sh
    What do you want to install ?
    All (by default, press enter)
    1. only Libreoffice
    2. only wkhtmltopdf
    3. only pdftohtml
select 3, then press enter.
The script will ask for your administrator password so that it can download the required libraries, compile pdftohtml from source and install it on your system.
Remark: most existing Linux distributions already have this library installed. However, those versions are too old: compatible versions of poppler start from 0.20.x.
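The version constraint in the remark above can be expressed as a simple comparison. The sketch below assumes the poppler version string (for example "0.20.5") has already been obtained, for instance by reading the output of `pdftohtml -v`; the class `PopplerVersionCheck` and its method are hypothetical names, not part of the service.

```java
/** Hypothetical helper: checks a poppler version string against the 0.20 minimum. */
final class PopplerVersionCheck {

    /** Returns true if the given version (e.g. "0.20.5") is 0.20 or later. */
    static boolean isCompatible(String version) {
        if (version == null) {
            return false;
        }
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            return false; // need at least "major.minor"
        }
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            // Any later major version, or 0.x with x >= 20, is accepted.
            return major > 0 || (major == 0 && minor >= 20);
        } catch (NumberFormatException e) {
            return false; // not a numeric version string
        }
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("0.18.4"));
        System.out.println(isCompatible("0.20.5"));
    }
}
```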
Configuration
none.
UsageContext effects
none.
Examples of SOAP Input/Output
Known Limitations
The Structure Normaliser can only process PDF files for the moment.
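Given this limitation, a caller may want to check that a document really is a PDF before submitting it. A minimal sketch, assuming the first bytes of the file are already in memory: PDF files begin with the ASCII header `%PDF-`, so comparing the leading bytes is a cheap pre-check. The class `PdfSniffer` is a hypothetical name and this is not how the service itself detects formats (its dependency list includes mime-util, which may be used for that purpose).

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: cheap check that some bytes start with the PDF header. */
final class PdfSniffer {

    // Every PDF file starts with "%PDF-" followed by the version number.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with the PDF header. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfHead = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] htmlHead = "<html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(pdfHead));
        System.out.println(looksLikePdf(htmlHead));
    }
}
```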
Dependencies
    org.ow2.weblab.webservices:pdf-normalizer-service:war:1.0
    +- jdom:jdom:jar:1.0:compile
    +- jaxen:jaxen:jar:1.1:compile
    |  +- dom4j:dom4j:jar:1.6.1:compile
    |  +- xml-apis:xml-apis:jar:1.3.02:compile
    |  \- xom:xom:jar:1.0:compile
    |     \- xerces:xmlParserAPIs:jar:2.6.2:compile
    +- org.ow2.weblab.core.helpers:rdf-helper-jena:jar:1.3.2:compile
    |  \- com.hp.hpl.jena:jena:jar:2.6.4:compile
    |     +- com.hp.hpl.jena:iri:jar:0.8:compile
    |     +- com.ibm.icu:icu4j:jar:3.4.4:compile
    |     +- org.slf4j:slf4j-api:jar:1.5.8:compile
    |     \- org.slf4j:slf4j-log4j12:jar:1.5.8:runtime
    +- org.apache.commons:commons-exec:jar:1.1:compile
    +- org.jsoup:jsoup:jar:1.6.1:compile
    +- net.sf.mime-util:mime-util:jar:1.2:compile
    |  \- log4j:log4j:jar:1.2.14:runtime
    +- org.ow2.weblab.core.helpers:rdf-helper-jena-selection:jar:1.5.3:compile
    |  \- joda-time:joda-time:jar:1.6.2:compile
    +- commons-io:commons-io:jar:2.0.1:compile
    +- org.ow2.weblab.components:content-manager:jar:1.8.4-SNAPSHOT:compile
    +- xerces:xercesImpl:jar:2.7.1:compile
    +- commons-codec:commons-codec:jar:1.6:compile
    +- org.ow2.weblab.core:model:jar:1.2.2:compile
    +- org.ow2.weblab.core:extended:jar:1.2.2:compile
    +- org.ow2.weblab.core:annotator:jar:1.2.4:compile
    +- org.apache.cxf:cxf-rt-frontend-jaxws:jar:2.4.0:compile
    |  +- xml-resolver:xml-resolver:jar:1.2:compile
    |  +- asm:asm:jar:3.3:compile
    |  +- org.apache.cxf:cxf-api:jar:2.4.0:compile
    |  |  +- org.apache.cxf:cxf-common-utilities:jar:2.4.0:compile
    |  |  +- org.apache.ws.xmlschema:xmlschema-core:jar:2.0:compile
    |  |  \- org.apache.neethi:neethi:jar:3.0.0:compile
    |  |     +- wsdl4j:wsdl4j:jar:1.6.2:compile
    |  |     \- org.codehaus.woodstox:woodstox-core-asl:jar:4.1.1:compile
    |  |        \- org.codehaus.woodstox:stax2-api:jar:3.0.2:compile
    |  +- org.apache.cxf:cxf-rt-core:jar:2.4.0:compile
    |  |  +- com.sun.xml.bind:jaxb-impl:jar:2.1.13:compile
    |  |  \- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1:compile
    |  +- org.apache.cxf:cxf-rt-bindings-soap:jar:2.4.0:compile
    |  |  +- org.apache.cxf:cxf-tools-common:jar:2.4.0:compile
    |  |  \- org.apache.cxf:cxf-rt-databinding-jaxb:jar:2.4.0:compile
    |  +- org.apache.cxf:cxf-rt-bindings-xml:jar:2.4.0:compile
    |  +- org.apache.cxf:cxf-rt-frontend-simple:jar:2.4.0:compile
    |  \- org.apache.cxf:cxf-rt-ws-addr:jar:2.4.0:compile
    +- org.apache.cxf:cxf-rt-transports-http:jar:2.4.0:compile
    |  +- org.apache.cxf:cxf-rt-transports-common:jar:2.4.0:compile
    |  \- org.springframework:spring-web:jar:3.0.5.RELEASE:compile
    |     +- aopalliance:aopalliance:jar:1.0:compile
    |     +- org.springframework:spring-beans:jar:3.0.5.RELEASE:compile
    |     +- org.springframework:spring-context:jar:3.0.5.RELEASE:compile
    |     |  +- org.springframework:spring-aop:jar:3.0.5.RELEASE:compile
    |     |  +- org.springframework:spring-expression:jar:3.0.5.RELEASE:compile
    |     |  \- org.springframework:spring-asm:jar:3.0.5.RELEASE:compile
    |     \- org.springframework:spring-core:jar:3.0.5.RELEASE:compile
    +- xalan:xalan:jar:2.7.1:compile
    |  \- xalan:serializer:jar:2.7.1:compile
    +- commons-logging:commons-logging:jar:1.1.1:compile
    +- junit:junit:jar:4.8.2:test
    \- javax.servlet:servlet-api:jar:2.4:provided

