Difference between revisions of "WebLab Source Connector"

From WebLab Wiki

Latest revision as of 07:19, 24 May 2012

In order to start processing content, one must first get content from a source. Whatever the source and whatever the content, the WebLab platform hides it behind the generic Queue Manager interface. This interface acts as a simple queue that produces resources to be processed one by one. Its only accessible method is nextResource(), which either returns the next Resource or throws an exception when the queue is finished.
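
The contract described above can be sketched in Java as follows. This is only an illustration under stated assumptions: ResourceQueue, EmptyQueueException, InMemoryQueue and drain are hypothetical names, not the actual WebLab Queue Manager API, and a resource is reduced to a plain String.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-ins for illustration only; this is not the WebLab API.
class QueueSketch {

    /** Hypothetical exception signalling that the queue is finished. */
    static class EmptyQueueException extends Exception {
        EmptyQueueException(String message) {
            super(message);
        }
    }

    /** Hypothetical single-method contract: one resource per call. */
    interface ResourceQueue {
        String nextResource() throws EmptyQueueException;
    }

    /** Trivial in-memory queue, used here in place of a real source. */
    static class InMemoryQueue implements ResourceQueue {
        private final Deque<String> pending;

        InMemoryQueue(List<String> resources) {
            this.pending = new ArrayDeque<>(resources);
        }

        @Override
        public String nextResource() throws EmptyQueueException {
            if (pending.isEmpty()) {
                throw new EmptyQueueException("queue is finished");
            }
            return pending.poll();
        }
    }

    /** Consumes resources one by one until the queue reports it is finished. */
    static List<String> drain(ResourceQueue queue) {
        List<String> processed = new ArrayList<>();
        while (true) {
            try {
                processed.add(queue.nextResource());
            } catch (EmptyQueueException finished) {
                return processed;
            }
        }
    }
}
```

In this sketch the caller treats the exception as the normal end-of-queue signal, which mirrors the behaviour described for nextResource().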

TODO: This part has to be completed.


Specification

Basic implementation

To explain the basic implementation, we take the example of a simple connector to an existing file system. The objective of the connector is to produce, one by one, the files that the file system contains. On each call to the nextResource() method, the service will:

  1. build an XML description of the file with at least:
     • a unique identifier, which will be the resource URI;
     • an annotation linking the WebLab resource to its original source;
  2. upload the content of the file to the content repository of the application;
  3. add to the XML description the link (i.e. the URL) to the uploaded content;
  4. send back the resource XML as the answer to the call.
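
The steps above can be sketched in Java as follows. This is a minimal sketch and not the actual WebLab connector: the class name FileSystemConnectorSketch, the weblab://fileConnector URI scheme and the file naming in the repository are assumptions, the content repository is simulated by a local folder, and java.util.NoSuchElementException stands in for whatever exception the real interface throws when the queue is finished.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

// Hypothetical sketch: names, URI scheme and repository layout are assumptions.
class FileSystemConnectorSketch {
    private final Deque<Path> pending = new ArrayDeque<>();
    private final Path contentRepository;
    private int counter = 0;

    FileSystemConnectorSketch(Path sourceFolder, Path contentRepository) throws IOException {
        this.contentRepository = Files.createDirectories(contentRepository);
        // Queue every regular file of the source folder (iteration order is unspecified).
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sourceFolder)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    pending.add(file);
                }
            }
        }
    }

    String nextResource() throws IOException {
        Path file = pending.poll();
        if (file == null) {
            throw new NoSuchElementException("queue is finished");
        }
        int id = counter++;
        // Step 1: unique identifier (resource URI) and link to the original source.
        String uri = "weblab://fileConnector/file" + id;
        String source = file.toUri().toString();
        // Step 2: "upload" the content, simulated here by a copy into a local folder.
        Path stored = contentRepository.resolve("weblab." + id + ".content");
        Files.copy(file, stored, StandardCopyOption.REPLACE_EXISTING);
        // Steps 3 and 4: add the link to the stored content and return the XML.
        return "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
            + "<resource xsi:type=\"ns3:Document\" uri=\"" + uri + "\""
            + " xmlns:ns3=\"http://weblab.ow2.org/core/1.2/model#\""
            + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n"
            + "  <annotation uri=\"" + uri + "#a0\">\n"
            + "    <data>\n"
            + "      <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
            + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n"
            + "        <rdf:Description rdf:about=\"" + uri + "\""
            + " xmlns:wp=\"http://weblab.ow2.org/core/1.2/ontology/processing#\">\n"
            + "          <dc:source>" + escape(source) + "</dc:source>\n"
            + "          <wp:hasNativeContent rdf:resource=\""
            + escape(stored.toUri().toString()) + "\"/>\n"
            + "        </rdf:Description>\n"
            + "      </rdf:RDF>\n"
            + "    </data>\n"
            + "  </annotation>\n"
            + "</resource>\n";
    }

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Building the XML by string concatenation keeps the sketch short; a real connector would more likely rely on the WebLab model classes or an XML library.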

Usually the returned Resource is a Document, so the final returned XML looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<resource xsi:type="ns3:Document" uri="weblab://openSearch-1974947600/hit8" xmlns:ns3="http://weblab.ow2.org/core/1.2/model#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <annotation uri="weblab://openSearch-1974947600/hit8#a0">
        <data>
            <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
                <rdf:Description rdf:about="weblab://openSearch-1974947600/hit8" xmlns:wp="http://weblab.ow2.org/core/1.2/ontology/processing#">
                    <dc:title>Discounter-News: Bodenstaubsauger von Norma im Test</dc:title>
                    <dc:source>http://www.discounter-archiv.de/de/news/Bodenstaubsauger-von-Norma-im-Test/</dc:source>
                    <wp:isProducedBy rdf:resource="weblab://searchConnector"/>
                    <wp:hasNativeContent rdf:resource="file:/home/gdupont/Dev/workspaces/ow2space/org.ow2.weblab.service.search-connector-service/target/data/content/weblab.7502704667831429262.content"/>
                </rdf:Description>
            </rdf:RDF>
        </data>
    </annotation>
</resource>
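
As an illustration of how a consumer might use such a description, the following sketch extracts the native content link from the returned XML with the standard JAXP DOM parser. The class and method names (ResourceXmlReader, nativeContentLink) are hypothetical; only the two namespace URIs are taken from the sample above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of WebLab: reads wp:hasNativeContent from a resource XML.
class ResourceXmlReader {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String WP_NS = "http://weblab.ow2.org/core/1.2/ontology/processing#";

    /** Returns the rdf:resource value of the first wp:hasNativeContent, or null if absent. */
    static String nativeContentLink(String resourceXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness is required to look elements up by namespace URI.
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(resourceXml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagNameNS(WP_NS, "hasNativeContent");
        if (nodes.getLength() == 0) {
            return null;
        }
        return ((Element) nodes.item(0)).getAttributeNS(RDF_NS, "resource");
    }
}
```

Looking elements up by namespace URI rather than by prefix keeps the reader independent of the prefixes (ns3, wp, and so on) that the producing service happens to emit.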

Advanced implementation

Special cases

Existing search service

Examples

JAVA sample