de.hpi.fgis.voidgen.hadoop.tasks
Class ClusterPatterns

java.lang.Object
  extended by de.hpi.fgis.voidgen.hadoop.Driver
      extended by de.hpi.fgis.voidgen.hadoop.tasks.ClusterPatterns
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class ClusterPatterns
extends Driver

Creates URI patterns for the entities of each cluster. Each pattern describes the URIs of the elements contained in that cluster.
The pattern generation was developed for URI-based clustering. Patterns can be created for any clustering, but they are expected to be most useful for clusters produced by URI-based clustering.
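
To make the idea of a URI pattern concrete, the following is a minimal sketch and not the implementation used by ClusterInfoPatternStep1Reducer or ClusterInfoPatternStep2Reducer. It assumes that URIs are split at "/" and that every depth level is summarized independently: a level with more distinct values than the allowed number of alternatives becomes a wildcard. The class name UriPatternSketch, the method pattern, the "(a|b)" notation, the "*" wildcard sign and the example.org URIs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; the real reducers may work differently.
class UriPatternSketch {

    // Summarizes the URIs of one cluster into a single pattern.
    // 'alternatives' is the maximum number of distinct values kept per depth level.
    static String pattern(List<String> uris, int alternatives) {
        // levels.get(d) holds the distinct segments seen at depth d (sorted).
        List<Set<String>> levels = new ArrayList<>();
        for (String uri : uris) {
            String[] parts = uri.split("/", -1); // -1 keeps empty segments, e.g. in "http://"
            for (int depth = 0; depth < parts.length; depth++) {
                if (levels.size() <= depth) {
                    levels.add(new TreeSet<>());
                }
                levels.get(depth).add(parts[depth]);
            }
        }

        List<String> out = new ArrayList<>();
        for (Set<String> values : levels) {
            if (values.size() > alternatives) {
                out.add("*");                                   // too many alternatives: wildcard
            } else if (values.size() == 1) {
                out.add(values.iterator().next());              // single value: keep it literally
            } else {
                out.add("(" + String.join("|", values) + ")");  // few alternatives: list them
            }
        }
        return String.join("/", out);
    }

    public static void main(String[] args) {
        List<String> uris = Arrays.asList(
            "http://example.org/person/alice",
            "http://example.org/person/bob",
            "http://example.org/place/berlin");
        // prints http://example.org/(person|place)/*
        System.out.println(pattern(uris, 2));
    }
}
```

With a limit of 2, the two values at the fourth level are kept as alternatives, while the three values at the last level exceed the limit and collapse into the wildcard.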

    Hints:
  1. See ClusterInfoPatternStep1Reducer for how the patterns are affected when a clustering other than URI-based clustering is used.
  2. ClusterInfoPatternStep2Reducer shows unexpected behavior that can cause some patterns to be lost if multiple patterns exist for the same URI part.

The following table lists the properties that need to be set.

property name | description | example value
de.hpi.fgis.voidgen.hadoop.tasks.ClusterPatterns.input_paths | The input paths containing RDF quadruples with subject and object cluster set. | voidGen/joined
de.hpi.fgis.voidgen.hadoop.tasks.ClusterPatterns.temporary_path | The path for temporary MapReduce output. | voidGen/temporary_path
de.hpi.fgis.voidgen.hadoop.tasks.ClusterPatterns.output_path | The output path containing the patterns for the clusters. | voidGen/patterns
de.hpi.fgis.voidgen.hadoop.tasks.clusterinformation.ClusterInfoPatternStep1Reducer.alternatives | Optional. The default value is '5'. The number of alternatives allowed at each level of depth within a URL. If a level has more alternatives, a wildcard sign is used instead. | 5
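
Because ClusterPatterns implements org.apache.hadoop.util.Tool, one plausible way to supply these properties is as generic "-D name=value" options, which Hadoop's ToolRunner parses into the job configuration before calling run. That Driver reads the properties from the configuration in this way is an assumption; the page does not say so. The sketch below only assembles such an argument list from the example values in the table. The class ClusterPatternsArgs and its methods are hypothetical helpers, not part of voidGen.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds "-D name=value" options for the ClusterPatterns task.
class ClusterPatternsArgs {

    // Property names copied from the table above.
    static final String PREFIX = "de.hpi.fgis.voidgen.hadoop.tasks.ClusterPatterns.";
    static final String ALTERNATIVES =
        "de.hpi.fgis.voidgen.hadoop.tasks.clusterinformation.ClusterInfoPatternStep1Reducer.alternatives";

    // Returns the example values from the table, in table order.
    static Map<String, String> exampleProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(PREFIX + "input_paths", "voidGen/joined");
        props.put(PREFIX + "temporary_path", "voidGen/temporary_path");
        props.put(PREFIX + "output_path", "voidGen/patterns");
        props.put(ALTERNATIVES, "5"); // optional; 5 is the documented default
        return props;
    }

    // Turns each property into the two tokens "-D" and "name=value".
    static List<String> toGenericOptions(Map<String, String> props) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            args.add("-D");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] unused) {
        // Prints the options on one line, ready to be appended to a launch command.
        System.out.println(String.join(" ", toGenericOptions(exampleProperties())));
    }
}
```

The resulting list could be passed as the argument array to ToolRunner.run together with a new ClusterPatterns instance, assuming the voidGen and Hadoop classes are on the classpath.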

Author:
Dandy Fenz, Hasso Plattner Institute at University of Potsdam, Germany
Matthias Pohl, Hasso Plattner Institute at University of Potsdam, Germany
Johannes Gosda, Hasso Plattner Institute at University of Potsdam, Germany

Constructor Summary
ClusterPatterns()
           
 
Method Summary
 int run(java.lang.String[] args)
           
 
Methods inherited from class de.hpi.fgis.voidgen.hadoop.Driver
getConf, getPath, getPaths, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClusterPatterns

public ClusterPatterns()
Method Detail

run

public int run(java.lang.String[] args)
        throws java.lang.Exception
Throws:
java.lang.Exception