de.hpi.fgis.voidgen.hadoop.tasks
Class InputStatistics

java.lang.Object
  extended by de.hpi.fgis.voidgen.hadoop.Driver
      extended by de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class InputStatistics
extends Driver

Reads all RDF quadruples of the input and counts the distinct subjects, predicates, objects, contexts and resources.
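
To make the counted quantities concrete, here is a minimal in-memory sketch of counting distinct values per quadruple position. It is not the Hadoop job itself: the class name `DistinctQuadCounts`, the representation of a quadruple as a four-element `String[]`, and the sample data are all assumptions made for illustration. The "resources" count is left out because this page does not define which terms it covers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of voidGen.
class DistinctQuadCounts {

    /**
     * Counts distinct values per position.
     * Each quad is assumed to be {subject, predicate, object, context}.
     * Returns the counts in that same order.
     */
    static int[] count(List<String[]> quads) {
        List<Set<String>> distinct = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            distinct.add(new HashSet<>());
        }
        for (String[] quad : quads) {
            for (int i = 0; i < 4; i++) {
                distinct.get(i).add(quad[i]);
            }
        }
        int[] counts = new int[4];
        for (int i = 0; i < 4; i++) {
            counts[i] = distinct.get(i).size();
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample quadruples.
        List<String[]> quads = Arrays.asList(
            new String[] {"ex:a", "ex:knows", "ex:b", "ex:g1"},
            new String[] {"ex:a", "ex:knows", "ex:c", "ex:g1"},
            new String[] {"ex:b", "ex:name", "\"Bob\"", "ex:g2"});
        // Prints [2, 2, 3, 2]: subjects, predicates, objects, contexts.
        System.out.println(Arrays.toString(count(quads)));
    }
}
```

A distributed job cannot hold such sets in memory for large inputs, so the actual implementation presumably works differently; the sketch only shows what "distinct" means for each position.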

The following table lists the properties that must be set, unless marked optional.

property name | description | example value
de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics.temporary_output_path | The output path for the Hadoop job. | voidGen/temp
de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics.input_paths | The comma-separated list of input paths. | voidGen/input2
de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics.hdfs_output_path | Optional. The output file containing the statistics (path within HDFS). | voidGen/output.txt
de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics.s3_output_path | Optional. The file in Amazon S3 containing the statistics. | s3n://my.bucket/input_stats.txt
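
As a rough illustration of how these properties might be supplied, the sketch below builds `-D name=value` argument pairs, the form accepted by Hadoop's generic options parser when a `Tool` is started through `ToolRunner`. Whether `InputStatistics` reads its properties from the Hadoop configuration in this way is an assumption; this page only shows that the class implements `Tool` and `Configurable`. The helper class `InputStatisticsArgs` and its method are hypothetical, and the example values are the ones from the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of voidGen.
class InputStatisticsArgs {

    // Common prefix of the property names listed in the table.
    static final String PREFIX = "de.hpi.fgis.voidgen.hadoop.tasks.InputStatistics.";

    /**
     * Builds "-D name=value" pairs for the documented properties.
     * The two optional output paths may be null and are then omitted.
     */
    static List<String> buildArgs(String temporaryOutputPath, List<String> inputPaths,
                                  String hdfsOutputPath, String s3OutputPath) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(PREFIX + "temporary_output_path", temporaryOutputPath);
        // input_paths is documented as a comma-separated list.
        props.put(PREFIX + "input_paths", String.join(",", inputPaths));
        if (hdfsOutputPath != null) {
            props.put(PREFIX + "hdfs_output_path", hdfsOutputPath);
        }
        if (s3OutputPath != null) {
            props.put(PREFIX + "s3_output_path", s3OutputPath);
        }

        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            args.add("-D");
            args.add(entry.getKey() + "=" + entry.getValue());
        }
        return args;
    }

    public static void main(String[] unused) {
        List<String> args = buildArgs(
            "voidGen/temp", Arrays.asList("voidGen/input2"), "voidGen/output.txt", null);
        System.out.println(String.join(" ", args));
        // With Hadoop and voidGen on the classpath, the arguments could then be
        // handed to the tool (assumption, not verified against voidGen):
        // int exitCode = org.apache.hadoop.util.ToolRunner.run(
        //     new InputStatistics(), args.toArray(new String[0]));
    }
}
```

Building the arguments separately keeps the sketch runnable without Hadoop; the commented-out `ToolRunner.run` call marks where the real job would start.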

Author:
Dandy Fenz, Hasso Plattner Institute at University of Potsdam, Germany
Matthias Pohl, Hasso Plattner Institute at University of Potsdam, Germany
Johannes Gosda, Hasso Plattner Institute at University of Potsdam, Germany

Constructor Summary
InputStatistics()
           
 
Method Summary
 int run(java.lang.String[] args)
           
 
Methods inherited from class de.hpi.fgis.voidgen.hadoop.Driver
getConf, getPath, getPaths, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputStatistics

public InputStatistics()
Method Detail

run

public int run(java.lang.String[] args)
        throws java.lang.Exception
Throws:
java.lang.Exception