Package de.hpi.fgis.voidgen.hadoop.tasks

This package collects all tasks used for creating VoID descriptions.

Class Summary
ClusterDescription Generates the textual description of clusters.
ClusteringConnectionBased Clusters the input RDF quadruples according to the structure of the graph of interlinked resources.
ClusteringUriBased Clusters the input data based solely on the subject and object URIs.
ClusterPatterns Creates the patterns of entities of the different clusters.
ClusterQuadrupleJoin Sets, for each RDF quadruple of the input, the cluster its subject belongs to and the cluster its object belongs to.
ClusterSizeDriver Counts, for each cluster, the number of unique nodes belonging to it.
DataSetDescription The driver for aggregating all cluster data created by various MapReduce jobs.
DistinctClustering Selects a single concept type for each subject and assigns the resource to the respective data set.
InputStatistics Reads all RDF quadruples of the input and counts the distinct subjects, predicates, objects, contexts and resources.
KSimilarity Finds k-similar subjects.
LinkSetDetection Detects and counts link sets between different data sets.
ToVoid Converts data set descriptions into data set descriptions in VoID format.
VocabularyDetection Identifies and counts all the vocabularies used in the processed data set.
 

Package de.hpi.fgis.voidgen.hadoop.tasks Description

The initial input for most of the tasks is a set of RDF quadruples stored in N-Quads format.
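
As a rough illustration of this input, the following sketch splits one N-Quads statement into its subject, predicate, object and optional context. The class name NQuadSketch and the method parse are hypothetical and not part of this package, and the example URIs are made up. The tokenizer is deliberately simplified: it assumes terms are separated by whitespace and does not validate them, so it is not a full N-Quads parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of de.hpi.fgis.voidgen.hadoop.tasks.
public class NQuadSketch {

    /**
     * Splits one N-Quads statement into 3 or 4 terms:
     * subject, predicate, object and (optionally) context.
     * Assumption: terms are separated by whitespace.
     */
    static List<String> parse(String line) {
        List<String> terms = new ArrayList<>();
        int i = 0;
        int n = line.length();
        while (i < n) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            // A '.' followed only by whitespace terminates the statement.
            if (c == '.' && line.substring(i + 1).trim().isEmpty()) {
                break;
            }
            int start = i;
            if (c == '<') {
                // IRI: read up to and including the closing '>'.
                i = line.indexOf('>', i) + 1;
                if (i == 0) {
                    throw new IllegalArgumentException("Unterminated IRI: " + line);
                }
            } else if (c == '"') {
                // Literal: read up to the closing quote, skipping escaped characters.
                i++;
                while (i < n && line.charAt(i) != '"') {
                    if (line.charAt(i) == '\\') {
                        i++;
                    }
                    i++;
                }
                if (i >= n) {
                    throw new IllegalArgumentException("Unterminated literal: " + line);
                }
                i++; // closing quote
                // Optional language tag (@en) or datatype (^^<iri>) suffix.
                while (i < n && !Character.isWhitespace(line.charAt(i))) {
                    i++;
                }
            } else {
                // Anything else (for example a blank node such as _:b1).
                while (i < n && !Character.isWhitespace(line.charAt(i))) {
                    i++;
                }
            }
            terms.add(line.substring(start, i));
        }
        if (terms.size() < 3 || terms.size() > 4) {
            throw new IllegalArgumentException("Expected 3 or 4 terms: " + line);
        }
        return terms;
    }

    public static void main(String[] args) {
        String line = "<http://example.org/alice> <http://example.org/name> "
                + "\"Alice\"@en <http://example.org/graph1> .";
        String[] roles = {"subject", "predicate", "object", "context"};
        List<String> terms = parse(line);
        for (int k = 0; k < terms.size(); k++) {
            System.out.println(roles[k] + ": " + terms.get(k));
        }
    }
}
```

The fourth term is the context (graph label) that distinguishes a quadruple from a plain triple; statements without it yield only three terms.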

The following diagram shows the different tasks and their dependencies, that is, which tasks have to be executed before others.