The driver that aggregates all cluster data
created by the various MapReduce jobs. It creates a
description for each cluster.
User-specified additional properties
can be added to each description.
An additional property is specified as a property
in the configuration of this job.
The property's name must start with the prefix
"de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.";
the substring following the prefix becomes
the name of the property set on each cluster.
Example:
name: de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.voidGen:clusteringAlgorithm
value: uriBasedClustering
results in every cluster having the property:
name: voidGen:clusteringAlgorithm
value: uriBasedClustering
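The prefix rule above can be sketched as follows. This is a minimal illustration, not the reducer's actual code: the class AdditionalPropertySketch and the method extractAdditionalProperties are hypothetical, and a plain Map stands in for the job configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the voidGen code base.
class AdditionalPropertySketch {

    // Prefix taken from the description above.
    static final String PREFIX =
        "de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.";

    // Collects every entry whose name starts with PREFIX and
    // stores it under the name that remains after the prefix.
    static Map<String, String> extractAdditionalProperties(Map<String, String> config) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String name = entry.getKey();
            // Skip unrelated entries and an entry consisting of the bare prefix.
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                result.put(name.substring(PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A plain map is an assumption standing in for the job configuration.
        Map<String, String> config = new LinkedHashMap<>();
        config.put(PREFIX + "voidGen:clusteringAlgorithm", "uriBasedClustering");
        config.put("some.unrelated.property", "ignored");
        System.out.println(extractAdditionalProperties(config));
    }
}
```

Only the prefixed entry survives, under the shortened name voidGen:clusteringAlgorithm; entries without the prefix are ignored.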
The following table lists the properties to set. Properties marked as optional may be omitted.
property name | description | example value |
de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.cluster_size_paths | The output path containing pairs of cluster identifier and cluster size. | voidGen/clustering2_size |
de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.cluster_description_paths | The output path for the created cluster descriptions (example entity and significant predicates). | voidGen/descriptions |
de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.cluster_pattern_paths | The output path containing the patterns for the clusters. | voidGen/patterns |
de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.temporary_path | The temporary output path for adaption MapReduce jobs. | voidGen/dataset_temp |
de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.void_output_path | The output path containing cluster (data set) descriptions. The descriptions are aggregated from description parts generated by previous MapReduce jobs. | voidGen/void_descriptions |
de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.voidGen:clusteringAlgorithm | Optional. Additional property for each cluster (data set) naming the clustering algorithm used to create the clusters. | hierarchical_clustering |
de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.voidGen:clusteringPredicate | Optional. Additional property for each cluster (data set) naming the property used for the hierarchical clustering. | owl:sameAs |
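
As a rough sketch, the properties from the table could be assembled and checked before the job is started. The class DescriptionJobConfigSketch and its methods are hypothetical, and a plain Map again stands in for the job configuration; in a real job the same names and values would presumably be set on the Hadoop configuration, for example with Configuration.set(name, value).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the voidGen code base.
class DescriptionJobConfigSketch {

    // Common prefix of the path properties listed in the table.
    static final String BASE = "de.hpi.fgis.voidgen.hadoop.tasks.DataSetDescription.";

    // Prefix of the optional additional cluster properties.
    static final String ADDITIONAL =
        "de.hpi.fgis.voidgen.hadoop.tasks.datasetdescription.DescriptionAggregationReducer.";

    // Suffixes of the properties that are not marked as optional in the table.
    static final String[] REQUIRED = {
        "cluster_size_paths",
        "cluster_description_paths",
        "cluster_pattern_paths",
        "temporary_path",
        "void_output_path"
    };

    // Builds a configuration that uses the example values from the table.
    static Map<String, String> exampleConfiguration() {
        Map<String, String> config = new LinkedHashMap<>();
        config.put(BASE + "cluster_size_paths", "voidGen/clustering2_size");
        config.put(BASE + "cluster_description_paths", "voidGen/descriptions");
        config.put(BASE + "cluster_pattern_paths", "voidGen/patterns");
        config.put(BASE + "temporary_path", "voidGen/dataset_temp");
        config.put(BASE + "void_output_path", "voidGen/void_descriptions");
        // Optional additional properties for each cluster.
        config.put(ADDITIONAL + "voidGen:clusteringAlgorithm", "hierarchical_clustering");
        config.put(ADDITIONAL + "voidGen:clusteringPredicate", "owl:sameAs");
        return config;
    }

    // Returns the full names of all required properties that are absent or blank.
    static List<String> missingProperties(Map<String, String> config) {
        List<String> missing = new ArrayList<>();
        for (String suffix : REQUIRED) {
            String value = config.get(BASE + suffix);
            if (value == null || value.trim().isEmpty()) {
                missing.add(BASE + suffix);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> config = exampleConfiguration();
        System.out.println("missing: " + missingProperties(config));
    }
}
```

Checking for missing names up front is a design choice of this sketch; the documentation above does not state how the driver itself reacts to an unset property.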