java.lang.Object
  de.hpi.fgis.dude.postprocessor.StatisticComponent
public class StatisticComponent
StatisticComponent provides functionality for gathering statistics concerning recall, precision and f-measure. For this purpose, a collection of real duplicates (a gold standard) has to be added.
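The three quality measures named above follow the usual definitions in terms of true positives, false positives and false negatives. The following is a minimal, self-contained sketch of those formulas, not DuDe's implementation: the class `QualityMeasures` and its method names are hypothetical, and returning 0.0 for an empty denominator is an assumption (this page does not say how StatisticComponent handles that case).

```java
/** Illustrative helper, not part of DuDe: quality measures from confusion counts. */
final class QualityMeasures {
    private QualityMeasures() {}

    /** precision = TP / (TP + FP); assumed to be 0.0 if nothing was reported as a duplicate. */
    static double precision(long truePositives, long falsePositives) {
        long reported = truePositives + falsePositives;
        return reported == 0 ? 0.0 : (double) truePositives / reported;
    }

    /** recall = TP / (TP + FN); assumed to be 0.0 if there are no real duplicates. */
    static double recall(long truePositives, long falseNegatives) {
        long real = truePositives + falseNegatives;
        return real == 0 ? 0.0 : (double) truePositives / real;
    }

    /** f-measure = harmonic mean of precision and recall. */
    static double fMeasure(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    public static void main(String[] args) {
        double p = precision(8, 2); // 8 of 10 reported pairs are real duplicates
        double r = recall(8, 8);    // 8 of 16 real duplicates were found
        System.out.println("precision=" + p + " recall=" + r + " f-measure=" + fMeasure(p, r));
    }
}
```

In terms of this class, the inputs would correspond to the values returned by getTruePositives(), getFalsePositives() and getFalseNegatives().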
Field Summary | |
---|---|
protected long | actualComparisonCount |
protected Algorithm | algorithm |
protected boolean | checkMemory |
protected Date | endDate |
protected long | endTime |
protected long | falseNegativesByComparison |
protected long | falsePositives |
protected long | falsePositivesByComparison |
protected GoldStandard | goldStandard |
protected long | memoryCheckFrequency |
protected static int | NO_STATISTIC_VALUE |
protected long | pairCount |
protected Date | startDate |
protected long | startTime |
protected MemoryCheckerTask | task |
protected Timer | timer |
protected long | trueNegativesByComparison |
protected long | truePositives |
protected long | truePositivesByComparison |
Constructor Summary | | |
---|---|---|
protected | StatisticComponent() | Internal constructor for Jsonable deserialization. |
public | StatisticComponent(Algorithm algorithm) | Initializes a StatisticComponent with no gold standard. |
public | StatisticComponent(GoldStandard goldStandard, Algorithm algorithm) | Initializes a StatisticComponent using the passed DuDeObjectPairs as real duplicates. |
Method Summary | | |
---|---|---|
void | addDuplicate(DuDeObjectPair pair) | Adds a DuDeObjectPair to the knowledge base that is labeled as a detected duplicate. |
void | addDuplicate(DuDeObjectPair pair, boolean actualComparison) | Adds a DuDeObjectPair to the knowledge base that is labeled as a detected duplicate. |
void | addDuplicate(Iterable<DuDeObjectPair> pairs) | Adds several DuDeObjectPairs to the knowledge base that are labeled as detected duplicates. |
void | addNonDuplicate(DuDeObjectPair pair) | Adds a DuDeObjectPair to the knowledge base that is labeled as a detected non-duplicate. |
void | addNonDuplicate(DuDeObjectPair pair, boolean actualComparison) | Adds a DuDeObjectPair to the knowledge base that is labeled as a detected non-duplicate. |
void | addNonDuplicate(Iterable<DuDeObjectPair> pairs) | Adds several DuDeObjectPairs to the knowledge base that are labeled as detected non-duplicates. |
void | addPair(DuDeObjectPair pair, boolean positive) | Adds a DuDeObjectPair to the knowledge base. |
void | addPair(Iterable<DuDeObjectPair> pairs, boolean positive) | Adds several DuDeObjectPairs to the knowledge base. |
protected void | checkMemoryUsage() | Starts the memory usage task. |
String | getAverageMemoryUsed() | Gets the registered average amount of memory used during the experiment. |
long | getComparisonCount() | Returns the number of pairs that were already compared. |
Date | getEndDate() | Gets the date of the specified end time of an algorithm. |
long | getFalseNegatives() | Returns the false negatives count. |
long | getFalseNegativesByComparison() | Returns the count of false negatives that are explicitly classified by the comparator. |
long | getFalsePositives() | Returns the false positives count. |
long | getFalsePositivesByComparison() | Returns the count of false positives that are explicitly classified by the comparator. |
double | getFMeasure() | Returns the f-measure based on the current knowledge base. |
double | getFMeasureByComparison() | Returns the f-measure based on the current knowledge base and the actual comparisons. |
String | getMaximumMemoryUsed() | Gets the registered maximum amount of memory used during the experiment. |
long | getMemoryCheckFrequency() | Gets the frequency of memory checks. |
String | getMinimumMemoryUsed() | Gets the registered minimum amount of memory used during the experiment. |
long | getNumberOfCandidateComparisons() | Returns the maximum number of pairs that would be generated by the naive approach. |
long | getNumberOfRealDuplicates() | Returns the size of the gold standard. |
long | getObjectCount() | Returns the number of records that were processed by the algorithm. |
long | getPairCount() | Returns the number of pairs that were already considered. |
double | getPrecision() | Returns the precision based on the current knowledge base. |
double | getPrecisionByComparison() | Returns the precision based on the current knowledge base and the actual comparisons. |
double | getRecall() | Returns the recall based on the current knowledge base. |
double | getRecallByComparison() | Returns the recall based on the current knowledge base and the actual comparisons. |
double | getReductionRatio() | Returns the reduction ratio based on the current knowledge base. |
double | getReductionRatioByComparison() | Returns the reduction ratio based on the current knowledge base and the actual comparisons. |
long | getRuntime() | Gets the time difference between beginning time and finishing time. |
Date | getStartDate() | Gets the date of the specified start time of an algorithm. |
long | getTrueNegatives() | Returns the true negatives count. |
long | getTrueNegativesByComparison() | Returns the count of true negatives that are explicitly classified by the comparator. |
long | getTruePositives() | Returns the true positives count. |
long | getTruePositivesByComparison() | Returns the count of true positives that are explicitly classified by the comparator. |
boolean | goldStandardSet() | Checks whether a gold standard was passed. |
boolean | hasGMD() | Checks whether this StatisticComponent calculates the Generalized Merge Distance. |
boolean | isCheckMemory() | Gets the boolean flag that indicates the activation status of memory checking. |
boolean | isDuplicate(DuDeObjectPair pair) | Returns true if the DuDeObjectPair exists in the set of real duplicate pairs. |
boolean | isNonDuplicate(DuDeObjectPair pair) | Checks whether a specific pair is absent from the set of real duplicate pairs (returns false if it is present). |
void | reset() | Sets truePositives, falsePositives, truePositivesByComparison, falsePositivesByComparison, trueNegativesByComparison, falseNegativesByComparison, pairCount and actualComparisonCount to 0. |
void | setBeginningTime() | Deprecated. Replaced by setStartTime(). |
void | setCheckMemory(boolean checkMemory) | Sets the boolean flag that indicates the activation status of memory checking. |
void | setEndTime() | Sets the current time as finishing time for the runtime. |
void | setFinishingTime() | Deprecated. Replaced by setEndTime(). |
void | setMemoryCheckFrequency(long memoryCheckFrequency) | Sets the frequency of memory checks. |
void | setStartTime() | Sets the current time as starting time for the runtime and initiates memory monitoring. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final int NO_STATISTIC_VALUE
protected GoldStandard goldStandard
protected long truePositives
protected long falsePositives
protected long truePositivesByComparison
protected long falsePositivesByComparison
protected long trueNegativesByComparison
protected long falseNegativesByComparison
protected long pairCount
protected long actualComparisonCount
protected long startTime
protected Date startDate
protected long endTime
protected Date endDate
protected Algorithm algorithm
protected boolean checkMemory
protected transient MemoryCheckerTask task
protected transient Timer timer
protected long memoryCheckFrequency
Constructor Detail |
---|
public StatisticComponent(Algorithm algorithm)
Initializes a StatisticComponent with no gold standard.
Parameters:
algorithm - The used algorithm.

public StatisticComponent(GoldStandard goldStandard, Algorithm algorithm)
Initializes a StatisticComponent using the passed DuDeObjectPairs as real duplicates.
Parameters:
goldStandard - The gold standard which these statistics are based on.
algorithm - The used algorithm.

protected StatisticComponent()
Internal constructor for Jsonable deserialization.
Method Detail |
---|
public boolean hasGMD()
Checks whether this StatisticComponent calculates the Generalized Merge Distance.

public boolean goldStandardSet()
Checks whether a gold standard was passed.
Returns:
true, if a gold standard was set; otherwise false.

public long getTruePositives()
Returns the true positives count.
Returns:
The true positives count.

public long getFalsePositives()
Returns the false positives count.
Returns:
The false positives count.

public long getTrueNegatives()
Returns the true negatives count.
Returns:
The true negatives count.

public long getFalseNegatives()
Returns the false negatives count.
Returns:
The false negatives count.

public long getTruePositivesByComparison()
Returns the count of true positives that are explicitly classified by the comparator.
Returns:
The count of true positives that are explicitly classified by the comparator.

public long getFalsePositivesByComparison()
Returns the count of false positives that are explicitly classified by the comparator.
Returns:
The count of false positives that are explicitly classified by the comparator.

public long getTrueNegativesByComparison()
Returns the count of true negatives that are explicitly classified by the comparator.
Returns:
The count of true negatives that are explicitly classified by the comparator.

public long getFalseNegativesByComparison()
Returns the count of false negatives that are explicitly classified by the comparator.
Returns:
The count of false negatives that are explicitly classified by the comparator.

public long getPairCount()
Returns the number of pairs that were already considered.

public long getComparisonCount()
Returns the number of pairs that were already compared.
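The counters above can be read as the cells of a confusion matrix over reported pairs. As a rough illustration only, the sketch below tallies reported pairs against a set of known duplicates. Everything in it is hypothetical (the class `PairTally`, string record ids, the `|`-joined pair key), and deriving false negatives as gold standard size minus true positives is an assumption, not a statement about how StatisticComponent computes them.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative tally, not DuDe code: classifies reported pairs against a gold standard. */
final class PairTally {
    private final Set<String> gold = new HashSet<>();
    long truePositives;
    long falsePositives;
    long pairCount;
    long comparisonCount;

    /** Order-independent key so (a, b) and (b, a) denote the same pair; assumes ids contain no '|'. */
    static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    void addGoldPair(String a, String b) {
        gold.add(key(a, b));
    }

    boolean isDuplicate(String a, String b) {
        return gold.contains(key(a, b));
    }

    /**
     * Records one pair. 'positive' means the pair was reported as a duplicate;
     * 'actualComparison' means a comparator really compared the two records.
     * A pair reported twice is counted twice in this simple sketch.
     */
    void addPair(String a, String b, boolean positive, boolean actualComparison) {
        pairCount++;
        if (actualComparison) {
            comparisonCount++;
        }
        if (positive) {
            if (isDuplicate(a, b)) {
                truePositives++;
            } else {
                falsePositives++;
            }
        }
    }

    /** Assumed definition: real duplicates that were never reported. */
    long falseNegatives() {
        return gold.size() - truePositives;
    }
}
```

The `actualComparison` flag mirrors the two-argument addDuplicate and addNonDuplicate overloads, which let a caller record a pair without counting it as a comparison.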
public void addPair(DuDeObjectPair pair, boolean positive)
Adds a DuDeObjectPair to the knowledge base. The pair is counted as a comparison.
Parameters:
pair - The pair that shall be considered in the statistics.
positive - true, if the passed pair was detected as a duplicate; otherwise false.

public void addPair(Iterable<DuDeObjectPair> pairs, boolean positive)
Adds several DuDeObjectPairs to the knowledge base. The pairs are counted as comparisons.
Parameters:
pairs - The pairs that shall be considered in the statistics.
positive - true, if the passed pairs were detected as duplicates; otherwise false.

public void addDuplicate(DuDeObjectPair pair)
Adds a DuDeObjectPair to the knowledge base that is labeled as a detected duplicate. The pair is counted as a comparison.
Parameters:
pair - A detected duplicate.

public void addDuplicate(DuDeObjectPair pair, boolean actualComparison)
Adds a DuDeObjectPair to the knowledge base that is labeled as a detected duplicate.
Parameters:
pair - A detected duplicate.
actualComparison - true, if the pair should be counted as a comparison; otherwise false.

public void addDuplicate(Iterable<DuDeObjectPair> pairs)
Adds several DuDeObjectPairs to the knowledge base that are labeled as detected duplicates. The pairs are counted as comparisons.
Parameters:
pairs - The pairs that shall be considered as detected duplicates in the statistics.

public void addNonDuplicate(DuDeObjectPair pair)
Adds a DuDeObjectPair to the knowledge base that is labeled as a detected non-duplicate. The pair is counted as a comparison.
Parameters:
pair - The pair that shall be considered as a detected non-duplicate in the statistics.

public void addNonDuplicate(DuDeObjectPair pair, boolean actualComparison)
Adds a DuDeObjectPair to the knowledge base that is labeled as a detected non-duplicate.
Parameters:
pair - The pair that shall be considered as a detected non-duplicate in the statistics.
actualComparison - true, if the pair is an actual comparison; otherwise false.

public void addNonDuplicate(Iterable<DuDeObjectPair> pairs)
Adds several DuDeObjectPairs to the knowledge base that are labeled as detected non-duplicates. The pairs are counted as comparisons.
Parameters:
pairs - The pairs that shall be considered as detected non-duplicates in the statistics.

@Deprecated public void setBeginningTime()
Deprecated. Replaced by setStartTime().
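public void setStartTime()
Sets the current time as starting time for the runtime and initiates memory monitoring.

public Date getStartDate()
Gets the date of the specified start time of an algorithm.
Returns:
The start date as a Date object.

@Deprecated public void setFinishingTime()
Deprecated. Replaced by setEndTime().

public void setEndTime()
Sets the current time as finishing time for the runtime.

public Date getEndDate()
Gets the date of the specified end time of an algorithm.
Returns:
The end date as a Date object.

public long getRuntime()
Gets the time difference between beginning time and finishing time.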
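The start/end/runtime methods above amount to a simple stopwatch. A minimal sketch of that pattern follows; it is not the DuDe source. The class `RunTimer` and the injectable clock are hypothetical, and treating the runtime as milliseconds is an assumption, since this page does not state the unit.

```java
import java.util.Date;
import java.util.function.LongSupplier;

/** Illustrative stopwatch, not DuDe code. */
final class RunTimer {
    private final LongSupplier clock; // supplies "now" in milliseconds
    private long startTime;
    private long endTime;
    private Date startDate;
    private Date endDate;

    RunTimer(LongSupplier clock) {
        this.clock = clock;
    }

    void setStartTime() {
        startTime = clock.getAsLong();
        startDate = new Date(startTime);
    }

    void setEndTime() {
        endTime = clock.getAsLong();
        endDate = new Date(endTime);
    }

    /** Difference between end and start, in the clock's unit (milliseconds here). */
    long getRuntime() {
        return endTime - startTime;
    }

    Date getStartDate() {
        return startDate;
    }

    Date getEndDate() {
        return endDate;
    }

    public static void main(String[] args) throws InterruptedException {
        RunTimer timer = new RunTimer(System::currentTimeMillis);
        timer.setStartTime();
        Thread.sleep(20L); // stands in for running the algorithm
        timer.setEndTime();
        System.out.println("runtime in ms: " + timer.getRuntime());
    }
}
```

Passing the clock in as a `LongSupplier` is a design choice of this sketch only; it makes the timing logic testable with fixed timestamps.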
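public double getPrecision()
Returns the precision based on the current knowledge base.

public double getRecall()
Returns the recall based on the current knowledge base.

public double getFMeasure()
Returns the f-measure based on the current knowledge base.

public double getReductionRatio()
Returns the reduction ratio based on the current knowledge base.

public double getPrecisionByComparison()
Returns the precision based on the current knowledge base and the actual comparisons.

public double getRecallByComparison()
Returns the recall based on the current knowledge base and the actual comparisons.

public double getFMeasureByComparison()
Returns the f-measure based on the current knowledge base and the actual comparisons.

public double getReductionRatioByComparison()
Returns the reduction ratio based on the current knowledge base and the actual comparisons.

public long getNumberOfRealDuplicates()
Returns the size of the gold standard.

public long getObjectCount()
Returns the number of records that were processed by the algorithm.

public long getNumberOfCandidateComparisons()
Returns the maximum number of pairs that would be generated by the naive approach.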
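To make the relationship between the object count, the naive number of candidate comparisons and the reduction ratio concrete, here is a small sketch under stated assumptions. It assumes a single data source, so that the naive approach compares every unordered pair, n(n - 1)/2, and it uses the common definition of reduction ratio as 1 minus the share of naive comparisons actually performed. The class `ReductionRatio` is hypothetical, and StatisticComponent may compute these values differently (for example with several sources).

```java
/** Illustrative helper, not DuDe code. */
final class ReductionRatio {
    private ReductionRatio() {}

    /** Number of unordered pairs among n records: n * (n - 1) / 2. */
    static long naivePairCount(long n) {
        return n < 2 ? 0 : n * (n - 1) / 2;
    }

    /** 1 - performed / naive; assumed to be 0.0 when there is nothing to compare. */
    static double reductionRatio(long performedComparisons, long n) {
        long naive = naivePairCount(n);
        return naive == 0 ? 0.0 : 1.0 - (double) performedComparisons / naive;
    }

    public static void main(String[] args) {
        long records = 1000;
        long performed = 49_950; // one tenth of the 499,500 naive pairs
        System.out.println("naive pairs: " + naivePairCount(records));
        System.out.println("reduction ratio: " + reductionRatio(performed, records));
    }
}
```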
public boolean isDuplicate(DuDeObjectPair pair)
Returns true if the DuDeObjectPair exists in the set of real duplicate pairs.
Parameters:
pair - The pair that is to be checked.
Returns:
true, if the pair exists in the set of real duplicate pairs.

public boolean isNonDuplicate(DuDeObjectPair pair)
Checks whether a specific pair is absent from the set of real duplicate pairs.
Parameters:
pair - The pair that is to be checked.
Returns:
false, if the pair exists in the set of real duplicate pairs.

public long getMemoryCheckFrequency()
Gets the frequency of memory checks.

public void setMemoryCheckFrequency(long memoryCheckFrequency)
Sets the frequency of memory checks.
Parameters:
memoryCheckFrequency - Frequency of memory checks in ms. Default value is 5000 ms.

public boolean isCheckMemory()
Gets the boolean flag that indicates the activation status of memory checking.

public void setCheckMemory(boolean checkMemory)
Sets the boolean flag that indicates the activation status of memory checking.
Parameters:
checkMemory - Set to false if memory checking should not be performed. Default value is true.

protected void checkMemoryUsage()
Starts the memory usage task.
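public String getMaximumMemoryUsed()
Gets the registered maximum amount of memory used during the experiment.

public String getMinimumMemoryUsed()
Gets the registered minimum amount of memory used during the experiment.

public String getAverageMemoryUsed()
Gets the registered average amount of memory used during the experiment.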
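The memory-related members above (a Timer, a MemoryCheckerTask, a check frequency in ms, and minimum/maximum/average getters) suggest periodic sampling of heap usage. The sketch below shows one way such sampling could look with `java.util.Timer`; it is an assumption-laden illustration, not the actual MemoryCheckerTask. The class `MemorySampler` is hypothetical, and it reports raw byte counts as `long` whereas the getters on this page return String.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Illustrative sampler, not DuDe code: records min, max and average used heap. */
final class MemorySampler extends TimerTask {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum;
    private long samples;

    /** Takes one sample of currently used heap memory in bytes. */
    @Override
    public synchronized void run() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        min = Math.min(min, used);
        max = Math.max(max, used);
        sum += used;
        samples++;
    }

    synchronized long getMinimum() {
        return samples == 0 ? 0 : min;
    }

    synchronized long getMaximum() {
        return samples == 0 ? 0 : max;
    }

    synchronized long getAverage() {
        return samples == 0 ? 0 : sum / samples;
    }

    public static void main(String[] args) throws InterruptedException {
        MemorySampler sampler = new MemorySampler();
        Timer timer = new Timer(true);     // daemon thread, does not block JVM exit
        timer.schedule(sampler, 0L, 50L);  // sample every 50 ms (the page's default is 5000 ms)
        Thread.sleep(200L);                // stands in for running the algorithm
        timer.cancel();
        System.out.println("min=" + sampler.getMinimum()
                + " max=" + sampler.getMaximum()
                + " avg=" + sampler.getAverage());
    }
}
```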
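
public void reset()
Sets truePositives, falsePositives, truePositivesByComparison, falsePositivesByComparison, trueNegativesByComparison, falseNegativesByComparison, pairCount and actualComparisonCount to 0.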