de.hpi.fgis.dude.algorithm.duplicatedetection
Class DuplicateCountSNM

java.lang.Object
  extended by de.hpi.fgis.dude.util.AbstractCleanable
      extended by de.hpi.fgis.dude.algorithm.AbstractAlgorithm
          extended by de.hpi.fgis.dude.algorithm.AbstractDuplicateDetection
              extended by de.hpi.fgis.dude.algorithm.SortingDuplicateDetection
                  extended by de.hpi.fgis.dude.algorithm.duplicatedetection.DuplicateCountSNM
All Implemented Interfaces:
Algorithm, Cleanable, AutoJsonable, Iterable<DuDeObjectPair>

public class DuplicateCountSNM
extends SortingDuplicateDetection

AdaptiveWindowSizeSNM implements the Adaptive-Window-Size Sorted-Neighborhood Method that was introduced by Oliver Wonneberg.

Author:
Fabian Lindenberg

Nested Class Summary
static class DuplicateCountSNM.AdaptionMode
          This enumeration collects all the modes which can be used.
static class DuplicateCountSNM.AdaptiveWindowSizeSNMBuilder
          The AdaptiveWindowSizeSNM.AdaptiveWindowSizeSNMBuilder maintains the adaptable window size of the AdaptiveWindowSizeSNM.
protected  class DuplicateCountSNM.AdaptiveWindowSizeSNMIterator
          AdaptiveWindowSizeSNMIterator implements the behavior of the Adaptive-Window-Size SNM algorithm.
static class DuplicateCountSNM.ComparisonResult
          The comparison of a DuDeObjectPair can yield either DUPLICATE or NON_DUPLICATE.
 
Nested classes/interfaces inherited from class de.hpi.fgis.dude.algorithm.AbstractAlgorithm
AbstractAlgorithm.AlgorithmIteratorWrapper
 
Constructor Summary
protected DuplicateCountSNM()
          For serialization.
 
Method Summary
protected  Iterator<DuDeObjectPair> createIteratorInstance()
          Returns a new Iterator instance.
protected  float getAbortThreshold()
          Returns the abort threshold.
protected  DuplicateCountSNM.AdaptionMode getAdaptionMode()
          Returns the set adaptation mode.
protected  float getIncreaseFactor()
          Returns the set increase factor.
protected  float getIncreaseThreshold()
          Returns the threshold for increasing the window size.
protected  DuplicateCountSNM.ComparisonResult getNotification()
          Returns the category that was set for the last processed pair.
protected  int getWindowSize()
          Returns the current window size.
protected  boolean isAbortIncrease()
          Checks whether aborting the increase is enabled.
 void notifyOfLatestComparisonResult(DuplicateCountSNM.ComparisonResult comparisonResult)
          Notifies the algorithm whether the latest object pair has been categorized as a duplicate or a non-duplicate.
protected  void resetNotification()
          Resets the last notification.
 void setIncreaseFactor(float increaseFactor)
          Sets the increase factor.
 void setIncreaseThreshold(float increaseThreshold)
          Sets the increase threshold.
 void setWindowSize(int windowSize)
          Sets the window size.
 void unlockInstance()
          Unlocks the instance.
 
Methods inherited from class de.hpi.fgis.dude.algorithm.SortingDuplicateDetection
getSortingKey, preprocessData, setSortingKey
 
Methods inherited from class de.hpi.fgis.dude.algorithm.AbstractDuplicateDetection
addSource, dataSourceAttached, equals, getData, getDataSize, getMaximumPairCount, hashCode, iterator, unregisterDataSources
 
Methods inherited from class de.hpi.fgis.dude.algorithm.AbstractAlgorithm
addDataSource, addPreprocessor, addPreprocessor, analyzeDuDeObject, createStorage, dataExtracted, disableInMemoryProcessing, enableInMemoryProcessing, finishExtraction, finishPreprocessing, forceExtraction, getDataSize, getExtractedData, inMemoryProcessingEnabled
 
Methods inherited from class de.hpi.fgis.dude.util.AbstractCleanable
cleanUp, registerCleanable, registerCloseable
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.hpi.fgis.dude.util.Cleanable
cleanUp, registerCleanable, registerCloseable
 

Constructor Detail

DuplicateCountSNM

protected DuplicateCountSNM()
For serialization.

Method Detail

unlockInstance

public void unlockInstance()
Unlocks the instance. The instance is locked when an iteration process starts and is unlocked automatically once that process finishes (i.e. after the last element has been returned). Calling any setter method while the instance is locked throws a ConcurrentModificationException. To change a parameter before the iteration process has finished, call this method first.
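
The locking contract described above can be illustrated with a small stand-alone sketch. LockableWindow is a hypothetical stand-in, not the DuDe class: its name, its fields, and the way it detects the last element are assumptions made only to show the lock, fail, unlock, retry sequence.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in (NOT the DuDe implementation): it only mimics the
// documented contract that setters fail while an iteration is in progress.
class LockableWindow implements Iterable<Integer> {
    private final List<Integer> data;
    private int windowSize = 2;
    private boolean locked = false;

    LockableWindow(List<Integer> data) {
        this.data = data;
    }

    public void setWindowSize(int windowSize) {
        if (locked) {
            throw new ConcurrentModificationException("iteration in progress");
        }
        this.windowSize = windowSize;
    }

    public int getWindowSize() {
        return windowSize;
    }

    // Explicit unlock for callers that abandon an iteration early.
    public void unlockInstance() {
        locked = false;
    }

    @Override
    public Iterator<Integer> iterator() {
        final Iterator<Integer> inner = data.iterator();
        // Lock only if there is something to iterate over.
        locked = inner.hasNext();
        return new Iterator<Integer>() {
            @Override
            public boolean hasNext() {
                return inner.hasNext();
            }

            @Override
            public Integer next() {
                Integer value = inner.next();
                if (!inner.hasNext()) {
                    locked = false; // last element returned: unlock automatically
                }
                return value;
            }
        };
    }
}
```

In this sketch, calling setWindowSize after the first next() throws a ConcurrentModificationException, while the same call succeeds after unlockInstance() or after the iterator has returned its last element.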


createIteratorInstance

protected Iterator<DuDeObjectPair> createIteratorInstance()
Description copied from class: AbstractDuplicateDetection
Returns a new Iterator instance.

Specified by:
createIteratorInstance in class SortingDuplicateDetection
Returns:
The Iterator instance.

notifyOfLatestComparisonResult

public void notifyOfLatestComparisonResult(DuplicateCountSNM.ComparisonResult comparisonResult)
Notifies the algorithm whether the latest object pair has been categorized as a duplicate or a non-duplicate.

Parameters:
comparisonResult - The category.
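
The feedback loop implied by this method (the caller compares a pair, then reports the outcome) might look roughly like the following stand-alone sketch. NotificationSketch, consumeNotification, countDuplicates, and the case-insensitive string comparison are all hypothetical and are not part of DuDe. How the real algorithm uses the notification internally is not documented here, so the duplicate counter below is an assumption.

```java
import java.util.List;

// Hypothetical stand-in for illustration only; not the DuDe implementation.
class NotificationSketch {
    enum ComparisonResult { DUPLICATE, NON_DUPLICATE }

    // null means "no notification pending" (an assumption of this sketch).
    private ComparisonResult notification;
    private int duplicateCount = 0;

    // Caller side: report how the latest pair was categorized.
    public void notifyOfLatestComparisonResult(ComparisonResult comparisonResult) {
        this.notification = comparisonResult;
    }

    // Algorithm side: read the pending notification, then reset it.
    void consumeNotification() {
        if (notification == ComparisonResult.DUPLICATE) {
            duplicateCount++;
        }
        notification = null;
    }

    int getDuplicateCount() {
        return duplicateCount;
    }

    // Drives the loop: compare each pair, notify, and let the sketch consume it.
    static int countDuplicates(List<String[]> pairs) {
        NotificationSketch algorithm = new NotificationSketch();
        for (String[] pair : pairs) {
            // Toy similarity check standing in for a real comparator.
            boolean same = pair[0].equalsIgnoreCase(pair[1]);
            algorithm.notifyOfLatestComparisonResult(
                    same ? ComparisonResult.DUPLICATE : ComparisonResult.NON_DUPLICATE);
            algorithm.consumeNotification();
        }
        return algorithm.getDuplicateCount();
    }
}
```

The point of the sketch is the ordering: each comparison result is reported before the next pair is requested, so the receiving side always sees at most one pending notification.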

resetNotification

protected void resetNotification()
Resets the last notification.


getNotification

protected DuplicateCountSNM.ComparisonResult getNotification()
Returns the category that was set for the last processed pair.

Returns:
The category of the last processed pair.

getAdaptionMode

protected DuplicateCountSNM.AdaptionMode getAdaptionMode()
Returns the set adaptation mode.

Returns:
The adaptation mode.

getWindowSize

protected int getWindowSize()
Returns the current window size.

Returns:
The current window size.

getIncreaseThreshold

protected float getIncreaseThreshold()
Returns the threshold for increasing the window size.

Returns:
The threshold.

getIncreaseFactor

protected float getIncreaseFactor()
Returns the set increase factor.

Returns:
The increase factor.

isAbortIncrease

protected boolean isAbortIncrease()
Checks whether aborting the increase is enabled.

Returns:
true, if aborting is enabled; otherwise false.

getAbortThreshold

protected float getAbortThreshold()
Returns the abort threshold.

Returns:
The threshold for aborting the increase.

setWindowSize

public void setWindowSize(int windowSize)
Sets the window size.

Parameters:
windowSize - The new window size.
Throws:
ConcurrentModificationException - If this method is called before a running iteration process has finished.

setIncreaseThreshold

public void setIncreaseThreshold(float increaseThreshold)
Sets the increase threshold.

Parameters:
increaseThreshold - The new increase threshold.
Throws:
ConcurrentModificationException - If this method is called before a running iteration process has finished.

setIncreaseFactor

public void setIncreaseFactor(float increaseFactor)
Sets the increase factor.

Parameters:
increaseFactor - The new increase factor.
Throws:
ConcurrentModificationException - If this method is called before a running iteration process has finished.


Copyright © 2011 Hasso Plattner Institute - Chair of Information Systems. All Rights Reserved.