de.hpi.fgis.dude.algorithm.duplicatedetection
Class AdaptiveSNM_Yan2007.YanIterator

java.lang.Object
  extended by de.hpi.fgis.dude.util.AbstractIterator<DuDeObjectPair>
      extended by de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.YanIterator
All Implemented Interfaces:
Iterator<DuDeObjectPair>
Direct Known Subclasses:
AdaptiveSNM_Yan2007.AA_SNM_Iterator, AdaptiveSNM_Yan2007.IA_SNM_Iterator
Enclosing class:
AdaptiveSNM_Yan2007

protected abstract class AdaptiveSNM_Yan2007.YanIterator
extends AbstractIterator<DuDeObjectPair>

Abstract Iterator implementation that is used by the different adaptive Sorted Neighborhood methods.

Author:
Uwe Draisbach

Field Summary
protected  Queue<DuDeObject> blockQueue
          Queue that contains the records of the current block
protected  Iterator<DuDeObject> blockQueueIterator
          Iterator for the records of the current block
protected  uk.ac.shef.wit.simmetrics.similaritymetrics.InterfaceStringMetric comparator
          comparator used to calculate the distance between two sorting keys
protected  DuDeObject currentRec
          current record that is used to create record pairs with all other records in the block
protected  Iterator<DuDeObject> dataIterator
          Iterator for the extracted records from the data sources
protected  float phi
          the distance threshold
protected  Queue<DuDeObject> recordQueue
          Queue of records already extracted from the dataIterator that have not yet been definitively assigned to a block.
protected  Iterator<DuDeObject> recordQueueIterator
          Iterator for the recordQueue
protected  SortingKey sortingKey_iterator
          Sorting key for the records
 
Constructor Summary
AdaptiveSNM_Yan2007.YanIterator(SortingKey sortingKey, float phi, Iterator<DuDeObject> dataIterator)
          Creates a YanIterator from a sorting key, a distance threshold, and a data iterator.
 
Method Summary
protected  float getKeyDistance(DuDeObject first, DuDeObject second)
          Calculates the sorting key distance of two DuDeObjects.
protected  void getNextBlock()
          Calculates the elements of the next block.
protected  DuDeObject getNextRecord()
          Returns the next object from the record queue.
protected abstract  int getNumRecordsOfBlock()
          Calculates the number of records within the next block.
protected  DuDeObjectPair loadNextElement()
          Returns the element of the next iteration step.
protected  boolean nextBlockExists()
          Checks whether a next block exists.
protected  boolean nextRecordExists()
          Checks whether the record queue or the data source has a next element.
 
Methods inherited from class de.hpi.fgis.dude.util.AbstractIterator
hasNext, next, remove
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

phi

protected float phi
the distance threshold


dataIterator

protected Iterator<DuDeObject> dataIterator
Iterator for the extracted records from the data sources


recordQueue

protected Queue<DuDeObject> recordQueue
Queue of records already extracted from the dataIterator that have not yet been definitively assigned to a block.


recordQueueIterator

protected Iterator<DuDeObject> recordQueueIterator
Iterator for the recordQueue


blockQueue

protected Queue<DuDeObject> blockQueue
Queue that contains the records of the current block


blockQueueIterator

protected Iterator<DuDeObject> blockQueueIterator
Iterator for the records of the current block


currentRec

protected DuDeObject currentRec
current record that is used to create record pairs with all other records in the block


sortingKey_iterator

protected SortingKey sortingKey_iterator
Sorting key for the records


comparator

protected uk.ac.shef.wit.simmetrics.similaritymetrics.InterfaceStringMetric comparator
comparator used to calculate the distance between two sorting keys

Constructor Detail

AdaptiveSNM_Yan2007.YanIterator

public AdaptiveSNM_Yan2007.YanIterator(SortingKey sortingKey,
                                       float phi,
                                       Iterator<DuDeObject> dataIterator)
Creates a YanIterator from a sorting key, a distance threshold, and a data iterator.

Parameters:
sortingKey - The sorting key used to sort the records.
phi - The threshold used to determine the boundary pairs.
dataIterator - Iterator over the data to be processed.

Method Detail

loadNextElement

protected DuDeObjectPair loadNextElement()
Description copied from class: AbstractIterator
Returns the element of the next iteration step. This method needs to be implemented by each sub-class.

Specified by:
loadNextElement in class AbstractIterator<DuDeObjectPair>
Returns:
The next element.
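
The field summary describes currentRec as the record that is paired with all other records in the current block. The following is a minimal sketch of that pairing idea only, not the DuDe implementation: the class BlockPairsSketch and the method pairsOfBlock are hypothetical names, a plain two-element list stands in for DuDeObjectPair, and the pairs are built eagerly here, whereas an iterator would presumably produce them one at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DuDe: enumerates the record pairs of one block.
class BlockPairsSketch {

    // Returns every unordered pair of records within the block.
    // The outer loop index plays the role of the "current record".
    static <T> List<List<T>> pairsOfBlock(List<T> block) {
        List<List<T>> pairs = new ArrayList<>();
        for (int i = 0; i < block.size(); i++) {
            for (int j = i + 1; j < block.size(); j++) {
                List<T> pair = new ArrayList<>();
                pair.add(block.get(i)); // current record
                pair.add(block.get(j)); // a later record of the same block
                pairs.add(pair);
            }
        }
        return pairs;
    }
}
```

A block of n records yields n * (n - 1) / 2 pairs under this scheme.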

nextRecordExists

protected boolean nextRecordExists()
Checks whether the record queue or the data source has a next element.

Returns:
true if a next record exists, false otherwise.

nextBlockExists

protected boolean nextBlockExists()
Checks whether a next block exists.

Returns:
true if a next block exists, false otherwise.

getNextBlock

protected void getNextBlock()
Calculates the elements of the next block.


getNextRecord

protected DuDeObject getNextRecord()
Returns the next object from the record queue. If the queue contains no new elements, the next element is extracted from the data source instead.

Returns:
The next DuDeObject in the sorting order.
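
The queue-first, source-second lookup described for getNextRecord and nextRecordExists can be sketched as follows. This is a simplified illustration under stated assumptions, not the DuDe code: BufferedRecordSource and its buffer method are hypothetical, and the sketch removes records from the queue when they are read, whereas the actual class exposes a separate recordQueueIterator.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Queue;

// Hypothetical stand-in, not the DuDe class: serves buffered records first
// and falls back to the underlying data source once the buffer is empty.
class BufferedRecordSource<T> {
    private final Queue<T> recordQueue = new ArrayDeque<>();
    private final Iterator<T> dataIterator;

    BufferedRecordSource(Iterator<T> dataIterator) {
        this.dataIterator = dataIterator;
    }

    // Stores a record that was read ahead but not yet assigned to a block.
    void buffer(T record) {
        recordQueue.add(record);
    }

    // True if either the buffer or the data source can deliver a record.
    boolean nextRecordExists() {
        return !recordQueue.isEmpty() || dataIterator.hasNext();
    }

    // Prefers the buffer; only touches the data source when the buffer is empty.
    T getNextRecord() {
        if (!recordQueue.isEmpty()) {
            return recordQueue.poll();
        }
        if (dataIterator.hasNext()) {
            return dataIterator.next();
        }
        throw new NoSuchElementException("no further records");
    }
}
```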

getKeyDistance

protected float getKeyDistance(DuDeObject first,
                               DuDeObject second)
Calculates the sorting key distance of two DuDeObjects.

Parameters:
first - The first DuDeObject.
second - The second DuDeObject.
Returns:
the sorting key distance
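
One plausible way to turn a string similarity metric into a key distance is to take the complement of the similarity. The sketch below assumes exactly that; the page does not state how getKeyDistance derives its result from the comparator. SimpleStringMetric, LevenshteinSimilarity, and KeyDistanceSketch are hypothetical stand-ins so the example compiles without SimMetrics or DuDe on the classpath, and the sorting keys are passed in as plain strings.

```java
// Hypothetical stand-in for a string similarity metric with values in [0, 1].
interface SimpleStringMetric {
    float getSimilarity(String a, String b);
}

// Normalized Levenshtein similarity: 1 - editDistance / max(length).
class LevenshteinSimilarity implements SimpleStringMetric {
    @Override
    public float getSimilarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0f; // two empty strings are identical
        }
        return 1.0f - (float) editDistance(a, b) / maxLen;
    }

    // Classic two-row dynamic programming edit distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}

class KeyDistanceSketch {
    // Assumption: distance is the complement of the similarity.
    static float keyDistance(SimpleStringMetric metric, String firstKey, String secondKey) {
        return 1.0f - metric.getSimilarity(firstKey, secondKey);
    }
}
```

With this convention, identical keys have distance 0 and the distance grows toward 1 as the keys diverge, which makes it directly comparable with a threshold such as phi.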

getNumRecordsOfBlock

protected abstract int getNumRecordsOfBlock()
Calculates the number of records within the next block.

Returns:
number of records in the next block
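
Because getNumRecordsOfBlock is abstract, each subclass decides how far a block extends. The following sketch shows one simplified strategy: a linear scan that keeps adding records while their key distance to the first record of the block stays within phi. It is an illustration only and should not be read as the logic of AA_SNM_Iterator or IA_SNM_Iterator; BlockSizeSketch, KeyDistanceFn, and numRecordsOfBlock are hypothetical names, and the records are represented by their sorting keys.

```java
import java.util.List;

// Hypothetical distance function over sorting keys.
interface KeyDistanceFn {
    float apply(String firstKey, String secondKey);
}

class BlockSizeSketch {

    // Counts the consecutive keys, starting at 'start', whose distance
    // to the first key of the block does not exceed phi.
    static int numRecordsOfBlock(List<String> sortedKeys, int start, float phi, KeyDistanceFn distance) {
        if (start < 0 || start >= sortedKeys.size()) {
            return 0; // no record left to open a block
        }
        int count = 1; // the first record always belongs to its own block
        while (start + count < sortedKeys.size()
                && distance.apply(sortedKeys.get(start), sortedKeys.get(start + count)) <= phi) {
            count++;
        }
        return count;
    }
}
```

A smaller phi produces more, smaller blocks and therefore fewer comparisons; a larger phi does the opposite.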


Copyright © 2011 Hasso Plattner Institute - Chair of Information Systems. All Rights Reserved.