de.hpi.fgis.dude.algorithm
Class AbstractAlgorithm

java.lang.Object
  extended by de.hpi.fgis.dude.util.AbstractCleanable
      extended by de.hpi.fgis.dude.algorithm.AbstractAlgorithm
All Implemented Interfaces:
Algorithm, Cleanable, AutoJsonable, Iterable<DuDeObjectPair>
Direct Known Subclasses:
AbstractDuplicateDetection, AbstractRecordLinkage

public abstract class AbstractAlgorithm
extends AbstractCleanable
implements Algorithm

AbstractAlgorithm implements the functionality that is shared by all algorithm types.

Author:
Matthias Pohl

Nested Class Summary
protected static class AbstractAlgorithm.AlgorithmIteratorWrapper
          AlgorithmIteratorWrapper is used for setting some common properties of the generated DuDeObjectPairs.
 
Constructor Summary
AbstractAlgorithm()
           
 
Method Summary
 void addDataSource(DataSource source)
          Adds a DataSource to the algorithm.
 void addPreprocessor(DataSource source, Preprocessor preprocessor)
          Adds a Preprocessor for a specific DataSource to this algorithm.
 void addPreprocessor(Preprocessor preprocessor)
          Adds a default Preprocessor to this algorithm.
protected abstract  void addSource(DataSource source)
          Adds the DataSource to this instance.
protected  void analyzeDuDeObject(DuDeObject object)
          Initiates the preprocessing for the passed DuDeObject.
protected  DuDeStorage<DuDeObject> createStorage(String name)
          Creates the DuDeStorage instance based on the in-memory-processing flag.
protected  boolean dataExtracted()
          Checks whether the data extraction was already done.
protected abstract  boolean dataSourceAttached(DataSource source)
          Checks whether the passed DataSource is attached to this AbstractAlgorithm instance.
 void disableInMemoryProcessing()
          Disables in-memory processing.
 void enableInMemoryProcessing()
          Enables in-memory processing.
 boolean equals(Object obj)
           
protected  void finishExtraction()
          Sets a flag which indicates that the extraction process is finished.
protected  void finishPreprocessing()
          Executes the Preprocessor.finish() method of each added Preprocessor.
 void forceExtraction()
          Forces a new initialization phase before returning the next Iterator.
abstract  int getDataSize()
          Returns the overall data size after the extraction process is finished.
 int getDataSize(DataSource source)
          Returns the data size of the passed DataSource.
 Vector<DuDeObject> getExtractedData()
           
 int hashCode()
           
 boolean inMemoryProcessingEnabled()
          Checks whether in-memory processing is enabled.
abstract  Iterator<DuDeObjectPair> iterator()
          Starts the extraction and preprocessing phase if necessary and returns an Iterator instance for iterating over the algorithm's result.
 
Methods inherited from class de.hpi.fgis.dude.util.AbstractCleanable
cleanUp, registerCleanable, registerCloseable
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.hpi.fgis.dude.algorithm.Algorithm
getMaximumPairCount, unregisterDataSources
 
Methods inherited from interface de.hpi.fgis.dude.util.Cleanable
cleanUp, registerCleanable, registerCloseable
 

Constructor Detail

AbstractAlgorithm

public AbstractAlgorithm()
Method Detail

addPreprocessor

public void addPreprocessor(Preprocessor preprocessor)
Description copied from interface: Algorithm
Adds a default Preprocessor to this algorithm. This Preprocessor processes the data of all DataSources.

Specified by:
addPreprocessor in interface Algorithm
Parameters:
preprocessor - The Preprocessor that shall be added. Passing null has no effect.

addPreprocessor

public void addPreprocessor(DataSource source,
                            Preprocessor preprocessor)
Description copied from interface: Algorithm
Adds a Preprocessor for a specific DataSource to this algorithm. Only data from the passed DataSource will be processed by this Preprocessor. Passing null instead of a Preprocessor instance has no effect.

Specified by:
addPreprocessor in interface Algorithm
Parameters:
source - The corresponding DataSource. If null is passed instead of a DataSource, Algorithm.addPreprocessor(Preprocessor) is called.
preprocessor - The Preprocessor that shall be added.
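
The null handling described for the two addPreprocessor overloads can be illustrated with a small stand-alone sketch. It does not use DuDe's classes: PreprocessorRegistrySketch, its generic parameters, and preprocessorsFor are hypothetical names introduced only for this illustration. Running default preprocessors before source-specific ones is an assumption, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the registration logic; not DuDe's actual implementation.
// S plays the role of DataSource, P the role of Preprocessor.
class PreprocessorRegistrySketch<S, P> {
    private final List<P> defaults = new ArrayList<>();
    private final Map<S, List<P>> perSource = new HashMap<>();

    // Default preprocessor: applies to the data of every source.
    void addPreprocessor(P preprocessor) {
        if (preprocessor == null) {
            return; // documented behaviour: passing null has no effect
        }
        defaults.add(preprocessor);
    }

    // Source-specific preprocessor; a null source falls back to the default overload.
    void addPreprocessor(S source, P preprocessor) {
        if (preprocessor == null) {
            return; // documented behaviour: passing null has no effect
        }
        if (source == null) {
            addPreprocessor(preprocessor);
            return;
        }
        perSource.computeIfAbsent(source, key -> new ArrayList<>()).add(preprocessor);
    }

    // Preprocessors that would see an object from the given source.
    // Assumption: defaults are listed first, then the source-specific ones.
    List<P> preprocessorsFor(S source) {
        List<P> result = new ArrayList<>(defaults);
        result.addAll(perSource.getOrDefault(source, new ArrayList<>()));
        return result;
    }
}
```

In this sketch, a preprocessor registered with a null source ends up in the default list and is therefore applied to every source.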

getExtractedData

public Vector<DuDeObject> getExtractedData()
Specified by:
getExtractedData in interface Algorithm

analyzeDuDeObject

protected void analyzeDuDeObject(DuDeObject object)
Initiates the preprocessing for the passed DuDeObject.

Parameters:
object - The DuDeObject that shall be preprocessed.

finishPreprocessing

protected void finishPreprocessing()
Executes the Preprocessor.finish() method of each added Preprocessor.


enableInMemoryProcessing

public void enableInMemoryProcessing()
Description copied from interface: Algorithm
Enables in-memory processing. This property is disabled by default.

Specified by:
enableInMemoryProcessing in interface Algorithm

disableInMemoryProcessing

public void disableInMemoryProcessing()
Description copied from interface: Algorithm
Disables in-memory processing. In-memory processing is disabled by default.

Specified by:
disableInMemoryProcessing in interface Algorithm

inMemoryProcessingEnabled

public boolean inMemoryProcessingEnabled()
Description copied from interface: Algorithm
Checks whether in-memory processing is enabled.

Specified by:
inMemoryProcessingEnabled in interface Algorithm
Returns:
true if in-memory processing is enabled; otherwise false.

createStorage

protected DuDeStorage<DuDeObject> createStorage(String name)
                                         throws IOException
Creates the DuDeStorage instance based on the in-memory-processing flag.

Parameters:
name - The name of the storage. It is used only if a FileBasedStorage is instantiated.
Returns:
The DuDeStorage instance.
Throws:
IOException - If an IO error occurred while instantiating the storage.
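
How the in-memory-processing flag might drive the choice of storage can be sketched as follows. This is a simplified illustration, not DuDe's code: StorageChoiceSketch and its nested Storage interface are hypothetical, no real file is created, and the sketch therefore omits the IOException that the documented method declares.

```java
// Hypothetical stand-in showing a flag-driven storage choice; not DuDe's actual implementation.
class StorageChoiceSketch {

    // Minimal placeholder for DuDeStorage; describe() exists only for this illustration.
    interface Storage {
        String describe();
    }

    // Documented default: in-memory processing is disabled.
    private boolean inMemory = false;

    void enableInMemoryProcessing() {
        inMemory = true;
    }

    void disableInMemoryProcessing() {
        inMemory = false;
    }

    boolean inMemoryProcessingEnabled() {
        return inMemory;
    }

    // Returns an in-memory storage if the flag is set, otherwise a file-based one.
    // Assumption: the name is only relevant for the file-based variant.
    Storage createStorage(String name) {
        if (inMemory) {
            return () -> "in-memory";
        }
        return () -> "file-based:" + name;
    }
}
```

Because the flag is read at creation time in this sketch, enabling in-memory processing only affects storages created afterwards.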

finishExtraction

protected void finishExtraction()
Sets a flag which indicates that the extraction process is finished.


forceExtraction

public void forceExtraction()
Forces a new initialization phase before returning the next Iterator. This initialization includes extraction and preprocessing.
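
The interplay between dataExtracted(), finishExtraction(), forceExtraction(), and iterator() can be pictured as a lazily evaluated flag. The sketch below is a minimal illustration under assumptions: LazyExtractionSketch, the String records, and the extractionRuns counter are hypothetical, and trimming stands in for whatever the real preprocessing does.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the extraction life cycle; not DuDe's actual implementation.
class LazyExtractionSketch implements Iterable<String> {
    private final List<String> source;
    private final List<String> extracted = new ArrayList<>();
    private boolean dataExtracted = false;

    // Counts how often the initialization phase ran; exists only for this illustration.
    int extractionRuns = 0;

    LazyExtractionSketch(List<String> source) {
        this.source = source;
    }

    // Resets the flag so that the next iterator() call extracts and preprocesses again.
    void forceExtraction() {
        dataExtracted = false;
    }

    boolean dataExtracted() {
        return dataExtracted;
    }

    @Override
    public Iterator<String> iterator() {
        if (!dataExtracted) {
            extracted.clear();
            for (String record : source) {
                extracted.add(record.trim()); // placeholder for preprocessing
            }
            extractionRuns++;
            dataExtracted = true; // plays the role of finishExtraction()
        }
        return extracted.iterator();
    }
}
```

In this sketch, repeated iterator() calls reuse the extracted data, and only a call to forceExtraction() triggers a new initialization phase.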


dataExtracted

protected boolean dataExtracted()
Checks whether the data extraction was already done.

Returns:
true if the data was already extracted; otherwise false.

getDataSize

public int getDataSize(DataSource source)
Description copied from interface: Algorithm
Returns the data size of the passed DataSource.

Specified by:
getDataSize in interface Algorithm
Parameters:
source - The DataSource whose size shall be returned.
Returns:
The number of extracted DuDeObjects of the passed DataSource, or 0 if the data has not been extracted yet.

getDataSize

public abstract int getDataSize()
Description copied from interface: Algorithm
Returns the overall data size after the extraction process is finished.

Specified by:
getDataSize in interface Algorithm
Returns:
The number of extracted DuDeObjects, or 0 if the data has not been extracted yet.
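
The documented return values of both getDataSize variants (0 before extraction, object counts afterwards) can be sketched as follows. DataSizeSketch, the String source identifiers, and recordExtractedObject are hypothetical. Treating the overall size as the sum of the per-source sizes is an assumption, since getDataSize() is abstract and left to the subclasses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the size bookkeeping; not DuDe's actual implementation.
class DataSizeSketch {
    private final Map<String, Integer> sizes = new HashMap<>();
    private boolean extracted = false;

    // Called once per extracted object; the source is identified by a plain String here.
    void recordExtractedObject(String sourceId) {
        sizes.merge(sourceId, 1, Integer::sum);
    }

    void finishExtraction() {
        extracted = true;
    }

    // Per-source size: 0 until the extraction is finished.
    int getDataSize(String sourceId) {
        if (!extracted) {
            return 0;
        }
        return sizes.getOrDefault(sourceId, 0);
    }

    // Overall size: assumed here to be the sum over all sources.
    int getDataSize() {
        if (!extracted) {
            return 0;
        }
        int total = 0;
        for (int count : sizes.values()) {
            total += count;
        }
        return total;
    }
}
```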

addDataSource

public void addDataSource(DataSource source)
Description copied from interface: Algorithm
Adds a DataSource to the algorithm.

Specified by:
addDataSource in interface Algorithm
Parameters:
source - The DataSource that shall be added.

dataSourceAttached

protected abstract boolean dataSourceAttached(DataSource source)
Checks whether the passed DataSource is attached to this AbstractAlgorithm instance.

Parameters:
source - The DataSource that shall be checked.
Returns:
true if the passed DataSource was added to this instance; false otherwise, including when null was passed.

addSource

protected abstract void addSource(DataSource source)
Adds the DataSource to this instance.

Parameters:
source - The DataSource that shall be added.

iterator

public abstract Iterator<DuDeObjectPair> iterator()
Starts the extraction and preprocessing phase if necessary and returns an Iterator instance for iterating over the algorithm's result.

Specified by:
iterator in interface Iterable<DuDeObjectPair>

hashCode

public int hashCode()
Overrides:
hashCode in class Object

equals

public boolean equals(Object obj)
Overrides:
equals in class Object


Copyright © 2011 Hasso Plattner Institute - Chair of Information Systems. All Rights Reserved.