SimilarityFunction
implementation checks the absolute variation of the numbers of two DuDeObject
attributes.Jsonable
deserialization.
RelativeNumberDiffFunction
.
RelativeNumberDiffFunction
.
AbstractAlgorithm
implements the functionality that is needed by each algorithm type.AlgorithmIteratorWrapper
is used for setting some common properties of the generated DuDeObjectPair
s.AbstractAlgorithm.AlgorithmIteratorWrapper
with the passed Iterator
.
AbstractCleanable
is implemented by classes that collect Closeable
instances that shall be closed at the end of a process.AbstractDataSource
provides the common functionality of all DataSource
classes.Jsonable
deserialization.
AbstractDataExtractor
with the passed identifier.
AbstractDataSourceIterator
can be used to generate valid DuDeObject
s.AbstractDataSourceIterator
for the passed AbstractDataSource
.
AbstractDuDeObjectSorter
implements the DuDeObjectSorter
interface partially.AbstractDuDeObjectSorter
with no SortingKey
.
AbstractDuDeObjectSorter
with the passed SortingKey
.
AbstractDuDeOutput
is an abstract
class which provides the common functionality of every class that implements
DuDeOutput
.DuDeOutput
with the given OutputStream
.
DuDeOutput
with the given OutputStream
.
DuDeOutput
with the given OutputStream
.
DuDeOutput
with the given File
.
DuDeOutput
with the given File
.
DuDeOutput
with the given File
.
Jsonable
deserialization.
AbstractDuDeStorage
stores Jsonable
instances.AbstractDuDeStorage.AbstractJsonableWriter
implements some functionality that shall be provided by all JsonableWriter
sub-classes.AbstractDuplicateDetection
provides the common functionality that is needed by every duplicate-detection algorithm.AbstractDuplicateDetection
instance.
AbstractIterator
is an abstract
class that should be used within all iterator-like
classes.AbstractMerger
splits the merge functionality into the merging of data,
which should be implemented in subclasses and the merging of identifiers, which is done in this class.AbstractMerger
splits the merge functionality into the merging of data,
which should be implemented in subclasses and the merging of identifiers, which is done in this class.AbstractRecordLinkage
provides the common functionality that is needed by every record-linkage algorithm.AbstractSimilarityFunction
is a skeleton implementation for providing the common functionality of a SimilarityFunction
implementation.
AbstractStatisticOutput
is an abstract
class that provides functionality common to most classes implementing
StatisticOutput
.AbstractStatisticOutput
with the passed StatisticComponent
.
Jsonable
deserialization.
AbstractSubkey
is an abstract
class that should be extended by each subkey class.AbstractSubkey
instance with the given default attribute.
AdaptiveSNM_Yan2007
instance with the passed window size.
Iterator
implementation that implements the behavior of the
Accumulatively-Adaptive Sorted-Neighborhood Method
.Iterator
implementation that implements the behavior of the
Incrementally-Adaptive Sorted-Neighborhood Method
.Iterator
implementation that is used by the different
adaptive Sorted Neighborhood methods.AdaptiveSNM_Yan2007
algorithm.NaiveTransitiveClosureGenerator
.
Collection
of pairs to the NaiveTransitiveClosureGenerator
.
WarshallClosureGenerator
.
SimilarityFunction
to this Aggregator
with no special multiplier.
SimilarityFunction
to this Aggregator
with the passed multiplier.
BibtexPerson
to this list.
JsonValue
to the end of the JsonArray
.
JsonBoolean
to the end of the JsonArray
.
JsonNumber
to the end of the JsonArray
.
JsonNumber
to the end of the JsonArray
.
JsonNumber
to the end of the JsonArray
.
JsonString
to the end of the JsonArray
.
DuDeObject
to the collection that will be sorted.
DuDeStorage
.
Iterable
to the AbstractDuDeObjectSorter
.
DuDeObject
s to this DuDeObjectSorter
.
DataSource
-related attribute to this ContentBasedSimilarityFunction
.
JsonValue
instance to an attribute of the passed JsonRecord
.
JsonArray
generated out of the passed Collection
to the end of the
JsonArray
.
DataSource
to the algorithm.
DataSource
to the algorithm.
DataSource
to this Experiment
.
DataSource
s to this Experiment
.
DuDeOutput
to this Experiment
.
DuDeOutput
s to this Experiment
.
DuDeObjectPair
to the knowledge base that is labeled as a detected duplicate and
the gold standard's duplicate pairs.
DuDeObjectPair
to the knowledge base that is labeled as a detected duplicate.
DuDeObjectPair
to the knowledge base that is labeled as a detected duplicate.
DuDeObjectPair
s to the knowledge base that are labeled as detected duplicates.
DuDeOutput
for fuzzy duplicates to this Experiment
.
DuDeOutput
s for fuzzy duplicates to this Experiment
.
DuDeObject
.
JsonRecord
generated out of the passed Map
to the end of the JsonArray
.
DuDeObjectPair
to the knowledge base that is labeled as a detected non-duplicate.
DuDeObjectPair
to the knowledge base that is labeled as a detected non-duplicate.
DuDeObjectPair
s to the knowledge base that are labeled as detected non-duplicates.
JsonNull
to the end of the JsonArray
.
DuDeOutput
s.
DuDeObjectPair
to the knowledge base.
DuDeObjectPair
s to the knowledge base.
Preprocessor
to this algorithm.
Preprocessor
for a specific DataSource
to this algorithm.
DataSource
-related SortingKey
.
DataSource
to this instance.
StatisticOutput
instance to this Experiment
.
StatisticOutput
instances to this Experiment
.
SortingKey
.
Aggregator
aggregates the similarities returned by different SimilarityFunction
s.Jsonable
deserialization.
MultiDuDeObjectComparator
with a number of sub-comparators.
Algorithm
collects all the methods that are needed by each algorithm implementation.Algorithm
was set.
DuDeObject
.
DuDeObject
to the Preprocessor
for further analysis.
ArrayConversionStrategy
generates a one-element JsonArray
with the passed JsonAtomic
value and runs the passed
ContentBasedSimilarityFunction
on both JsonArrays
.Class
to an array of BoundType
s without parameters.
Iterator
into a List
.
BoundType
.Average
returns the average similarity of all added SimilarityFunction
s.Jsonable
deserialization.
Average
instance.
BestMatchCalculationStrategy
compares a JsonArray
with a JsonAtomic
by selecting the best match.BibTex
files.BibtexAbstractEntry
.
BibtexAbstractValue
.
BibtexConcatenatedValue
using the specified values.
BibtexEntry
.
BibTex
DOM tree and the factory for any BibTex
model - the only way
to create nodes.BibTexFile
.
BibTex
lets you define macros, which are essentially just shortcuts for strings.
BibtexMacroDefinition
.
BibtexMacroReference
.
BibTex
model nodes.BibtexFile
(which in turn extends
BibtexNode
).
BibTex
into a basic AST.BibTexParser
.
BibtexPerson
objects are elements of BibtexPersonList
s, which can be used in author or editor
fields.BibtexPerson
.
BibtexPerson
objects that can be used for author or editor fields - use the PersonListExpander to
convert all editor/author field values of a particular BibtexFile
to BibtexPersonLists
.BibtexPersonList
.
BibtexPreamble
.
BibtexSource
represents *.bib files containing BibTeX syntax.Jsonable
deserialization.
BibtexSource
.
BibtexSource
object.
BibtexSourceIterator
is used for generating DuDeObject
s out of BibtexSource
s.BibtexSourceIterator
using the passed BibtexSource
.
BibTexString
.
BibTex
file and not parsable as some
other entry.BibtexToplevelComment
.
BlockDistanceFunction
compares two DuDeObject
s based on the (city) Block Distance of the given attribute.Jsonable
deserialization.
BlockDistanceFunction
with the default tokenizer.
BlockDistanceFunction
with the default tokenizer.
BlockDistanceFunction
with the passed InterfaceTokeniser
.
BlockDistanceFunction
with the passed InterfaceTokeniser
.
BoundType
iff all bounds are exactly the same.BoundType
around the given raw type.
BoundType
around the given parameterized type.
AdaptiveWindowSizeSNM
instance
DuDeObject
s.
JsonValue
s.
CalculationStrategy
is an interface for different strategies, that can be used within ContentBasedSimilarityFunction
s for defining
the behavior of the similarity calculation, if at least one value is not an atomic one.CD
data source.Iterator
s to one big iterator.CitySimilarityFunction
compares two strings and treats them as cities, allowing for some special normalization and comparison techniques.Cleanable
is an interface that provides methods for easily closing a bunch of Closeable
or Cleanable
instances.Closeable
and Cleanable
instances.
is-duplicate
property.
CSVWriter
.
DataSource
.
DataSource
s.
DuDeOutput
s.
DuDeOutput
s.
StatisticOutput
s.
JsonArray
.
ColumnInfo
represents a column with its name and type.ColumnInfo
object with the name and the Types
data type representation of the column.
DuDeObject
s.
JsonAtomic
s.
JsonValue
.
DuDeJsonParser.createPushBackGenerator()
.ConstantSimilarityFunction
returns a similarity that is independent from the passed DuDeObjectPair
and can be specified by the
user.Jsonable
deserialization.
ConstantSimilarityFunction
with the passed similarity.
DuDeObject
is already member of the transitive closure.
ContentBasedSimilarityFunction
is a skeleton implementation with common functionality that is used by any content-based
SimilarityFunction
.Jsonable
deserialization.
ContentBasedSimilarityFunction
with the passed default attribute.
ContentBasedSimilarityFunction
with the passed default attribute.
CORA
data source.CosineSimilarityFunction
compares two DuDeObject
s based on the cosine similarity of the given attribute.Jsonable
deserialization.
CosineSimilarityFunction
with the default tokenizer.
CosineSimilarityFunction
with the default tokenizer.
CosineSimilarityFunction
with the passed InterfaceTokeniser
.
CosineSimilarityFunction
with the passed InterfaceTokeniser
.
CountPreprocessor
is a sample class that shows how the Preprocessor
interface can be
used.CountPreprocessor
.
Connection
object, which represents a new connection to the database.
Iterator
instance.
Iterator
instance.
GSwooshIterator
.
RSwooshIterator
.
SortedBlocks
instance using variable block sizes.
JsonArray
instance based on the passed Json code.
JsonNumber
instance based on the passed Json code.
JsonParseException
.
JsonParseException
.
JsonRecord
instance based on the passed Json code.
JsonString
instance based on the passed Json code.
PushbackInputStream
.DuDeJsonGenerator
might be used to generate arbitrary json elements in front of the current parser position in a FIFO
manner.DuDeJsonGenerator.close()
is necessary.
DuDeStorage
instance based on the in-memory-processing flag.
XMLStreamReader
.
CrossProductStrategy
compares each member of the first JsonArray
with all elements of the second JsonArray
.DudeObjectPairs
, their similarity value and selected optional value in a CSV file row by row.CSVOutput
.
CSVOutput
with the passed OutputStream
.
CSVOutput
.
Jsonable
deserialization.
CSVReader
reads CSV
formatted data.CSVReader
with the given Reader
.
CSVSource
represents *.csv files.Jsonable
deserialization.
CSVSource
.
CSVSource
with column names.
CSVSourceIterator
is used for generating DuDeObject
s out of CSVSource
s.CSVSourceIterator
using the passed CSVSource
.
CSVStatisticOutput
writes the statistics provided by a StatisticComponent
instance into a CSV
file.CSVStatisticOutput
with no statistics.
CSVStatisticOutput
.
CSVStatisticOutput
with no statistics.
CSVStatisticOutput
.
Jsonable
deserialization.
CSVWriter
generates CSV
formatted data.CSVWriter
with the given OutputStream
.
CSVWriter
with the given Writer
.
DuDeJsonParser.nextToken()
call.
Database
is an abstract class that encapsulates the database related information.dbInfo
.
InputStream
.
Properties
.
DatabaseSource
represents databases.Jsonable
deserialization.
DatabaseSource
for the passed Database
and table.
DatabaseSourceIterator
is used for generating DuDeObject
s out of DatabaseSource
s.DatabaseSourceIterator
using the passed DatabaseSource
.
DataSource
.
DataSource
is used for extracting data out of different data sources.DataSource
is attached to this AbstractAlgorithm
instance.
IdentifierManager
manages the DataSource
identifiers.DataSource
is added.
DateSimilarityFunction
compares two strings and treats them as dates, allowing for some special normalization and comparison techniques.DB2Database
encapsulates all the necessary information for establishing a connection to a DB2 database.DB2Database
instance members and loads the settings provided by the parameter dbInfo
.
DB2Database
using the passed InputStream
.
DB2Database
using the passed Properties
.
DBInfo
encapsulates the settings which are needed for establishing a database connection.DBInfo
instance with no initial information.
DBInfo
instance with the passed properties.
DBInfo
instance where the initial properties are read from the passed InputStream
.
DBInfo
instance with the properties provided by the properties file whose path was passed.
DuplicateCountSNM
, if
the algorithm has not been notified of the comparison result of the
latest pair.NoSuchElementException.NoSuchElementException()
.
NoSuchElementException.NoSuchElementException(String)
.
NoSuchElementException.NoSuchElementException()
and stores
the passed cause.
NoSuchElementException.NoSuchElementException(String)
and
stores the passed cause.
DuDe
tool implementation.DuDe
.classes
that encapsulate the database abstraction layer.classes
that are needed for encapsulating the database schema within the database abstraction layer.DataSource
s supported by DuDe
.DuDe
.AbstractDuplicateDetection
implementations.AbstractRecordLinkage
implementations.interfaces
and classes
that can be used for printing results.classes
that can be used for printing statistics of a run that are provided by the StatisticComponent
.interfaces
and classes
dealing with the filtering of duplicate lists.Preprocessor
is a component for manipulating DuDeObject
s and gathering statistics while extracting the data.interfaces
and classes
for comparing DuDeObjectPair
s.SimilarityFunction
s.SimilarityFunction
component.SimilarityFunction
implementations that compare attribute values of DuDeObject
s.SimilarityFunction
implementations that use the Simmetrics
library.ContentBasedSimilarityFunction
.SimilarityFunction
implementations that compare the structure of DuDeObject
s.DuDe
's utility classes
.BibTeX
parser package.BibTeX
files.BibTeX
expander package contains all classes that are used for extending the functionality of the BibTeX
parser.classes
that are necessary for parsing a BibTeX
file.classes
necessary for parsing and generating CSV
-formatted data.classes
dealing with the actual data storage (in-memory or file-based).classes
that deal with accessing Json code within files.classes
that are needed for creating a sorting key.JsonOutput
.
JsonOutput
.
DuDeObjects
, if no separator string is passed.
DefaultMerger
implements merge functionality.DefaultMerger
implements merge functionality.DiceCoefficientFunction
compares two DuDeObject
s based on the Dice's Coefficient of the given attribute.Jsonable
deserialization.
DiceCoefficientFunction
with the default tokenizer.
DiceCoefficientFunction
with the default tokenizer.
DiceCoefficientFunction
with the passed InterfaceTokeniser
.
DiceCoefficientFunction
with the passed InterfaceTokeniser
.
DocumentFrequencyPreprocessor
collects frequencies of terms within an attribute value.DocumentFrequencyPreprocessor
object for the passed attribute.
DuDeJsonGenerator
is another implementation for generating Json code.DuDeJsonGenerator
.
DuDeJsonGenerator
.
DuDeJsonGenerator
with the given generator.
DuDeJsonParser
can be used for converting a String containing Json syntax into its Java representation.DuDeJsonParser
using the passed InputStream
.
DuDeJsonParser
using the passed Reader
.
DuDeJsonParser
that parses the passed String
.
DuDeJsonParser
using the passed InputStream
.
DuDeJsonParser
using the passed Reader
.
DuDeJsonParser
that parses the passed String
.
DuDeObject
encapsulates the data of the original object and two ids (for the source and a local one) for identifying each
DuDeObject
.DuDeObject
with the passed ids and the given data.
DuDeObject
reference.
DuDeObject
using the given ids.
DuDeObject
reference.
DuDeObjectId
encapsulates the identifying information of each DuDeObject
.DuDeObjectId
.
DuDeObjectId
with the passed identifiers.
DuDeObjectPair
is an extension of the OrderedPair
class,
that encapsulates pairs of DuDeObject
s.DuDeObject
s.
DuDeObjectPair
that contains no real data.
DuDeObjectPair
that contains no real data and
where the object id is given as a single String
value.
DuDeObjectPair.DuplicateType
declares all possible values
for the is-duplicate
property.DuDeObjectPair.GeneratedBy
declares the possible values for
the lineage
property.interface
DuDeObjectSorter
provides the method signatures for sorting a collection of DuDeObject
s.DuDeObjectSource
is an in-memory DataSource
.DuDeObjectSource
with the passed identifier and DuDeStorage
instance.
DuDeObjectSource
with the passed identifier and collection.
DuDeOutput
is an interface for writing DuDeObjectPair
s onto a stream.
DuDeStorage
stores Jsonable
instances.AdaptiveWindowSizeSNM
implements the Adaptive-Window-Size Sorted-Neighborhood Method that was introduced by Oliver Wonneberg.AdaptiveWindowSizeSNM.AdaptiveWindowSizeSNMBuilder
maintains the adaptable window size of the
AdaptiveWindowSizeSNM
.AdaptiveWindowSizeSNM.AdaptiveWindowSizeSNMBuilder
with a SortingKey
and the mode that shall be used.
AdaptiveWindowSizeSNMIterator
implements the behavior of the Adaptive-Window-Size SNM algorithm.AdaptiveWindowSizeSNMIterator
.
DuDeObjectPair
can either yield a DUPLICATE or a NON_DUPLICATE
DuplicateCountSNM
.WarshallTransitiveClosureGenerator.AdjacencyList
to represent the graph.
WarshallTransitiveClosureGenerator.AdjacencyMatrix
to represent the graph.
Database
instances have the same information stored.
DBInfo
instances have the same information stored.
JsonRecord.JsonNull
object is equal to the null value and to itself.
EquationSimilarityFunction
checks if two values are equal to each other.Jsonable
deserialization.
EquationSimilarityFunction
with the passed default attribute.
EquationSimilarityFunction
with the passed default attribute.
EuclideanDistanceFunction
compares two DuDeObject
s based on the Euclidean Distance of the given attribute.Jsonable
deserialization.
EuclideanDistanceFunction
with the default tokenizer.
EuclideanDistanceFunction
with the default tokenizer.
EuclideanDistanceFunction
with the passed InterfaceTokeniser
.
EuclideanDistanceFunction
with the passed InterfaceTokeniser
.
Experiment
is a Wrapper for hiding the actual process of checking each pair of records.Experiment
.
Experiment
class.ExtendedStatisticComponent
provides functionality for gathering statistics concerning different
measures that can be realized with the Generalized Merge Distance (GMD).ExtendedStatisticComponent
with no gold standard and
default configuration for GMD.
ExtendedStatisticComponent
using the passed DuDeObjectPair
s as real duplicates.
Jsonable
deserialization.
DataSource
implementation, if
an object can't be extracted.NoSuchElementException.NoSuchElementException()
.
NoSuchElementException.NoSuchElementException(String)
.
NoSuchElementException.NoSuchElementException()
and stores
the passed cause.
NoSuchElementException.NoSuchElementException(String)
and
stores the passed cause.
false
.
FamilyNameSimilarityFunction
compares two strings and treats them as family names, allowing for some special normalization and comparison techniques.FileBasedStorage
stores Jsonable
instances in files.FileBasedStorage
instance with the passed name.
FileBasedStorage
instance with the passed type information and a name.
FileBasedStorage
instance with the passed name, and the initial content.
FileBasedStorage
instance with the passed type information, its name, and the initial content.
FileBasedStorage
instance with the type information, a directory and its name.
FileBasedStorage
instance with the type information, a directory and its name.
FileBasedStorage
instance with the type information, a directory, its name, and the initial content.
FileBasedStorage
instance with the type information, a directory, its name, and the initial content.
FileBasedStorage
instance with the type information, the underlying file, and the initial content.
FileBasedStorage
instance with the type information, the underlying file, and the initial content.
FilenameManager
manages all filenames within a directory.FilenameManager
for the passed directory.
Preprocessor.finish()
method of each added Preprocessor
.
#readSerializedType(DuDeJsonParser, boolean)
.
Iterator
.
Iterable
s.
Jsonable
type.
DuDeJsonParser
.
JsonArray
.
JsonRecord
.
Algorithm
.
DuDeObject
.
DuDeObject
.
DuDeObject
for the given attribute.
null
, if the passed path is invalid.
null
, if no column names were set.
StatisticComponent
into a Map
.
JsonableReader
that can be used to return the extracted data of the passed DataSource
.
DataSource
s and their extracted data.
Database
.
Driver
's name used for loading the Driver class.
FileBasedStorage
that is encapsulated in this instance.
DataSource
.
DataSource
s.
JsonOutput.DEFAULT_FOOTER
.
JsonOutput.DEFAULT_HEADER
.
Algorithm
-Object).
false negatives
count.
false negatives
count that are explicitly classified by the comparator.
false positives
count.
false positives
count that are explicitly classified by the comparator.
DuDeObject
.
0
if no position is set.
DataSource
.
DuDeObject
.
DuDeObject
.
DuDeObject
that will be stored in memory, if file-based processing is enabled.
GlobalConfig
.
JsonRecord
that contains currentAttributeName
.
JsonValue
specified by the passed path or null
if the specified attribute does not exist.
DuDeObjects.
- getKeyString(DuDeObject) -
Method in class de.hpi.fgis.dude.util.sorting.sortingkey.SortingKey
- Returns the sorting key value of the passed
DuDeObject
.
- getKeyString(DuDeObject, String) -
Method in class de.hpi.fgis.dude.util.sorting.sortingkey.SortingKey
- Returns the sorting key value of the passed
DuDeObject
.
- getKeyValue(DuDeObject) -
Method in class de.hpi.fgis.dude.util.sorting.sortingkey.SortingKey
- Returns the sorting key value of the passed
DuDeObject
.
- getLabel() -
Method in enum de.hpi.fgis.dude.postprocessor.ExtendedStatisticComponent.Config
- Returns the label of the configuration for the output components.
- getLabels() -
Method in class de.hpi.fgis.dude.output.statisticoutput.AbstractStatisticOutput
-
- getLabels() -
Method in interface de.hpi.fgis.dude.output.statisticoutput.StatisticOutput
- Returns the labels for the measurements.
- getLast() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexPerson
- Returns the last name of the person.
- getLastValidationState() -
Method in class de.hpi.fgis.dude.similarityfunction.AbstractSimilarityFunction
-
- getLastValidationState() -
Method in interface de.hpi.fgis.dude.similarityfunction.SimilarityFunction
- Returns the validation state of the last
SimilarityFunction.getSimilarity(DuDeObjectPair)
call.
- getLastValidationState() -
Method in class de.hpi.fgis.dude.similarityfunction.structurebased.ConstantSimilarityFunction
- Since
ConstantSimilarityFunction
is not based on actual values, it returns SimilarityValidationState.BothValid
for each
calculated pair.
- getLeft() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexConcatenatedValue
- Returns the left value of this concatenation.
- getLine() -
Method in class de.hpi.fgis.dude.util.bibtex.parser.LookAheadReader
- Returns the current line.
- getLineage() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexPerson
- Returns the lineage of this person.
- getLineage() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectPair
- Returns the lineage value of the current pair.
- getList() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexPersonList
- Returns a read-only list whose members are instances of
BibtexPerson
.
- getLongDescriptionString() -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Returns the long string identifier for the metric.
- getLongDescriptionString() -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Returns the long string identifier for the metric.
- getLowerThreshold() -
Method in class de.hpi.fgis.dude.util.Experiment
- Gets the lower threshold for this experiment.
- getMaxAllowedVariation(double, double) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.AbsoluteNumberDiffFunction
- Just returns the
AbsoluteNumberDiffFunction.maxAbsoluteVariation
set in constructor.
- getMaxAllowedVariation(double, double) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.RelativeNumberDiffFunction
- Gets the maximum allowed variation based on the
RelativeNumberDiffFunction.maxToleranceFactor
.
- getMaxBlockSize() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.SortedBlocks
- Returns the maximum block size.
- getMaximumMemory() -
Method in class de.hpi.fgis.dude.util.MemoryChecker
- Returns the maximum amount of memory that can be used.
- getMaximumMemoryUsage() -
Method in class de.hpi.fgis.dude.util.GlobalConfig
- Returns the maximum relative memory used during the sorting phase.
- getMaximumMemoryUsage() -
Method in class de.hpi.fgis.dude.util.MemoryChecker
- Returns the maximum memory usage in percent.
- getMaximumMemoryUsed() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Gets the registered maximum amount of memory during the experiment.
- getMaximumPairCount() -
Method in class de.hpi.fgis.dude.algorithm.AbstractDuplicateDetection
-
- getMaximumPairCount() -
Method in class de.hpi.fgis.dude.algorithm.AbstractRecordLinkage
-
- getMaximumPairCount() -
Method in interface de.hpi.fgis.dude.algorithm.Algorithm
- Returns the number of pairs that would be generated by the naive algorithm of the current instance's algorithm type, based on the extracted
data size.
- getMaxMemoryUsed() -
Method in class de.hpi.fgis.dude.util.MemoryCheckerTask
- Gets the maximum amount of memory used by the current application.
- getMemoryCheckFrequency() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Gets the frequency of memory checks.
- getMergeCount() -
Method in class de.hpi.fgis.dude.util.data.DuDeObject
- Returns the number of objects this
DuDeObject
was merged from.
- getMerger() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.Lego
- Returns the merger that merges several DuDeObjects into one.
- getMinimumMemoryUsed() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Gets the registered minimum amount of memory during the experiment.
- getMinMemoryUsed() -
Method in class de.hpi.fgis.dude.util.MemoryCheckerTask
- Gets the minimum amount of memory used by the current application.
- getName() -
Method in class de.hpi.fgis.dude.database.util.ColumnInfo
- Returns the column's name.
- getNextBlock() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.YanIterator
- Calculates the elements of the next block.
- getNextRecord() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.YanIterator
- Returns the next object from the record queue.
- getNextTemporaryFilename() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.TwoPhaseMultiWayMergeSorter
- Returns the name of the next temporary
DuDeObjectFile
.
- getNextValidFilename(String) -
Method in class de.hpi.fgis.dude.util.FilenameManager
- Returns a valid alternative for the passed filename.
- getNotification() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.DuplicateCountSNM
- Returns the category that was set for the last processed pair.
- getNotification() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.Lego
- Returns the category that was set for the last processed pair.
- getNrCharForBlocking() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.NaiveBlockingAlgorithm
- Returns the number of characters of the sorting key that are used as blocking criterion.
- getNumBaseRecords() -
Method in enum de.hpi.fgis.dude.postprocessor.ExtendedStatisticComponent.Config
- Getter for total number of base records, needed for computation of VI.
- getNumberAssignedRecords() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007
- Returns the sum of records that are already assigned to a block.
- getNumberCreatedBlocks() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007
- Returns the number of created blocks.
- getNumberOfCandidateComparisons() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the maximum number of pairs that would be generated by the naive approach.
- getNumberOfRealDuplicates() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the size of the gold standard.
- getNumRecordsOfBlock() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.AA_SNM_Iterator
-
- getNumRecordsOfBlock() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.IA_SNM_Iterator
-
- getNumRecordsOfBlock() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007.YanIterator
- Calculates the number of records within the next block.
- getObjectCount() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the number of records that were processed by the algorithm.
- getObjectCount() -
Method in class de.hpi.fgis.dude.preprocessor.CountPreprocessor
- Returns the number of objects that were extracted during the data extraction phase.
- getObjectId() -
Method in class de.hpi.fgis.dude.util.data.DuDeObject
- Returns the object identifier of this object.
- getObjectId() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectId
- Returns the object identifier.
- getOptionalEntries() -
Method in class de.hpi.fgis.dude.output.statisticoutput.AbstractStatisticOutput
- Returns all extension columns' label and value.
- getOverlapSize() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.SortedBlocks
- Returns the current overlap size.
- getOwnerFile() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexNode
- Returns the owner file of this node.
- getPairCount() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the number of pairs that were already considered.
- getParameterizedType() -
Method in class de.hpi.fgis.dude.util.BoundType
- Returns the wrapped parameterized type or null if this
BoundType
was not created around a ParameterizedType
.
- getParameters() -
Method in class de.hpi.fgis.dude.util.BoundType
- Returns the bound types or an empty array if none exists.
- getPassword() -
Method in class de.hpi.fgis.dude.database.util.DBInfo
- Returns the password which is used for establishing the current database connection.
- getPort() -
Method in class de.hpi.fgis.dude.database.adapter.Database
- Returns the port of the underlying database system.
- getPort() -
Method in class de.hpi.fgis.dude.database.util.DBInfo
- Returns the port of the currently used database system.
- getPositions() -
Method in class de.hpi.fgis.dude.util.sorting.sortingkey.TextBasedSubkey
- Returns an iterable instance that stores all specified character positions.
- getPrecision() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the precision based on the current knowledge base.
- getPrecisionByComparison() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the precision based on the current knowledge base and the actual comparisons.
- getPreLast() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexPerson
- Returns the middle name or any middle initials of this person.
- getPrimitive(Class<T>) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the primitive for type T.
- getPrimitiveFields() -
Method in class de.hpi.fgis.dude.util.data.json.auto.CompositeJsonSerialization
- Returns the primitive fields of the wrapped type.
- getProperty(String) -
Method in class de.hpi.fgis.dude.util.OrderedPair
- Returns the value of the passed property.
- getQuery() -
Method in class de.hpi.fgis.dude.database.DatabaseSource
- Returns the complete query that is used for querying the result.
- getQuoteCharacter() -
Method in class de.hpi.fgis.dude.datasource.CSVSource
- Returns the set quote character.
- getQuoteCharacter() -
Method in class de.hpi.fgis.dude.output.CSVOutput
- Returns the quote character.
- getQuoteCharacter() -
Method in class de.hpi.fgis.dude.util.csv.CSVReader
- Returns the current quote character.
- getQuoteCharacter() -
Method in class de.hpi.fgis.dude.util.csv.CSVWriter
- Returns the current quote character.
- getRawType() -
Method in class de.hpi.fgis.dude.util.data.json.auto.AutoJsonSerialization
- Returns the raw type for which this serialization class was created.
- getReader() -
Method in class de.hpi.fgis.dude.util.data.storage.FileBasedStorage
-
- getReader() -
Method in class de.hpi.fgis.dude.util.data.storage.InMemoryStorage
-
- getReader() -
Method in class de.hpi.fgis.dude.util.data.storage.InputStreamReadable
-
- getReader() -
Method in interface de.hpi.fgis.dude.util.data.storage.JsonReadable
- Returns the
JsonableReader
that can be used to access the content of this DuDeStorage
.
- getRecall() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the recall based on the current knowledge base.
- getRecallByComparison() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the recall based on the current knowledge base and the actual comparisons.
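The precision and recall getters above follow the usual definitions over true positives, false positives and false negatives. A minimal sketch of those definitions, assuming the counts are supplied by the caller; `QualityMeasuresSketch` is a hypothetical helper and not part of DuDe, and the handling of empty denominators is an assumption:

```java
/** Hypothetical helper (not a DuDe class): quality measures from raw counts. */
final class QualityMeasuresSketch {

    /** Precision = TP / (TP + FP); returns 0.0 if no pair was reported at all. */
    static double precision(long truePositives, long falsePositives) {
        long reported = truePositives + falsePositives;
        return reported == 0 ? 0.0 : (double) truePositives / reported;
    }

    /** Recall = TP / (TP + FN); returns 0.0 if the gold standard contains no pair. */
    static double recall(long truePositives, long falseNegatives) {
        long actual = truePositives + falseNegatives;
        return actual == 0 ? 0.0 : (double) truePositives / actual;
    }

    public static void main(String[] args) {
        // 8 correct duplicates reported, 2 wrong ones, 8 real duplicates missed.
        System.out.println(precision(8, 2));
        System.out.println(recall(8, 8));
    }
}
```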
- getReductionRatio() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the reduction ratio based on the current knowledge base.
- getReductionRatioByComparison() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the reduction ratio based on the current knowledge base and the actual comparisons.
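The index does not state how the reduction ratio is computed. A common definition in the record-linkage literature is one minus the share of performed comparisons among all possible pairs; the sketch below assumes that definition and a single data source with n objects. `ReductionRatioSketch` is a hypothetical name, not a DuDe class.

```java
/** Hypothetical helper (not a DuDe class): reduction ratio under the common definition. */
final class ReductionRatioSketch {

    /**
     * Returns 1 - comparisons / (n * (n - 1) / 2) for n objects of one source.
     * Returns 0.0 if fewer than two objects exist (assumption for the empty case).
     */
    static double reductionRatio(long comparisons, long objectCount) {
        long allPairs = objectCount * (objectCount - 1) / 2;
        if (allPairs == 0) {
            return 0.0;
        }
        return 1.0 - (double) comparisons / allPairs;
    }

    public static void main(String[] args) {
        // 100 objects give 4950 possible pairs; only 495 of them were compared.
        System.out.println(reductionRatio(495, 100));
    }
}
```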
- getReference() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectPair
- Returns a
DuDeObjectPair
that refers to the current pair.
- getRight() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexConcatenatedValue
- Returns the right value of this concatenation.
- getRootElementTag() -
Method in class de.hpi.fgis.dude.datasource.XMLSource
- Returns the set root element or
null
, if no root was set.
- getRuntime() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Gets the time difference between beginning time and finishing time.
- getScalarValue(String) -
Static method in class de.hpi.fgis.dude.util.data.json.DuDeJsonParser
- Converts the passed String into the corresponding atomic
JsonValue
.
- getSecond() -
Method in class de.hpi.fgis.dude.postprocessor.WarshallTransitiveClosureGenerator.IntPair
- Returns the second integer value of the integer pair.
- getSecondElement() -
Method in class de.hpi.fgis.dude.util.Pair
- Returns the second element.
- getSecondElementObjectData() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectPair
- Returns the data of the second
DuDeObject
.
- getSecondElementsObjectIdAttributes() -
Method in class de.hpi.fgis.dude.util.GoldStandard
- Returns the names of the attributes that store the object id of the pair's second element.
- getSecondElementsSourceId() -
Method in class de.hpi.fgis.dude.util.GoldStandard
- Returns the source id of the pair's second element.
- getSeparator() -
Method in class de.hpi.fgis.dude.output.CSVOutput
- Returns the separator character.
- getSeparator() -
Method in class de.hpi.fgis.dude.util.csv.CSVReader
- Returns the current separator character.
- getSeparator() -
Method in class de.hpi.fgis.dude.util.csv.CSVWriter
- Returns the current separator character.
- getSeparatorCharacter() -
Method in class de.hpi.fgis.dude.datasource.CSVSource
- Returns the set separator character.
- getShortDescriptionString() -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Returns the string identifier for the metric.
- getShortDescriptionString() -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Returns the string identifier for the metric.
- getSimilarity(DuDeObjectPair) -
Method in class de.hpi.fgis.dude.similarityfunction.AbstractSimilarityFunction
-
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.EquationSimilarityFunction
-
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.simmetrics.SimmetricsFunction
-
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.SoundExFunction
-
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.TFIDFSimilarityFunction
-
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Gets the similarity of the two strings using Monge-Elkan.
- getSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Gets the similarity of the two strings using the Smith-Waterman distance.
- getSimilarity(DuDeObjectPair) -
Method in interface de.hpi.fgis.dude.similarityfunction.SimilarityFunction
- Calculates the similarity of the passed
DuDeObjectPair
's members.
- getSimilarity(String, String) -
Method in interface de.hpi.fgis.dude.similarityfunction.StringSimilarity
- Returns the similarity of the passed Strings, where
0.0
means that the Strings are completely different, and 1.0
indicates that the passed Strings are the same.
- getSimilarity(DuDeObjectPair) -
Method in class de.hpi.fgis.dude.similarityfunction.structurebased.ConstantSimilarityFunction
-
- getSimilarity() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectPair
- Returns the similarity of the
DuDeObjectPair
or
DuDeObjectPair.NO_SIMILARITY_SET_VALUE
, if the similarity wasn't set.
- getSimilarityExplained(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Gets an XHTML div that explains the operation of the metric.
- getSimilarityExplained(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Gets an XHTML div that explains the operation of the metric.
- getSimilarityFunction() -
Method in class de.hpi.fgis.dude.util.Experiment
- Returns the
SimilarityFunction
.
- getSimilarityFunctionForClassAsString(String) -
Static method in class de.hpi.fgis.dude.similarityfunction.domainspecific.address.misc.FunctionSelector
- Returns the class object of the similarity function represented by the provided string.
- getSimilarityTimingEstimated(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Gets the estimated time in milliseconds it takes to perform a similarity computation.
- getSimilarityTimingEstimated(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Gets the estimated time in milliseconds it takes to perform a similarity computation.
- getSize() -
Method in class de.hpi.fgis.dude.postprocessor.WarshallTransitiveClosureGenerator.AdjacencyList
-
- getSize() -
Method in class de.hpi.fgis.dude.postprocessor.WarshallTransitiveClosureGenerator.AdjacencyMatrix
-
- getSize() -
Method in class de.hpi.fgis.dude.postprocessor.WarshallTransitiveClosureGenerator.GraphRepresentation
- Returns the number of elements in the matrix.
- getSortedCollection() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.AbstractDuDeObjectSorter
- Returns the sorted data.
- getSortedCollection() -
Method in interface de.hpi.fgis.dude.util.sorting.sorter.DuDeObjectSorter
- Returns the sorted data.
- getSortedCollection() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.InMemorySorter
-
- getSortedCollection() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.TwoPhaseMultiWayMergeSorter
-
- getSortedDataFilename() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.TwoPhaseMultiWayMergeSorter
- Returns the name of the
DuDeObjectFile
containing the sorted data.
- getSortingKey() -
Method in class de.hpi.fgis.dude.algorithm.SortingDuplicateDetection
- Returns the set
SortingKey
.
- getSortingKey() -
Method in class de.hpi.fgis.dude.util.sorting.sorter.AbstractDuDeObjectSorter
- Returns the
SortingKey
that defines the sorting order.
- getSortingKeyComparisons() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007
- Returns the number of distance comparisons of two sorting key values.
- getSortingKeys() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.Lego
- Returns the blocking criteria.
- getSoundEx(String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SoundEx
- Generates the
SoundEx
value of the passed String.
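For orientation, the classic American Soundex code keeps the first letter and encodes the following consonants as up to three digits. The sketch below implements that textbook variant; whether DuDe's SoundEx class follows exactly these rules is an assumption, and `SoundexSketch` is a hypothetical name.

```java
/** Hypothetical helper (not the DuDe SoundEx class): textbook American Soundex. */
final class SoundexSketch {

    // Digit for each letter A..Z; '0' marks letters without a code (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String input) {
        StringBuilder out = new StringBuilder(4);
        char lastCode = '0';
        for (int i = 0; i < input.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(input.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // ignore everything that is not a basic Latin letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);      // the first letter is kept as-is
                lastCode = code;
            } else if (c != 'H' && c != 'W') {
                // H and W are skipped entirely, so equal codes around them collapse;
                // vowels reset lastCode, so equal codes around a vowel are kept twice.
                if (code != '0' && code != lastCode) {
                    out.append(code);
                }
                lastCode = code;
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0');        // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Robert"));
        System.out.println(soundex("Rupert"));
    }
}
```

Names that sound alike, such as "Robert" and "Rupert", receive the same code, which is what makes the value usable as a coarse similarity signal.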
- getSourceId() -
Method in class de.hpi.fgis.dude.util.data.DuDeObject
- Returns the source identifier of this object.
- getSourceId() -
Method in class de.hpi.fgis.dude.util.data.DuDeObjectId
- Returns the source identifier.
- getSplitToken() -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.impl.TFIDFSimilarityFunction
- Returns the split token.
- getSQLSchema() -
Method in class de.hpi.fgis.dude.database.adapter.Database
- Returns the schema, which is used by this database connection.
- getSQLSchema() -
Method in class de.hpi.fgis.dude.database.util.DBInfo
- Returns the sqlSchema which is used in the current database connection.
- getSQLType() -
Method in class de.hpi.fgis.dude.database.util.ColumnInfo
- Returns the
Types
datatype of the column.
- getStartDate() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Gets the date of the specified start time of an algorithm.
- getStaticBoundTypes(ParameterizedType) -
Static method in class de.hpi.fgis.dude.util.ReflectUtil
- Returns the static bounds for the given type.
- getStaticBoundTypes(Class<?>) -
Static method in class de.hpi.fgis.dude.util.ReflectUtil
- Returns the static bounds for the given type.
- getStaticBoundTypes(Field) -
Static method in class de.hpi.fgis.dude.util.ReflectUtil
- Returns the static bounds for the given field.
- getStatisticOutputs() -
Method in class de.hpi.fgis.dude.util.Experiment
- Returns the added
StatisticOutput
s.
- getStatistics() -
Method in class de.hpi.fgis.dude.output.statisticoutput.AbstractStatisticOutput
-
- getStatistics() -
Method in interface de.hpi.fgis.dude.output.statisticoutput.StatisticOutput
- Returns the current statistic component that is used by the output.
- getString(Object) -
Static method in class de.hpi.fgis.dude.output.CSVOutput
- Returns the String representation of the object or
null
, if null
was passed.
- getStringValue() -
Method in interface de.hpi.fgis.dude.util.data.json.JsonAtomic
- Returns the actual value converted into a String.
- getStringValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonBoolean
-
- getStringValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonNull
-
- getStringValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonNumber
-
- getStringValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonString
-
- getSubkeyValue(DuDeObject) -
Method in class de.hpi.fgis.dude.util.sorting.sortingkey.AbstractSubkey
-
- getSubkeyValue(DuDeObject) -
Method in interface de.hpi.fgis.dude.util.sorting.sortingkey.Subkey
- Returns a
JsonArray
that collects all relevant values for the subkey of the passed DuDeObject
.
- getSuperTypeInfo() -
Method in class de.hpi.fgis.dude.util.data.json.auto.CompositeJsonSerialization
- Returns the super type json serialization.
- getTableName() -
Method in class de.hpi.fgis.dude.database.DatabaseSource
- Returns the table name.
- getThreshold() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.AdaptiveSNM_Yan2007
- Returns the threshold.
- getThreshold() -
Method in class de.hpi.fgis.dude.util.Experiment
- Gets the threshold for this experiment.
- getTransitiveClosures() -
Method in class de.hpi.fgis.dude.postprocessor.NaiveTransitiveClosureGenerator
- Returns the transitive closures as a 2-dimensional collection.
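A transitive closure groups all objects that are connected through a chain of duplicate pairs. The sketch below illustrates the idea with a simple union-find over string identifiers; it is not the NaiveTransitiveClosureGenerator implementation, and the class name and the use of `String[]` pairs are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not a DuDe class): clusters duplicate pairs via union-find. */
final class TransitiveClosureSketch {

    /** Each pair is a two-element array of object identifiers. */
    static Collection<Set<String>> closures(List<String[]> duplicatePairs) {
        Map<String, String> parent = new HashMap<>();
        for (String[] pair : duplicatePairs) {
            String rootA = find(parent, pair[0]);
            String rootB = find(parent, pair[1]);
            if (!rootA.equals(rootB)) {
                parent.put(rootA, rootB); // union the two clusters
            }
        }
        Map<String, Set<String>> groups = new HashMap<>();
        for (String id : new ArrayList<>(parent.keySet())) {
            groups.computeIfAbsent(find(parent, id), k -> new TreeSet<>()).add(id);
        }
        return groups.values();
    }

    /** Returns the representative of the cluster that contains id. */
    private static String find(Map<String, String> parent, String id) {
        parent.putIfAbsent(id, id);
        String root = id;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        parent.put(id, root); // shorten the path for later lookups
        return root;
    }
}
```

Given the pairs (a, b), (b, c) and (x, y), the result is a 2-dimensional collection with the two clusters {a, b, c} and {x, y}.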
- getTrueNegatives() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the
true negatives
count.
- getTrueNegativesByComparison() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the
true negatives
count that are explicitly classified by the comparator.
- getTruePositives() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the
true positives
count.
- getTruePositivesByComparison() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Returns the
true positives
count that are explicitly classified by the comparator.
- getType() -
Method in class de.hpi.fgis.dude.util.BoundType
- Returns the raw type.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.auto.AutoJsonSerialization
- Returns the
BoundType
for which this serialization class was created.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.auto.Primitive
- Returns the type of the primitive
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonArray
- Returns
JsonType.Array
.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonBoolean
- Returns
JsonType.Boolean
.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonNull
- Returns
JsonType.Null
.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonNumber
- Returns
JsonType.Number
.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonRecord
- Returns
JsonType.Record
.
- getType() -
Method in class de.hpi.fgis.dude.util.data.json.JsonString
- Returns
JsonType.String
.
- getType() -
Method in interface de.hpi.fgis.dude.util.data.json.JsonValue
- Returns the type of the current instance.
- getTypeInfo(BoundType) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the
AutoJsonSerialization
for the given BoundType
.
- getTypeInfo(Class<T>) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the
AutoJsonSerialization
for the given class.
- getTypeInfo(Class<T>, Type) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the
AutoJsonSerialization
for the given class and the type parameters of the declarations if existent.
- getTypeInfo(ParameterizedType) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the
AutoJsonSerialization
for the given type.
- getTypeInfo(Type) -
Method in class de.hpi.fgis.dude.util.data.json.auto.JsonTypeManager
- Returns the
AutoJsonSerialization
for the given type.
- getUnNormalisedSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.MongeElkanSimilarity
- Gets the un-normalised similarity measure of the metric for the given strings.
- getUnNormalisedSimilarity(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.SmithWatermanDistance
- Implements the Smith-Waterman distance function; see http://www.gen.tcd.ie/molevol/nwswat.html for details.
- getUpperThreshold() -
Method in class de.hpi.fgis.dude.util.Experiment
- Gets the upper threshold for this experiment.
- getUsedVMMemory() -
Static method in class de.hpi.fgis.dude.util.MemoryChecker
- Returns the amount of already used memory.
- getUser() -
Method in class de.hpi.fgis.dude.database.util.DBInfo
- Returns the user name which is used for establishing the current database connection.
- getValue(String, String) -
Method in class de.hpi.fgis.dude.similarityfunction.contentbased.util.LevenshteinDistance
- Returns the
Levenshtein Distance
of the passed Strings.
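The Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. A minimal sketch of the standard dynamic-programming computation follows; `LevenshteinSketch` is a hypothetical name, and the normalization into the 0.0 to 1.0 range is one common choice, not necessarily the one DuDe uses.

```java
/** Hypothetical helper (not the DuDe LevenshteinDistance class). */
final class LevenshteinSketch {

    /** Edit distance using two rolling rows of the dynamic-programming table. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1.0 for equal strings, 0.0 if every position differs. */
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }
}
```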
- getValue() -
Method in class de.hpi.fgis.dude.util.bibtex.data.BibtexMacroDefinition
- Returns the value of the macro definition.
- getValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonBoolean
- Returns the actual value of this
JsonBoolean
.
- getValue() -
Method in class de.hpi.fgis.dude.util.data.json.JsonNumber
- Returns the actual value.
- getWindowSize() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.DuplicateCountSNM
- Returns the current window size.
- getWindowSize() -
Method in class de.hpi.fgis.dude.algorithm.duplicatedetection.SortedNeighborhoodMethod
- Returns the window size of this instance.
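The window size controls how many neighbours each object is compared with after sorting. The sketch below shows the general Sorted-Neighborhood idea on plain strings, where the string itself stands in for the sorting key; it is an illustration under those assumptions and not the DuDe SortedNeighborhoodMethod implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not a DuDe class): candidate pairs from a sliding window. */
final class SortedNeighborhoodSketch {

    /** Sorts the records and pairs each one with the next (windowSize - 1) records. */
    static List<String[]> candidatePairs(List<String> records, int windowSize) {
        List<String> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.naturalOrder());
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            int end = Math.min(sorted.size(), i + windowSize);
            for (int j = i + 1; j < end; j++) {
                pairs.add(new String[] { sorted.get(i), sorted.get(j) });
            }
        }
        return pairs;
    }
}
```

With four records and a window size of 2 only the three adjacent pairs are generated, instead of the six pairs a naive comparison of all objects would produce.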
- getWorkingDirectory() -
Method in class de.hpi.fgis.dude.util.GlobalConfig
- Returns the working directory path.
- getWriter() -
Method in class de.hpi.fgis.dude.util.data.storage.FileBasedStorage
-
- getWriter() -
Method in class de.hpi.fgis.dude.util.data.storage.InMemoryStorage
-
- getWriter() -
Method in interface de.hpi.fgis.dude.util.data.storage.JsonWritable
- Returns the
JsonableWriter
that can be used to add instances to this DuDeStorage
.
- GivenNameSimilarityFunction - Class in de.hpi.fgis.dude.similarityfunction.domainspecific.address
GivenNameSimilarityFunction
compares two strings and treats them as given names, allowing for some special normalization and comparison techniques.
- GivenNameSimilarityFunction(String...) -
Constructor for class de.hpi.fgis.dude.similarityfunction.domainspecific.address.GivenNameSimilarityFunction
-
- GlobalConfig - Class in de.hpi.fgis.dude.util
GlobalConfig
manages the configuration parameters of DuDe
.
- GMDEvaluationExec_CORA - Class in de.hpi.fgis.dude.exec.duplicatedetection
GMDEvaluationExec_CORA
is an example experiment for the usage of the ExtendedStatisticComponent
.
- GMDEvaluationExec_CORA() -
Constructor for class de.hpi.fgis.dude.exec.duplicatedetection.GMDEvaluationExec_CORA
-
- GMDEvaluationExec_Restaurant - Class in de.hpi.fgis.dude.exec.duplicatedetection
GMDEvaluationExec_Restaurant
is an example experiment for the usage of the ExtendedStatisticComponent
.
- GMDEvaluationExec_Restaurant() -
Constructor for class de.hpi.fgis.dude.exec.duplicatedetection.GMDEvaluationExec_Restaurant
-
- goldStandard -
Variable in class de.hpi.fgis.dude.postprocessor.StatisticComponent
-
- GoldStandard - Class in de.hpi.fgis.dude.util
GoldStandard
implements the functionality for extracting the gold standard out of a given DataSource
.
- GoldStandard() -
Constructor for class de.hpi.fgis.dude.util.GoldStandard
- Internal constructor for
Jsonable
deserialization.
- GoldStandard(DataSource) -
Constructor for class de.hpi.fgis.dude.util.GoldStandard
- Initializes the
GoldStandard
with the passed DataSource
.
- GoldStandard(DataSource, String) -
Constructor for class de.hpi.fgis.dude.util.GoldStandard
- Initializes the
GoldStandard
with the passed DataSource
and the filename
of the gold standard in cluster format to read it in.
- GoldStandard(String) -
Constructor for class de.hpi.fgis.dude.util.GoldStandard
- Initializes the
GoldStandard
with the passed filename
of the gold standard in cluster format to read it in.
- goldStandardSet() -
Method in class de.hpi.fgis.dude.postprocessor.StatisticComponent
- Checks whether a gold standard was passed.
- goldStandardSet() -
Method in class de.hpi.fgis.dude.util.Experiment
- Checks whether a
GoldStandard
was set.
- GSwoosh - Class in de.hpi.fgis.dude.algorithm.duplicatedetection
GSwoosh
implements the GSwoosh duplicate detection (and merging) algorithm
as described in the paper Swoosh: a generic approach for entity resolution.
- GSwoosh() -
Constructor for class de.hpi.fgis.dude.algorithm.duplicatedetection.GSwoosh
- Initializes the
GSwoosh
algorithm with the DefaultMerger
.
- GSwoosh(Merger) -
Constructor for class de.hpi.fgis.dude.algorithm.duplicatedetection.GSwoosh
- Initializes the
GSwoosh
algorithm with the passed Merger
.
- GSwoosh.ComparisonResult - Enum in de.hpi.fgis.dude.algorithm.duplicatedetection
-
- GSwooshExec - Class in de.hpi.fgis.dude.exec.duplicatedetection
- This execution class runs the GSwoosh duplicate detection algorithm on the
Restaurant
data source.
- GSwooshExec() -
Constructor for class de.hpi.fgis.dude.exec.duplicatedetection.GSwooshExec
-
HarmonicMean
returns the harmonic mean of the added SimilarityFunction
s.
Jsonable
deserialization.
HarmonicMean
instance.
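The harmonic mean of n similarity values is n divided by the sum of their reciprocals, so a single low value pulls the result down more strongly than it would in an arithmetic mean. A minimal sketch on plain doubles follows; `HarmonicMeanSketch` is a hypothetical name, and returning 0.0 as soon as one value is 0.0 is an assumption about the edge case.

```java
/** Hypothetical helper (not the DuDe HarmonicMean class). */
final class HarmonicMeanSketch {

    /** Harmonic mean of similarity values in the range 0.0 to 1.0. */
    static double harmonicMean(double... similarities) {
        if (similarities.length == 0) {
            throw new IllegalArgumentException("at least one similarity is required");
        }
        double reciprocalSum = 0.0;
        for (double s : similarities) {
            if (s <= 0.0) {
                return 0.0; // assumed convention: avoids a division by zero
            }
            reciprocalSum += 1.0 / s;
        }
        return similarities.length / reciprocalSum;
    }
}
```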
DuDeObject
contains real data.
is-duplicate
property is set.
HonorificSimilarityFunction
compares two strings and treats them as honorifics, allowing for some special normalization and comparison techniques.
HouseNumberSimilarityFunction
compares two strings and treats them as house numbers, allowing for some special normalization and comparison techniques.
SortedNeighborhoodMethod
.
RuntimeException.RuntimeException()
RuntimeException.RuntimeException(String)
RuntimeException.RuntimeException(Throwable)
RuntimeException.RuntimeException(String, Throwable)
IgnoreStrategy
ignores the actual values and always returns the same default similarity.
IgnoreStrategy
that returns a default similarity of 0.0
.
IgnoreStrategy
that returns the passed default similarity.
ContentBasedSimilarityFunction
shall make a distinction between lower case and upper case or not.
InMemorySorter
implements an in-memory sort.
InMemorySorter
with no SortingKey
.
InMemorySorter
with the passed SortingKey
.
InMemoryStorage
stores Jsonable
instances in memory.
InMemoryStorage
instance.
InMemoryStorage
instance with the passed initial content.
InputStreamReadable
can be used to read Json data from any InputStream
.
InputStreamReadable
with no type information.
InputStreamReadable
with the passed type information.
Json
String could not be converted into an object.
NoSuchElementException.NoSuchElementException()
.
NoSuchElementException.NoSuchElementException(String)
.
NoSuchElementException.NoSuchElementException()
and stores the passed cause.
NoSuchElementException.NoSuchElementException(String)
and stores the passed cause.
Exception.Exception()
.
Exception.Exception(String)
.
Exception.Exception(Throwable)
.
Exception.Exception(String, Throwable)
.
true
if the DuDeObjectPair
exists in the set of real duplicate pairs.
Subkey
was added.
DuDeObject
is a merged object.
true
, if this instance represents the BibTex
"Other authors" value; otherwise
false
.
Iterator
instance for iterating over the algorithm's result.
Schema
instance.
JaccardSimilarityFunction
compares two DuDeObject
s based on the Jaccard Coefficient of the given attribute.
Jsonable
deserialization.
JaccardSimilarityFunction
with the default tokenizer.
JaccardSimilarityFunction
with the default tokenizer.
JaccardSimilarityFunction
with the passed InterfaceTokeniser
.
JaccardSimilarityFunction
with the passed InterfaceTokeniser
.
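The Jaccard coefficient of two token sets is the size of their intersection divided by the size of their union. The sketch below uses lower-cased whitespace tokens in place of the InterfaceTokeniser mentioned above; `JaccardSketch` is a hypothetical name, and treating two empty inputs as identical is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not the DuDe JaccardSimilarityFunction). */
final class JaccardSketch {

    /** Returns |A intersect B| / |A union B| over the token sets of both strings. */
    static double jaccard(String a, String b) {
        Set<String> tokensA = tokens(a);
        Set<String> tokensB = tokens(b);
        if (tokensA.isEmpty() && tokensB.isEmpty()) {
            return 1.0; // assumed convention for two empty values
        }
        Set<String> intersection = new HashSet<>(tokensA);
        intersection.retainAll(tokensB);
        Set<String> union = new HashSet<>(tokensA);
        union.addAll(tokensB);
        return (double) intersection.size() / union.size();
    }

    /** Splits on whitespace and lower-cases; a stand-in for a real tokenizer. */
    private static Set<String> tokens(String value) {
        Set<String> result = new HashSet<>();
        for (String token : value.trim().toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }
}
```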
JaroDistanceFunction
compares two DuDeObject
s based on the Jaro Distance of the given attribute.
Jsonable
deserialization.
JaroDistanceFunction
with the default tokenizer.
JaroDistanceFunction
with the default tokenizer.
JaroWinklerFunction
compares two DuDeObject
s based on the extended JaroWinkler distance of the given attribute.
Jsonable
deserialization.
JaroWinklerFunction
with the default tokenizer.
JaroWinklerFunction
with the default tokenizer.
Jsonable
can be used by classes whose instances shall be Json-convertible.
JsonableReader
can be used to read the content of a Jsonable
storage.
JsonableWriter
can be used to add data to a JsonWritable
.
JsonArray
represents an ordered collection and provides functionality for collecting multiple instances
of JsonValue
.
JsonArray
.
JsonArray
with the passed JsonValue
.
JsonArray
with the passed data.
JsonArray
with a predefined capacity.
JsonAtomic
represents all atomic Json types.
JsonBoolean
represents a boolean value that can be converted into Json.
JsonNull
represents the Json null
value.
JsonNumber
represents a Json-convertible number.
JsonNumber
with 0
.
JsonNumber
.
JsonOutput
converts the passed DuDeObject
pairs into Json syntax.
JsonOutput
formatter.
JsonOutput
formatter.
Jsonable
deserialization.
JsonReadable
is an interface for adding readable functionality to some Jsonable
storage.
JsonRecord
represents a Json record.
JsonRecord
.
JsonRecord
with the passed initial capacity.
JsonRecord
with the passed data.
JsonRecord
with the passed initial capacity and its load factor.
JSONSource
represents files containing Json
syntax.
Jsonable
deserialization.
JSONSource
.
JSONSourceIterator
is used for generating DuDeObject
s out of JSONSource
s.
JSONSourceIterator
using the passed JSONSource
.
JsonString
represents a Json-convertible String.
AutoJsonSerialization
s.
Jsonable
s.
JsonValue
provides methods that have to be implemented by every Json data type.
JsonValue.JsonType
includes all Json types that can be returned by JsonValue.getType()
.
JsonWritable
is an interface for adding writable functionality to some Jsonable
storage.
Lego
is an iterative blocking approach.
Lego
with the passed SortingKey
s.
LegoIterator
.
Lego
algorithm.
LevenshteinDistance
implements an Edit-Distance approach using the Levenshtein Distance
algorithm.
LevenshteinDistanceFunction
compares two DuDeObject
s based on the Levenshtein Distance of the given attribute.
Jsonable
deserialization.
LevenshteinDistanceFunction
with the default tokenizer.
LevenshteinDistanceFunction
with the default tokenizer.
Properties
.
InputStream
.
Properties
instance.
DuDeObject
based on the data returned by AbstractDataSource.AbstractDataSourceIterator.loadNextRecord()
.
DBInfo
object using a Properties
table.
DBInfo
object.
LookAheadReader
that reads from input
.
AdaptiveSNM_Yan2007
example experiment.
CD
data source.
CORA
data source.
DuDe
extracts data from an XML file and runs the DuplicateCountSNM
algorithm.
DuDe
extracts data from a Json file.
restaurant
data source.
SortedNeighborhoodMethod
on a huge data set.
DuDe
extracts data from an XML file and runs the Sorted-Neighborhood-Method algorithm.
restaurant
data source.
restaurant
data source.
DuDe
extracts data from an XML file and runs the Sorted-Neighborhood-Method algorithm.
SortedNeighborhoodMethod
on a huge data set.
BibtexConcatenatedValue
.
BibtexEntry
.
BibtexMacroDefinition
.
BibtexMacroReference
.
BibtexPerson
.
BibtexPersonList
.
BibtexPreamble
.
BibtexString
.
BibtexToplevelComment
.
MatchingCoefficientFunction
compares two DuDeObject
s based on the Matching Coefficient of the given attribute.
Jsonable
deserialization.
MatchingCoefficientFunction
with the default tokenizer.
MatchingCoefficientFunction
with the default tokenizer.
MatchingCoefficientFunction
with the passed InterfaceTokeniser
.
MatchingCoefficientFunction
with the passed InterfaceTokeniser
.
Maximum
returns the maximal similarity of the added SimilarityFunction
s.
Jsonable
deserialization.
Maximum
instance.
MemoryChecker
is a Singleton implementation that maintains the memory usage.
DuDeObject
s into one new DuDeObject
.
DuDeObject
s into one new DuDeObject
.
TransitiveClosure
with the current one.
DuDeObject
s into one new DuDeObject
.
DuDeObject
s into one new DuDeObject
.
DuDeObject
s.
DuDeObject
s into a new JsonRecord
.
DuDeObject
s.
DuDeObject
s into a new JsonRecord
.
Merger
is used to merge two DuDeObject
s into one new DuDeObject
.
Merger
is used to merge two DuDeObject
s into one new DuDeObject
.
Minimum
returns the minimal similarity of the added SimilarityFunction
s.
Jsonable
deserialization.
Minimum
instance.
MongeElkanFunction
compares two DuDeObject
s based on the Monge Elkan Distance of the given attribute.
Jsonable
deserialization.
MongeElkanFunction
with the default tokenizer.
MongeElkanFunction
with the default tokenizer.
MongeElkanFunction
with the passed InterfaceTokeniser
.
MongeElkanFunction
with the passed InterfaceTokeniser
.
MultipleOutput
to support more than one output format.
MultipleOutput
.
MultipleOutput
.
MySQLDatabase
encapsulates all the necessary information for establishing a connection to a MySQL database.
MySQLDatabase
instance members and loads the settings provided by the parameter dbInfo
.
MySQLDatabase
using the passed InputStream
.
MySQLDatabase
using the passed Properties
.
NaiveBlockingAlgorithm
is the naive blocking approach.
NaiveBlockingAlgorithm
with the passed SortingKey
.
NaiveBlockingAlgorithm
with the passed SortingKey
.
NaiveDuplicateDetection
implements the naive approach of checking all possible pairs.
NaiveDuplicateDetection
instance.
NaiveRecordLinkage
implements the naive approach for record-linkage.
NaiveRecordLinkageIterator
implements the actual functionality of the naive record-linkage approach.
NaiveRecordLinkageIterator
with the passed data.
NaiveRecordLinkageExec
contains a code snippet that illustrates how to use the NaiveRecordLinkage
implementation.
NaiveTransitiveClosureGenerator
implements the naive way of generating transitive closures.
TransitiveClosure
represents one transitive closure.
NaiveTransitiveClosureGenerator.TransitiveClosure
.
NaiveTransitiveClosureGenerator.TransitiveClosureIterator
is used to iterate over all pairs collected or generated by the
NaiveTransitiveClosureGenerator
.
NeedlemanWunschFunction
compares two DuDeObject
s based on the Needleman-Wunsch distance of the given attribute.
Jsonable
deserialization.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
NeedlemanWunschFunction
.
JsonArray
.
JsonBoolean
.
JsonNull
.
JsonNumber
.
JsonRecord
.
JsonString
.
JsonValue
.
null
, if the end of the data source was reached.
NotSupportedStrategy
will throw an IllegalArgumentException
no matter which values were passed.
null
value.
NumberBasedSubkey
can be used for number-based sub-keys.
Jsonable
deserialization.
NumberBasedSubkey
instance that takes all digits within the value.
NumberBasedSubkey
instance that takes all digits within the value.
NumberBasedSubkey
instance that takes all digits within the value.
NumberBasedSubkey
instance that takes all digits within the value.
DuDeObject's
Json representation.
DuDeObject
.
DuDeObject's
Json representation.
BoundType
around the given raw type with additional type parameters.
BoundType
around the given raw type with additional type parameters.
BoundType
for the given ParameterizedType
.
OracleDatabase
encapsulates all the necessary information for establishing a connection to an Oracle database.
OracleDatabase
instance members and loads the settings provided by the parameter dbInfo
.
OracleDatabase
using the passed InputStream
.
OracleDatabase
using the passed Properties
.
OrderedPair
extends Pair
in such a way that both elements have to have the same type.
OrderedPair
instance with the passed objects.
OverlapCoefficientFunction
compares two DuDeObject
s based on the Overlap Coefficient of the given attribute.
Jsonable
deserialization.
OverlapCoefficientFunction
with the default tokenizer.
OverlapCoefficientFunction
with the default tokenizer.
OverlapCoefficientFunction
with the passed InterfaceTokeniser
.
OverlapCoefficientFunction
with the passed InterfaceTokeniser
.
Pair
is a container that stores two objects.
MemoryChecker
.
MemoryChecker
.
BibtexParser.bibtexFile
- don't forget to check BibtexParser.getExceptions()
afterwards (if you
don't use throwAllParseExceptions
which you can configure in the constructor)...
BibtexParser
.
ParseException
.
PhoneNumberSimilarityFunction
compares two strings and treats them as phone numbers, allowing for some special normalization and comparison techniques.
PostGreSQLDatabase
encapsulates all the necessary information for
establishing a connection to a PostGreSQL database.
PostGreSQLDatabase
instance members and loads the settings provided by the parameter dbInfo
.
PostGreSQLDatabase
using the passed InputStream
.
PostGreSQLDatabase
using the passed Properties
.
Preprocessor
is an interface
that can be used for gathering statistics of the data
within the extraction phase.
PrintWriter
.
PrintWriter
.
DuDeObjectPair
onto all added fuzzy DuDeOutput
s.
DuDeObjectPair
onto all added DuDeOutput
s.
BibTex is such an insane format...
PseudoLexer
.
Token
.
JsonValue
into the JsonRecord
.
JsonBoolean
value into the JsonRecord
.
JsonNumber
value into the JsonRecord
.
JsonNumber
value into the JsonRecord
.
JsonNumber
value into the JsonRecord
.
JsonString
value into the JsonRecord
.
JsonArray
generated out of the passed Collection
and its key to this JsonRecord
.
JsonRecord
generated out of the passed Map
and its key to this JsonRecord
.
JsonNull
into the JsonRecord
.
DuDeJsonParser
.
DuDeJsonParser
.
DuDeJsonParser
.
DuDeJsonParser
.
DuDeJsonParser
.
AutoJsonSerialization.writeWithType(DuDeJsonGenerator, Object)
.
Cleanable
instance.
Closeable
instance.
Statement
.
SimilarityFunction implementation checks the relative variation of the numbers of two DuDeObject attributes.
Jsonable
deserialization.
RelativeNumberDiffFunction
.
RelativeNumberDiffFunction
.
RelativeNumberDiffFunction
.
RelativeNumberDiffFunction
.
UnsupportedOperationException
.
UnsupportedOperationException
.
WarshallClosureGenerator
.
ParameterizedType
superclass for a given BoundType
.
class Pair<S, T> { }; class OrderedPair<X> extends Pair<X, X> {}
.
TypeVariable
for a given BoundType
.
class Foo<T> { Collection<T> bar; }; class IntFoo extends Foo<Integer> {}
.
BoundType.of(Integer.class)
.
Restaurant
data source.
RSwoosh implements the RSwoosh duplicate detection (and merging) algorithm as described in the paper "Swoosh: a generic approach for entity resolution".
RSwoosh
algorithm with an instance of the DefaultMerger
.
RSwoosh
algorithm with the passed Merger
.
Restaurant
data source.
AdaptiveSNME_Yan2007
Properties
format.
Schema encapsulates all the information concerning a database table schema.
Schema
using a given collection of ColumnInfo
instances.
Schema
out of the passed table.
Experiment
.
DataSource
.
JsonValue
instance to an attribute of the passed JsonRecord
.
StatisticComponent.setStartTime()
JsonArray
s.
JsonArray
s and atomic values.
JsonArray
s and JsonRecord
s.
JsonRecord
s and atomic values.
JsonRecord
s.
ContentBasedSimilarityFunction
to the CrossProductStrategy.
ContentBasedSimilarityFunction
to the CrossProductStrategy.
SortingKey
.
is-duplicate
property.
Algorithm
-Object).
StatisticComponent.setEndTime()
Experiment
.
DuDeObject
s in memory, if file-based processing is enabled.
GSwoosh
algorithm of the result of the last comparison.
RSwoosh
algorithm of the result of the last comparison.
DuDeObjectPair
.
SimilarityFunction
.
SortingKey
.
SortingKey
.
SimilarityFunction is used to determine the similarity of two DuDeObjects.
SimilarityValidationState describes whether two values can be used for similarity calculation.
SimilarityFunction
was set.
SimmetricsFunction is a skeleton class providing the common functionality of all Simmetric similarity functions.
Jsonable
deserialization.
SimmetricsFunction
with the passed metric and the default values.
SimmetricsFunction
with the passed metric and the default values.
SimpleStatisticOutput prints the statistics in a simple, formatted fashion.
Jsonable
deserialization.
SimpleStatisticOutput
with no statistics.
SimpleStatisticOutput
.
SimpleStatisticOutput
with no statistics.
SimpleStatisticOutput
.
SimpleTextOutput writes the passed DuDeObject pair to an OutputStream line by line.
SimpleTextOutput
with the passed parameters.
SimpleTextOutput
with the passed parameters.
SimpleTextOutput
with the passed parameters.
SimpleTextOutput
with the passed parameters.
SimpleTextOutput
with the passed parameters.
SimpleTextOutput
with the passed parameters.
Jsonable
deserialization.
1
since JsonBoolean
is an atomic value.
0
since JsonRecord.JsonNull
does not have any value.
1
since JsonNumber
is an atomic value.
1
since JsonString
is an atomic value.
SmithWatermanDistance implements the Smith-Waterman distance.
SmithWatermanFunction compares two DuDeObjects based on the Smith-Waterman distance of the given attribute.
Jsonable
deserialization.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanFunction
.
SmithWatermanGotohFunction compares two DuDeObjects based on the Smith-Waterman-Gotoh distance of the given attribute.
Jsonable
deserialization.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SmithWatermanGotohFunction
.
SortedBlocks combines blocking and the Sorted-Neighborhood Method (SNM).
SortedBlocks
instance using fixed-size blocks with the passed window size.
SortedDataFile encapsulates the functionality that is needed for the TwoPhaseMultiWayMergeSorter in phase two.
SortedDataFile
and loads the first element.
SortedNeighborhoodMethod is a simple Sorted-Neighborhood Method implementation that does not allow multiple runs.
SortedNeighborhoodMethod
instance with the passed SortingKey
and the SortedNeighborhoodMethod.DEFAULT_WINDOW_SIZE
.
SortedNeighborhoodMethod
instance with the passed SortingKey
and a window size.
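The windowing idea behind the Sorted-Neighborhood Method can be sketched as follows. This is an illustration only, not DuDe's SortedNeighborhoodMethod: the class and method names are hypothetical, plain strings stand in for DuDeObjects, and natural string order stands in for a SortingKey. Records are sorted once, and each record is paired only with the next windowSize - 1 records in sorted order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; not part of the DuDe API.
final class SortedNeighborhoodSketch {

    // Returns the candidate pairs produced by a single pass with a sliding window.
    static List<String[]> candidatePairs(List<String> records, int windowSize) {
        if (windowSize < 2) {
            throw new IllegalArgumentException("windowSize must be at least 2");
        }
        // Sort a copy so the caller's list is left untouched.
        List<String> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.naturalOrder());

        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            // Pair record i with the records that share a window with it.
            int end = Math.min(sorted.size(), i + windowSize);
            for (int j = i + 1; j < end; j++) {
                pairs.add(new String[] { sorted.get(i), sorted.get(j) });
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : candidatePairs(List.of("smith", "smyth", "jones", "johns"), 2)) {
            System.out.println(pair[0] + " <-> " + pair[1]);
        }
    }
}
```

A larger window finds more candidate pairs at the cost of more comparisons; the number of pairs grows roughly with the number of records times the window size rather than quadratically.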
SortedNeighborhoodMethod.SortedNeighborhoodMethodIterator implements the behavior of a simple Sorted-Neighborhood-Method implementation.
SortedNeighborhoodMethod
algorithm.
SortingDuplicateDetection implements the preprocessing phase where the data is sorted based on a given SortingKey.
SortingDuplicateDetection
with the passed SortingKey
.
SortingKey collects different sub-keys and compares different DuDeObjects based on these sub-keys.
SortingKey
with no sub-key(s).
SortingKey
instance.
SortingRecordLinkage implements the preprocessing phase where the data is sorted based on one or several SortingKeys.
SortingRecordLinkage
with no default SortingKey
.
SortingRecordLinkage
with the passed default SortingKey
.
SoundEx implements a phonetic algorithm for indexing names by sound.
SoundExFunction compares two DuDeObjects based on the SoundEx values of the given attribute.
Jsonable
deserialization.
SoundExFunction
with the passed default attribute.
SoundExFunction
with the passed default attribute.
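For readers unfamiliar with the phonetic code referred to in the SoundEx entries above, the following is a minimal sketch of the classic American Soundex encoding. It is an assumption that DuDe's SoundEx class follows the same variant; the class and method names here are hypothetical. The code keeps the first letter, maps the following consonants to digits, collapses adjacent identical digits (H and W do not separate them, vowels do), and pads the result to four characters.

```java
import java.util.Locale;

// Hypothetical helper class; not part of the DuDe API.
final class SoundexSketch {

    // Maps a letter to its Soundex digit; '0' marks letters that carry no code.
    static char digit(char c) {
        switch (c) {
            case 'B': case 'F': case 'P': case 'V':
                return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z':
                return '2';
            case 'D': case 'T':
                return '3';
            case 'L':
                return '4';
            case 'M': case 'N':
                return '5';
            case 'R':
                return '6';
            default:
                return '0'; // vowels, Y, H, W
        }
    }

    // Returns the four-character code, or an empty string if the name has no ASCII letters.
    static String encode(String name) {
        StringBuilder letters = new StringBuilder();
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.append(c);
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        StringBuilder code = new StringBuilder().append(letters.charAt(0));
        char previous = digit(letters.charAt(0));
        for (int i = 1; i < letters.length() && code.length() < 4; i++) {
            char c = letters.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // H and W are skipped and do not separate equal codes
            }
            char d = digit(c);
            if (d != '0' && d != previous) {
                code.append(d);
            }
            previous = d; // a vowel resets the previous code to '0'
        }
        while (code.length() < 4) {
            code.append('0');
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Robert") + " " + encode("Rupert")); // prints "R163 R163"
    }
}
```

Two attribute values can then be treated as phonetically similar when their codes are equal.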
DuDeObject
.
StableMarriageStrategy implements the Stable-Marriage algorithm.
StatisticComponent provides functionality for gathering statistics concerning the recall, precision and f-measure.
StatisticComponent
with no gold standard.
StatisticComponent
using the passed DuDeObjectPair
s as real duplicates.
Jsonable
deserialization.
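The recall, precision and f-measure mentioned for StatisticComponent follow the usual definitions over detected and real duplicate pairs. The sketch below shows those formulas only; the class and method names are hypothetical, and how DuDe counts pairs internally is not shown in this index.

```java
// Hypothetical helper class; not part of the DuDe API.
final class PairMetricsSketch {

    // Fraction of reported duplicates that are real duplicates.
    static double precision(int truePositives, int falsePositives) {
        int reported = truePositives + falsePositives;
        return reported == 0 ? 0.0 : (double) truePositives / reported;
    }

    // Fraction of real duplicates (gold standard) that were reported.
    static double recall(int truePositives, int falseNegatives) {
        int real = truePositives + falseNegatives;
        return real == 0 ? 0.0 : (double) truePositives / real;
    }

    // Harmonic mean of precision and recall.
    static double fMeasure(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    public static void main(String[] args) {
        double p = precision(8, 2);
        double r = recall(8, 8);
        System.out.println("precision=" + p + " recall=" + r + " f-measure=" + fMeasure(p, r));
    }
}
```

Returning 0.0 when a denominator is zero is a convention chosen for this example; other conventions are possible.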
StatisticOutput offers all methods needed to write out the statistics provided by a StatisticComponent instance.
StatisticOutput
instance is set.
StreetSimilarityFunction compares two strings and treats them as street names, allowing for some special normalization and comparison techniques.
StringSimilarity is an interface for comparing Strings.
Subkey is an interface that is used within the SortingKey implementation.
TextBasedSubkey provides the functionality for generating sub-keys based on String values.
Jsonable.fromJson(DuDeJsonParser)
.
SortedNeighborhoodMethod
and preprocessing for the tf-idf comparator.
TFIDFSimilarityFunction compares two DuDeObjects based on the classic tf-idf metric.
Jsonable
deserialization.
TFIDFSimilarityFunction
object for the passed attribute.
TFIDFSimilarityFunction
object for the passed attribute.
TFIDFSimilarityFunction
object for the passed attribute.
TFIDFSimilarityFunction
object for the passed attribute.
TitleSimilarityFunction compares two strings and treats them as personal titles, allowing for some special normalization and comparison techniques.
Jsonable
and returns the resulting json string.
Jsonable
and returns the resulting json string.
DuDeJsonGenerator
.
DuDeObject
to its Json representation.
true
.
TwoPhaseMultiWayMergeSorter implements file-based sorting using the Two-Phase Multi-Way Merge-Sort algorithm (TPMMS).
TwoPhaseMultiWayMergeSorter
with no SortingKey
.
TwoPhaseMultiWayMergeSorter
with the passed SortingKey
.
DataSource
s.
WarshallTransitiveClosureGenerator implements the Warshall algorithm to calculate the transitive closure.
WarshallTransitiveClosureGenerator.AdjacencyList is the adjacency list representation of the added pairs.
WarshallTransitiveClosureGenerator.AdjacencyList
WarshallTransitiveClosureGenerator.AdjacencyMatrix is the matrix representation of the added pairs.
WarshallTransitiveClosureGenerator.AdjacencyMatrix
WarshallTransitiveClosureGenerator.GraphRepresentation is an interface that should be implemented by all classes representing a graph of duplicates.
WarshallTransitiveClosureGenerator.IntPair is used to create a pair of integer values.
WarshallTransitiveClosureGenerator.TransitiveClosureIterator is used to iterate over all pairs collected or generated by the WarshallTransitiveClosureGenerator.
WarshallTransitiveClosureGenerator.TransitiveClosureIterator
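The Warshall algorithm named in the WarshallTransitiveClosureGenerator entries can be sketched on a boolean adjacency matrix as follows. This is a generic illustration, not the DuDe class: the class name is hypothetical and records are assumed to be numbered 0 to n - 1. If record 0 is a duplicate of record 1, and record 1 a duplicate of record 2, the closure also marks 0 and 2 as duplicates.

```java
// Hypothetical helper class; not part of the DuDe API.
final class WarshallSketch {

    // Returns a new matrix in which reach[i][j] is true if j can be reached from i.
    static boolean[][] transitiveClosure(boolean[][] adjacency) {
        int n = adjacency.length;
        boolean[][] reach = new boolean[n][];
        for (int i = 0; i < n; i++) {
            reach[i] = adjacency[i].clone();
        }
        // Allow paths through intermediate node k, for every k in turn.
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                if (!reach[i][k]) {
                    continue;
                }
                for (int j = 0; j < n; j++) {
                    if (reach[k][j]) {
                        reach[i][j] = true;
                    }
                }
            }
        }
        return reach;
    }

    public static void main(String[] args) {
        boolean[][] duplicates = new boolean[4][4];
        duplicates[0][1] = duplicates[1][0] = true; // pair (0, 1)
        duplicates[1][2] = duplicates[2][1] = true; // pair (1, 2)
        boolean[][] closure = transitiveClosure(duplicates);
        System.out.println("0 and 2 are duplicates: " + closure[0][2]);
    }
}
```

The matrix form needs memory quadratic in the number of records, which is presumably why the index lists both an adjacency-list and an adjacency-matrix representation.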
GoldStandard.setFirstElementsSourceId(String)
and GoldStandard.setFirstElementsObjectIdAttributes(String...)
.
GoldStandard.setSecondElementsSourceId(String)
and GoldStandard.setSecondElementsObjectIdAttributes(String...)
.
DuDeObjectPair
onto a stream.
DuDeObject
pair line by line.
DuDeJsonGenerator
.
DuDeJsonGenerator
.
DuDeObjectPair
onto the stream, if it is flagged as a duplicate.
DuDeJsonGenerator
.
DuDeJsonGenerator
.
JsonArray
.
JsonBoolean
.
JsonNull
value to the stream.
JsonNumber
.
JsonRecord
.
JsonString
.
JsonValue
.
OutputStream
.
OutputStream
.
XMLSource represents *.xml files.
Jsonable
deserialization.
XMLSource
that converts all elements in the first XML layer into JsonRecord
s.
XMLSource
that converts all direct child elements of the given root into DuDeObject
s.
XMLSourceIterator is used for generating DuDeObjects out of XMLSources.
XMLSourceIterator
using the passed XMLSource
.
ZIPSimilarityFunction
compares two strings and treats them as ZIP codes, allowing for some special normalization and comparison techniques.