de.hpi.fgis.dude.output
Class CSVOutput

java.lang.Object
  extended by de.hpi.fgis.dude.output.CSVOutput
All Implemented Interfaces:
DuDeOutput, AutoJsonable, Jsonable

public class CSVOutput
extends Object
implements DuDeOutput, Jsonable

Writes the passed DuDeObjectPairs, their similarity values, and the selected optional values to a CSV file row by row.

Author:
Ziawasch Abedjan

Field Summary
static char DEFAULT_ESCAPE_CHARACTER
          The default escape character.
static char DEFAULT_QUOTE_CHARACTER
          The default quote character.
static char DEFAULT_SEPARATOR
          The default separator character.
protected  String[] defaultColumnNames
          The default header.
 
Constructor Summary
protected CSVOutput()
          Internal constructor for Jsonable deserialization.
  CSVOutput(File file)
          Initializes a new CSVOutput.
  CSVOutput(OutputStream stream)
          Initializes a new CSVOutput with the passed OutputStream.
  CSVOutput(Writer writer)
          Initializes a new CSVOutput.
 
Method Summary
 void close()
          Closes the stream.
 void disablePrintingCompleteIdentifier()
          Disables printing the complete identifier, so the source id is not printed.
 void enablePrintingCompleteIdentifier()
          Enables printing the complete identifier.
 void fromJson(DuDeJsonParser<?> jsonParser)
          Initializes the current instance using the passed DuDeJsonParser.
protected  String[] getDataLine(DuDeObjectPair pair)
          Generates the data that shall be printed.
 char getEscapeCharacter()
          Returns the escape character.
protected  String[] getHeader()
          Returns the header.
 char getQuoteCharacter()
          Returns the quote character.
 char getSeparator()
          Returns the separator character.
protected static String getString(Object obj)
          Returns the String representation of the object or null, if null was passed.
 boolean headerIsEnabled()
          Checks whether the header shall be written.
 boolean printingCompleteIdentifierEnabled()
          Checks whether printing the complete identifier is enabled.
 boolean printingDataEnabled()
          Checks whether printing the data is enabled.
 void resetOptionalColumns()
          Resets the values of all optional columns to empty Strings.
 void setEscapeCharacter(char escapeCharacter)
          Sets the escape character.
 boolean setOptionalColumn(String identifier)
          Sets a new optional column with no value.
 boolean setOptionalColumn(String identifier, String value)
          Sets a new optional column with the passed value.
 void setQuoteCharacter(char quoteCharacter)
          Sets the quote character.
 void setSeparator(char sep)
          Sets the separator character.
 void toJson(DuDeJsonGenerator jsonGenerator)
          Generates the Json code using the passed DuDeJsonGenerator.
 DuDeOutput withData()
          Enables printing the data.
 CSVOutput withHeader()
          Enables writing the header before the first pair is written.
 DuDeOutput withoutData()
          Disables printing the data.
 CSVOutput withoutHeader()
          Disables writing the header before the first pair is written.
 void write(DuDeObjectPair pair)
          Writes the ids of the DuDeObjects, their similarity value, and the specified optional values into the file.
 void writeDuplicatesOnly(DuDeObjectPair pair)
          Writes the passed DuDeObjectPair onto the stream, if it is flagged as a duplicate.
protected  void writeHeader()
          Writes the header into the output.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_SEPARATOR

public static final char DEFAULT_SEPARATOR
The default separator character.

See Also:
Constant Field Values

DEFAULT_QUOTE_CHARACTER

public static final char DEFAULT_QUOTE_CHARACTER
The default quote character.

See Also:
Constant Field Values

DEFAULT_ESCAPE_CHARACTER

public static final char DEFAULT_ESCAPE_CHARACTER
The default escape character.

See Also:
Constant Field Values

defaultColumnNames

protected final String[] defaultColumnNames
The default header.

Constructor Detail

CSVOutput

public CSVOutput(File file)
          throws IOException
Initializes a new CSVOutput.

Parameters:
file - The file that is used for this output.
Throws:
IOException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason
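The conditions listed under Throws read the same as those documented for the JDK's java.io.FileWriter(File) constructor. Whether CSVOutput opens a FileWriter internally is an assumption. The sketch below only shows how those conditions surface with the JDK class; FileTargetSketch and canOpenForWriting are hypothetical names that are not part of DuDe.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

class FileTargetSketch {
    // Returns true if a FileWriter can be opened on the target.
    // Note: a successful open creates the file or truncates an existing one.
    static boolean canOpenForWriting(File target) {
        try (FileWriter writer = new FileWriter(target)) {
            return true;
        } catch (IOException e) {
            // Raised, for example, when the target is a directory.
            return false;
        }
    }

    public static void main(String[] args) {
        File directory = new File(System.getProperty("java.io.tmpdir"));
        File regular = new File(directory, "csv-output-sketch.csv");
        System.out.println(canOpenForWriting(directory));
        System.out.println(canOpenForWriting(regular));
        regular.delete(); // clean up the file created by the check
    }
}
```

Checking the target before constructing the output lets a caller report a bad path early instead of handling the exception at construction time.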

CSVOutput

public CSVOutput(OutputStream stream)
Initializes a new CSVOutput with the passed OutputStream.

Parameters:
stream - The stream that is used for printing the result.

CSVOutput

public CSVOutput(Writer writer)
Initializes a new CSVOutput.

Parameters:
writer - The writer that is used for this output.

CSVOutput

protected CSVOutput()
Internal constructor for Jsonable deserialization.

Method Detail

getString

protected static String getString(Object obj)
Returns the String representation of the object or null, if null was passed.

Parameters:
obj - The object whose String representation shall be returned.
Returns:
The String representation of the passed object or null, if null was passed.
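A minimal sketch of the null-preserving conversion described above. The class name NullSafeStrings is hypothetical, and the use of toString() is an assumption; the actual CSVOutput implementation may differ.

```java
class NullSafeStrings {
    // Hypothetical stand-in for the protected getString(Object) helper:
    // null stays null instead of becoming the text "null".
    static String getString(Object obj) {
        return obj == null ? null : obj.toString();
    }

    public static void main(String[] args) {
        System.out.println(getString(42));
        System.out.println(getString(null) == null);
        // For contrast: String.valueOf turns a null reference into the text "null".
        System.out.println(String.valueOf((Object) null));
    }
}
```

Keeping null distinct from the text "null" leaves it to the CSV writer to decide how a missing value is rendered.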

close

public void close()
           throws IOException
Description copied from interface: DuDeOutput
Closes the stream.

Specified by:
close in interface DuDeOutput
Throws:
IOException - If an error occurs while closing the stream.

disablePrintingCompleteIdentifier

public void disablePrintingCompleteIdentifier()
Disables printing the complete identifier, so the source id is not printed.


enablePrintingCompleteIdentifier

public void enablePrintingCompleteIdentifier()
Enables printing the complete identifier.


fromJson

public void fromJson(DuDeJsonParser<?> jsonParser)
              throws org.codehaus.jackson.JsonParseException,
                     IOException
Description copied from interface: Jsonable
Initializes the current instance using the passed DuDeJsonParser.

Specified by:
fromJson in interface Jsonable
Parameters:
jsonParser - The parser that is used for extracting the data out of the Json.
Throws:
org.codehaus.jackson.JsonParseException - If an error occurs while parsing the Json.
IOException - If an error occurs while reading from the stream.

getDataLine

protected String[] getDataLine(DuDeObjectPair pair)
Generates the data that shall be printed.

Parameters:
pair - The pair whose information shall be printed.
Returns:
The data that shall be written into the output.

getEscapeCharacter

public char getEscapeCharacter()
Returns the escape character.

Returns:
The escape character.

getHeader

protected String[] getHeader()
Returns the header. The header contains all default column names and any column extensions.

Returns:
The header of this output.

getQuoteCharacter

public char getQuoteCharacter()
Returns the quote character.

Returns:
The quote character.

getSeparator

public char getSeparator()
Returns the separator character.

Returns:
The separator character.

headerIsEnabled

public boolean headerIsEnabled()
Checks whether the header shall be written.

Returns:
true, if the header will be written before the first pair is printed; otherwise false.

printingCompleteIdentifierEnabled

public boolean printingCompleteIdentifierEnabled()
Checks whether printing the complete identifier is enabled.

Returns:
true, if it is enabled; otherwise false.

printingDataEnabled

public boolean printingDataEnabled()
Checks whether printing the data is enabled.

Returns:
true, if it is enabled; otherwise false.

resetOptionalColumns

public void resetOptionalColumns()
Resets the values of all optional columns to empty Strings.


setEscapeCharacter

public void setEscapeCharacter(char escapeCharacter)
Sets the escape character.

Parameters:
escapeCharacter - The new escape character.

setOptionalColumn

public boolean setOptionalColumn(String identifier)
Sets a new optional column with no value.

Parameters:
identifier - The column's identifier.
Returns:
true, if a new column was added; otherwise false.

setOptionalColumn

public boolean setOptionalColumn(String identifier,
                                 String value)
Sets a new optional column with the passed value.

Parameters:
identifier - The column's identifier.
value - The column's value.
Returns:
true, if a new column was added; otherwise false.
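The return-value contract of setOptionalColumn and the effect of resetOptionalColumns can be sketched with an insertion-ordered map. This is an illustration under assumptions, not the DuDe implementation: OptionalColumnsSketch and valueOf are hypothetical names, "no value" is modeled as an empty String, and overwriting the value of an existing column is assumed behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OptionalColumnsSketch {
    // Insertion-ordered, so columns keep the order in which they were added.
    private final Map<String, String> optionalColumns = new LinkedHashMap<>();

    // Adds the column or overwrites its value; returns true only for a new column.
    boolean setOptionalColumn(String identifier, String value) {
        boolean isNew = !optionalColumns.containsKey(identifier);
        optionalColumns.put(identifier, value);
        return isNew;
    }

    // "No value" is modeled as an empty String (assumption).
    boolean setOptionalColumn(String identifier) {
        return setOptionalColumn(identifier, "");
    }

    // Keeps every column but blanks its value.
    void resetOptionalColumns() {
        optionalColumns.replaceAll((identifier, value) -> "");
    }

    // Hypothetical accessor, used here only to inspect the state.
    String valueOf(String identifier) {
        return optionalColumns.get(identifier);
    }

    public static void main(String[] args) {
        OptionalColumnsSketch columns = new OptionalColumnsSketch();
        System.out.println(columns.setOptionalColumn("run", "1")); // new column
        System.out.println(columns.setOptionalColumn("run", "2")); // existing column
        columns.resetOptionalColumns();
        System.out.println(columns.valueOf("run").isEmpty());
    }
}
```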

setQuoteCharacter

public void setQuoteCharacter(char quoteCharacter)
Sets the quote character.

Parameters:
quoteCharacter - The new quote character.

setSeparator

public void setSeparator(char sep)
Sets the separator character.

Parameters:
sep - The new separator character.
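To illustrate how the separator, quote, and escape characters can interact when a row is written, here is a small self-contained sketch. The quoting policy (every field is quoted, quote and escape characters inside a field are prefixed with the escape character, null becomes an empty field) is an assumption, and CsvRowSketch and formatRow are hypothetical names. The page does not state which rules CSVOutput applies.

```java
class CsvRowSketch {
    // Formats one row: each field is wrapped in the quote character, and any
    // quote or escape character inside a field is prefixed with the escape character.
    static String formatRow(String[] fields, char separator, char quote, char escape) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(separator);
            }
            row.append(quote);
            String field = fields[i] == null ? "" : fields[i]; // null becomes an empty field
            for (char c : field.toCharArray()) {
                if (c == quote || c == escape) {
                    row.append(escape);
                }
                row.append(c);
            }
            row.append(quote);
        }
        return row.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"a;b", "say \"hi\"", null};
        // prints "a;b";"say \"hi\"";""
        System.out.println(formatRow(fields, ';', '"', '\\'));
    }
}
```

Because every field is quoted, a separator inside a field (as in the first field) needs no special handling.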

toJson

public void toJson(DuDeJsonGenerator jsonGenerator)
            throws org.codehaus.jackson.JsonGenerationException,
                   IOException
Description copied from interface: Jsonable
Generates the Json code using the passed DuDeJsonGenerator.

Specified by:
toJson in interface Jsonable
Parameters:
jsonGenerator - The DuDeJsonGenerator that is used internally.
Throws:
org.codehaus.jackson.JsonGenerationException - If an error occurs while generating the Json syntax.
IOException - If an error occurs while writing to the output.

withData

public DuDeOutput withData()
Description copied from interface: DuDeOutput
Enables printing the data. This means that the output does not consist only of the object ids but also contains the corresponding data.

Specified by:
withData in interface DuDeOutput
Returns:
The current instance.

withHeader

public CSVOutput withHeader()
Enables writing the header before the first pair is written.

Returns:
The current instance.

withoutData

public DuDeOutput withoutData()
Description copied from interface: DuDeOutput
Disables printing the data. This means that each printed duplicate pair is specified only by its source and object ids.

Specified by:
withoutData in interface DuDeOutput
Returns:
The current instance.

withoutHeader

public CSVOutput withoutHeader()
Disables writing the header before the first pair is written.

Returns:
The current instance.
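Because withHeader() and withoutHeader() return the current instance, they can be chained onto a constructor call. The sketch below shows that fluent pattern together with a header that is written once, before the first row. HeaderToggleSketch, its column names, and the disabled-by-default header are assumptions made for illustration; the page does not state the default of CSVOutput.

```java
class HeaderToggleSketch {
    private final StringBuilder out;
    private boolean headerEnabled = false; // assumed default
    private boolean headerWritten = false;

    HeaderToggleSketch(StringBuilder out) {
        this.out = out;
    }

    // Both toggles return this, which allows call chaining.
    HeaderToggleSketch withHeader() {
        headerEnabled = true;
        return this;
    }

    HeaderToggleSketch withoutHeader() {
        headerEnabled = false;
        return this;
    }

    boolean headerIsEnabled() {
        return headerEnabled;
    }

    // Writes the header once, before the first row, if it is enabled.
    void write(String row) {
        if (headerEnabled && !headerWritten) {
            out.append("id1;id2;similarity\n"); // hypothetical column names
            headerWritten = true;
        }
        out.append(row).append('\n');
    }

    public static void main(String[] args) {
        StringBuilder buffer = new StringBuilder();
        HeaderToggleSketch output = new HeaderToggleSketch(buffer).withHeader();
        output.write("a;b;0.9");
        output.write("c;d;0.4");
        System.out.print(buffer); // header line followed by two data rows
    }
}
```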

write

public void write(DuDeObjectPair pair)
           throws IOException
Writes the ids of the DuDeObjects, their similarity value, and the specified optional values into the file.

Specified by:
write in interface DuDeOutput
Parameters:
pair - A pair of two DuDeObjects that are written into an OutputStream.
Throws:
IOException - If an error occurs while writing onto the stream.

writeDuplicatesOnly

public void writeDuplicatesOnly(DuDeObjectPair pair)
                         throws IOException
Description copied from interface: DuDeOutput
Writes the passed DuDeObjectPair onto the stream, if it is flagged as a duplicate.

Specified by:
writeDuplicatesOnly in interface DuDeOutput
Parameters:
pair - The pair that shall be written to the stream.
Throws:
IOException - If an error occurs while writing onto the stream.
See Also:
DuDeObjectPair.getDuplicateInfo()
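The difference between write and writeDuplicatesOnly can be sketched as a simple guard in front of the regular write. The Pair class below is a hypothetical stand-in for DuDeObjectPair that carries only a boolean duplicate flag, and the row layout is made up for the example. How the real class derives that flag from getDuplicateInfo() is not described on this page.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicatesOnlySketch {
    // Hypothetical stand-in for DuDeObjectPair; holds only what the sketch needs.
    static final class Pair {
        final String firstId;
        final String secondId;
        final double similarity;
        final boolean duplicate;

        Pair(String firstId, String secondId, double similarity, boolean duplicate) {
            this.firstId = firstId;
            this.secondId = secondId;
            this.similarity = similarity;
            this.duplicate = duplicate;
        }
    }

    private final List<String> rows = new ArrayList<>();

    // Writes every pair that is passed in.
    void write(Pair pair) {
        rows.add(pair.firstId + ";" + pair.secondId + ";" + pair.similarity);
    }

    // Writes the pair only if it is flagged as a duplicate.
    void writeDuplicatesOnly(Pair pair) {
        if (pair.duplicate) {
            write(pair);
        }
    }

    List<String> rows() {
        return rows;
    }

    public static void main(String[] args) {
        DuplicatesOnlySketch output = new DuplicatesOnlySketch();
        output.writeDuplicatesOnly(new Pair("a", "b", 0.9, true));
        output.writeDuplicatesOnly(new Pair("c", "d", 0.2, false)); // skipped
        System.out.println(output.rows());
    }
}
```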

writeHeader

protected void writeHeader()
                    throws IOException
Writes the header into the output.

Throws:
IOException - If an error occurs during the write process.


Copyright © 2011 Hasso Plattner Institute - Chair of Information Systems. All Rights Reserved.