Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

DBLP-Scholar dataset

Contains bibliographic entries from DBLP and Google Scholar. The data was obtained from here and has been processed into the following:

  • Dataset
    • We merged the two relations into a single file for use in deduplication. As simple data preparation, all text was lower-cased and special characters were removed. Available in tab-separated values (TSV) format (66,879 objects).
  • Duplicates
  • Non-duplicates
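
The data preparation described above (lower-casing and removing special characters) could be sketched roughly as follows. This is only an illustration, not the script used to build the dataset: the function names `prepare` and `prepare_rows` are hypothetical, and the definition of "special character" (anything other than ASCII letters, digits, and whitespace) is an assumption, since the page does not specify it.

```python
import csv
import io
import re

# Assumption: a "special character" is anything that is not an ASCII
# letter, digit, or whitespace. The original preparation may differ,
# for example in how it treats accented letters.
_SPECIAL = re.compile(r"[^a-z0-9\s]")
_WHITESPACE = re.compile(r"\s+")


def prepare(value: str) -> str:
    """Lower-case a field and replace special characters with spaces."""
    lowered = value.lower()
    cleaned = _SPECIAL.sub(" ", lowered)
    # Collapse the runs of whitespace left behind by the replacement.
    return _WHITESPACE.sub(" ", cleaned).strip()


def prepare_rows(tsv_text: str) -> list[list[str]]:
    """Apply prepare() to every cell of tab-separated text."""
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [[prepare(cell) for cell in row] for row in reader]
```

For example, `prepare("Object-Oriented Workflow!")` yields `"object oriented workflow"` under these assumptions. Replacing special characters with a space rather than deleting them keeps hyphenated words as separate tokens, which is one possible design choice among several.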