Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

05.07.2011

Wikipedia extraction data published

With iPopulator, we have introduced a system that automatically populates the infoboxes of Wikipedia articles by extracting attribute values from the article text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values to extract the individual parts of a value independently.

We ran iPopulator on the complete Wikipedia dump and extracted many new infobox attribute values: 259,892 new facts in total, with an extraction precision above 80%. We provide the extracted data in different formats on the iPopulator project website.