GovWILD: Integrating Open Government Data for Transparency
Christoph Böhm, Markus Freitag, Arvid Heise, Claudia Lehmann, Andrina Mascher, Felix Naumann, Vuk Ercegovac, Mauricio Hernandez, Peter Haase, and Michael Schmidt
accepted for the demo track at the International World Wide Web Conference 2012
Many government organizations publish a variety of data on the web to enable transparency, foster applications, and satisfy legal obligations. Data content, format, structure, and quality vary widely, even in cases where the data is published using the widespread linked data principles. Yet much value lies in this data and its integration. We demonstrate GovWILD, a web-based prototype that integrates and cleanses Open Government Data at large scale. Apart from the web-based interface at govwild.org, which presents a use case of the created dataset, we provide all integrated data as a download. This data can be used to answer questions about politicians, companies, and government funding.
Holistic and Scalable Ontology Alignment for Linked Open Data
Toni Grütze, Christoph Böhm, and Felix Naumann
accepted at the Linked Data on the Web workshop 2012
The Linked Open Data community continuously releases massive amounts of RDF data, intended to make it easy to create applications that incorporate data from different sources. Interoperability across different sources requires links at both the instance and the schema level, connecting entities on the one hand and relating concepts on the other. State-of-the-art entity- and ontology-alignment methods produce high-quality alignments for two nicely structured individual sources, where an identification of relevant and meaningful pairs of ontologies is a precondition. Thus, these methods cannot deal with heterogeneous data from many sources simultaneously, e.g., data from a linked open data web crawl.
To this end, we propose Holistic Concept Matching (HCM). HCM aligns thousands of concepts from hundreds of ontologies (from many sources) simultaneously, while maintaining scalability and leveraging the global view on the entire data cloud. We evaluated our approach against the OAEI ontology alignment benchmark as well as on the 2011 Billion Triple Challenge data and present high-precision results created in a scalable manner.