Hasso-Plattner-InstitutSDG am HPI

Table Recognition (Summer Semester 2021)

Lecturers: Prof. Dr. Felix Naumann (Information Systems), Gerardo Vitagliano (Information Systems)
Course Website: https://hpi.de/en/naumann/teaching/current-courses/ss-21/table-recognition.html

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 18.03.2021 - 09.04.2021
  • Teaching Form: Seminar
  • Enrolment Type: Compulsory Elective Module
  • Course Language: English
  • Maximum number of participants: 6

Programs & Modules

IT-Systems Engineering MA
Data Engineering MA
  • PREP-Concepts and Methods
  • PREP-Techniques and Tools
  • PREP-Specialization

Description

Structured files such as spreadsheets are valuable sources of data, but they are often ill-suited for machine consumption. Although spreadsheets organize cells in a grid, the data they contain is often laid out freely, with no clearly defined tabular structure. Worse, tables may be spread across several independent regions that end users interested in their content must ultimately recognize and merge. For automated data preparation, extraction, or integration, there is great value in recognizing the presence and layout of regions, especially tables, within a spreadsheet.

Table recognition is a well-known problem that researchers have tackled in various domains and under different assumptions. In this seminar, we will introduce you to the research area of table recognition in spreadsheet files. Each team, ideally consisting of two students, will explore, implement, and potentially improve on a different solution for detecting and extracting tables from spreadsheet files.

We will provide you with state-of-the-art papers that propose solutions to this problem. You will implement these solutions and try to improve on them with your own ideas in a scalable way. We will provide thousands of files for testing and evaluation.

Please see the course website for more details.
