Hasso-Plattner-Institut
  
 

Matthias Bauer

Hasso-Plattner-Institut (HPI) für
Softwaresystemtechnik GmbH
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam
Germany 

office:

H-1.21

phone:

+49 (0)331-5509-385

fax:

+49 (0)331-5509-325

e-mail:

matthias.bauer(at)hpi.de

Find me on LinkedIn and on XING

Teaching - openHPI

Teaching - IT Systems Engineering

Winter Semester 2017

Summer Semester 2016

Winter Semester 2015

  • Seminar: Web Programming and Web Frameworks

Summer Semester 2015

  • Lecture Assistant: Internet and WWW Technologies
  • Tutorial: Internet and WWW Technologies
  • Seminar: Advanced Topics in Internet and WWW Technologies
  • Schülerkolleg: Internet Search and Google PageRank

Winter Semester 2014/2015

  • Seminar: Web Programming and Web Frameworks

Summer Semester 2014

  • Lecture Assistant: Internet and WWW Technologies
  • Tutorial: Internet and WWW Technologies
  • Seminar: Advanced Topics in Internet and WWW Technologies

Winter Semester 2013/2014

Summer Semester 2013

  • Lecture Assistant: Internet and WWW Technologies
  • Tutorial: Internet and WWW Technologies
  • Seminar: Advanced Topics in Internet and WWW Technologies

Winter Semester 2012/2013

  • Seminar: Web Programming and Web Frameworks

Summer Semester 2012

Winter Semester 2011/2012

Summer Semester 2011

  • Tutorial: Internet and WWW Technologies
  • Seminar: Advanced Topics in Internet and WWW Technologies

Teaching - Kids

Summer Semester 2015

  • Schülerkolleg: Internet Search and Google PageRank

Publications

Enhance Lecture Archive Search with OCR Slide Detection and In-Memory Database Technology

Martin Malchow, Matthias Bauer, Christoph Meinel
In 2015 IEEE 18th International Conference on Computational Science and Engineering (CSE), pages 176-183, October 2015. IEEE.

DOI: 10.1109/CSE.2015.19

Abstract:

On the Web there are a lot of frequently used video lecture archives which have grown up fast during the last couple of years. This fact led to a lot of lecture recordings which include knowledge for a variety of subjects. The typical way of searching these videos is by title and description. Unfortunately, not all important keywords and facts are mentioned in the title or description if they are available. Furthermore, there is no possibility to analyze how important those detected keywords are for the whole video. Another lecture archive specific virtue is that every regular university lecture is repeated yearly. Normally this will lead to duplicate lecture recordings. In search results doubling is disturbing for students when they want to watch the most recent lectures from the search result. This paper deals with the idea to resolve these problems by analyzing the recorded lecture slides with Optical Character Recognition (OCR). In addition to the name and description the OCR data will be used for a full text analysis to create an index for the lecture archive search. Furthermore, a fuzzy search is introduced. This will solve the issue of misspelled search requests and OCR detection defects. Additionally, this paper deals with the performance issues of a full text search with an in-memory database, issues in OCR detection, handling duplicate recordings of lectures repeated every year. Finally, an evaluation of the search performance in comparison with other database ideas besides the in-memory database is performed. Additionally, a user acceptability survey for the search results to increase the learning experience on lecture archives was performed. As a result, this paper shows how to handle the big amount of OCR data for a full text live search performed on an in-memory database in reasonable time. During this search a fuzzy search is performed additionally to resolve spelling mistakes and OCR detection problems. 
In conclusion this paper shows a solution for an enhanced video lecture archive search that supports students in online research processes and enhances their learning experience.

Keywords:

Teleteaching;Tele-Lecturing;Distance Learning;E-Learning;OCR Search;Fuzzy Search;In-Memory Database

BibTeX file

@inproceedings{Martin2015a,
author = { Martin Malchow and Matthias Bauer and Christoph Meinel },
title = { Enhance Lecture Archive Search with OCR Slide Detection and In-Memory Database Technology },
year = { 2015 },
pages = { 176--183 },
month = { 10 },
abstract = { On the Web there are a lot of frequently used video lecture archives which have grown up fast during the last couple of years. This fact led to a lot of lecture recordings which include knowledge for a variety of subjects. The typical way of searching these videos is by title and description. Unfortunately, not all important keywords and facts are mentioned in the title or description if they are available. Furthermore, there is no possibility to analyze how important those detected keywords are for the whole video. Another lecture archive specific virtue is that every regular university lecture is repeated yearly. Normally this will lead to duplicate lecture recordings. In search results doubling is disturbing for students when they want to watch the most recent lectures from the search result. This paper deals with the idea to resolve these problems by analyzing the recorded lecture slides with Optical Character Recognition (OCR). In addition to the name and description the OCR data will be used for a full text analysis to create an index for the lecture archive search. Furthermore, a fuzzy search is introduced. This will solve the issue of misspelled search requests and OCR detection defects. Additionally, this paper deals with the performance issues of a full text search with an in-memory database, issues in OCR detection, handling duplicate recordings of lectures repeated every year. Finally, an evaluation of the search performance in comparison with other database ideas besides the in-memory database is performed. Additionally, a user acceptability survey for the search results to increase the learning experience on lecture archives was performed. As a result, this paper shows how to handle the big amount of OCR data for a full text live search performed on an in-memory database in reasonable time. During this search a fuzzy search is performed additionally to resolve spelling mistakes and OCR detection problems. 
In conclusion this paper shows a solution for an enhanced video lecture archive search that supports students in online research processes and enhances their learning experience. },
keywords = { Teleteaching;Tele-Lecturing;Distance Learning;E-Learning;OCR Search;Fuzzy Search;In-Memory Database },
publisher = { IEEE },
booktitle = { 2015 IEEE 18th International Conference on Computational Science and Engineering (CSE) },
isbn = { 978-1-4673-8297-7 },
priority = { 0 }
}

last change: Mon, 26 Oct 2015 10:22:57 +0100

Press

So wird das Haus schlau, Saarbrücker Zeitung, 03.06.2016

Events and Activities