Che, Xiaoyin; Yang, Haojin; Meinel, Christoph
Pacific-Rim Symposium on Image and Video Technology
Auckland, New Zealand
In this paper we propose a solution for detecting tables in slide images. Presentation slides are a type of document of growing importance, but the layout differences between slides and traditional documents make many existing table detection methods less effective on slides. The proposed solution works with both high-resolution slide images from digital files and low-resolution slide screenshots from videos. Taking OCR (Optical Character Recognition) as the initial step, a heuristic analysis of the page layout considers not only the table structure but also the textual content. The evaluation results show that the proposed solution achieves an accuracy of approximately 80%. It performs substantially better than the open-source academic solution Tesseract and also outperforms the commercial software ABBYY FineReader, which is regarded as one of the best table detection tools.