Yang, Haojin; Wang, Cheng; Che, Xiaoyin; Luo, Sheng; Meinel, Christoph
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval (ICMR 2015)
In this paper we showcase a system for real-time text detection and recognition. We apply deep features produced by Convolutional Neural Networks (CNNs) to both the text detection and the word recognition tasks. For text detection we follow the common localization-verification scheme, which has already shown its effectiveness in numerous previous works. In the text localization stage, textual regions are roughly detected using a Maximally Stable Extremal Regions (MSER) detector with a high recall rate. False alarms are then eliminated by a CNN classifier, and the remaining text regions are grouped into words. In the word recognition stage, we developed a skeleton-based text binarization method for segmenting text from its background. A CNN-based recognizer is then applied to recognize the characters. Initial experiments show the strong ability of deep features for text classification compared with commonly used visual features. Our current implementation demonstrates real-time scene text recognition on a standard PC with a webcam.
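The localization-verification scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Box` type, the `toy_score` function (a stand-in for the CNN classifier), the 0.5 threshold, and the gap-based grouping rule are all hypothetical choices made for illustration. Candidate boxes are assumed to come from a high-recall region detector such as MSER.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned candidate region (hypothetical type, in pixels)."""
    x: int
    y: int
    w: int
    h: int


def toy_score(box: Box) -> float:
    """Stand-in for a CNN text/non-text classifier (assumption).

    Scores tall-or-square regions as text and wide flat ones as noise.
    """
    return 1.0 if box.h >= box.w else 0.0


def verify(candidates: List[Box],
           score_fn: Callable[[Box], float],
           threshold: float = 0.5) -> List[Box]:
    """Verification step: keep candidates the classifier accepts."""
    return [b for b in candidates if score_fn(b) >= threshold]


def group_into_words(boxes: List[Box],
                     max_gap_ratio: float = 0.5) -> List[List[Box]]:
    """Greedy left-to-right grouping of verified regions into words.

    A box joins an existing word if it lies on roughly the same line as
    the word's last box and the horizontal gap is small relative to the
    box height. This rule is an illustrative assumption.
    """
    words: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x):
        for word in words:
            last = word[-1]
            height = max(box.h, last.h)
            gap = box.x - (last.x + last.w)
            center_diff = abs((box.y + box.h / 2) - (last.y + last.h / 2))
            if center_diff <= 0.5 * height and gap <= max_gap_ratio * height:
                word.append(box)
                break
        else:
            # No existing word accepted this box, so it starts a new one.
            words.append([box])
    return words
```

In a real system, `toy_score` would be replaced by a trained classifier applied to the image patch under each box, and each resulting word group would then be passed on to binarization and character recognition.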