Learning and understanding multimedia content is a challenging task in information retrieval and multimedia analysis research. Deep Learning (DL), a relatively new area of machine learning (since 2006), has already had an impact on a wide range of multimedia information processing tasks. Recently, DL-based techniques have achieved substantial progress in fields including speech recognition, image classification and language processing. It has been shown that simulating human neural networks and learning features hierarchically (layer by layer) from large-scale data can significantly improve analytic results. In this project, we focus on developing multimedia retrieval approaches based on DL technologies.
Current research topics:
Multimedia analysis and computer vision based on Deep Learning and intelligent data synthesis
- End-to-end scene text detection and recognition in real-time using deep neural networks
- Neural visual translator: image/video captioning
- Human action recognition and event detection in surveillance video
- Multimodal data retrieval with deep neural networks
- Semantic text analysis using Word2Vec and ConvNets (e.g. sentence boundary detection in speech transcripts)
- Deep Learning in medical image processing (e.g. brain abnormality detection)
Research in deep learning algorithms
- Binary Neural Networks: enabling deep learning models on low-power devices
- Generative model learning
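
To illustrate why the Binary Neural Networks topic listed above is a good fit for low-power devices, here is a minimal sketch of the general idea, not this project's actual method. It assumes weights and activations are binarized to +1/-1 by their sign, so a dot product reduces to a bitwise XNOR followed by a bit count. The function names `binarize`, `pack_bits` and `binary_dot` are hypothetical and chosen for illustration only.

```python
def binarize(values):
    """Map each real value to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n using XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask    # XNOR: bit is 1 where the signs match
    matches = bin(agree).count("1")      # popcount of matching positions
    return 2 * matches - n               # matches minus mismatches


# Illustrative values only (assumed, not taken from any model).
weights = [0.7, -1.2, 0.1, -0.3]
activations = [1.5, 0.4, -2.0, -0.9]

w_signs = binarize(weights)
a_signs = binarize(activations)
result = binary_dot(pack_bits(w_signs), pack_bits(a_signs), len(weights))

# Reference computation with ordinary multiply-accumulate on the sign vectors.
reference = sum(w * a for w, a in zip(w_signs, a_signs))
```

Here `result` and `reference` agree, but the packed version needs no multiplications, and each weight takes one bit of storage instead of a 32-bit float.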