Machine Learning and Multimedia Data Analysis

Introduction

Artificial intelligence (AI) is the intelligence exhibited by computers. The term is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving". Researchers in AI and machine learning currently work on training computers to mimic human skills such as "reading", "listening", "writing" and "making inferences". Some AI applications, such as optical character recognition (OCR) and automatic speech recognition (ASR), have recently become routine technologies in industry. Since 2006, "Deep Learning" (DL) has attracted more and more attention in both academia and industry. Deep learning with deep neural networks is a branch of machine learning based on a set of algorithms that attempt to learn representations of data and to model their high-level abstractions. In a deep neural network, there are multiple so-called "neural layers" between the input and the output. The algorithm uses those layers, which are composed of multiple linear and non-linear transformations, to learn higher-level abstractions. Recently, DL has delivered record-breaking results in many novel areas, e.g., beating human players in strategic games like Go (Google’s AlphaGo), self-driving cars, and dermatologist-level classification of skin cancer.

Multimedia data is one of the most suitable subjects for deep learning research because of its multiple modalities: it consists of visual, textual and auditory content. This enables DL algorithms to learn fused representations from hybrid sources that express a common semantic meaning. Moreover, DL is a data-driven technology, which makes it highly suitable for processing massive amounts of multimedia data.

Table of Contents

  • Research Topics and Projects
  • Research Team
  • Current News
  • Master Theses
  • Teaching
  • Recent Publications

Our Research Topics and Projects

Current topics

  • Efficient Deep Learning
    • Deep model compression and acceleration
    • Binary Neural Networks
    • Dynamic Networks
  • Multimodal intelligence
  • Weakly supervised learning, dataset synthesis
  • Domain generalization, novel class discovery

KI-Leuchttürme project for environment, climate, nature and resources

EKAPEx: New efficient AI algorithms for innovative forecasting methods for extreme weather events (2023-2025)

HPI will act as the coordinator for the project and collaborate with machine learning experts from the Technical University of Munich (TUM) and atmospheric and meteorological experts from the GeoForschungsZentrum Potsdam (GFZ). The aim of the project is to develop AI-based precipitation forecasting for Germany, with a special emphasis on extreme weather events. To accomplish this, the team will develop the most efficient and powerful AI algorithms possible, while also significantly reducing resource consumption. Unique datasets, such as Integrated Water Vapor and Slant Integrated Water Vapor obtained from GNSS observations, will also be utilized to enhance forecasting capabilities. The project aims to develop an accessible platform that will contribute significantly to the improvement of climate adaptation measures and the sustainable use of AI.

The team will employ efficient designs of AI algorithms, such as few-shot learning, zero-shot learning, and open-set recognition methods, in order to decrease dependence on large amounts of data and manual annotation for weather forecasting. Additionally, neural networks will be applied that operate at a lower bit width: the parameters and intermediate results of the network are converted from the 32-bit values of previous models to binary values with only one bit, while minimizing accuracy loss. The project will also specifically address the power consumption of AI methods as a source of greenhouse gases.
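The conversion from 32-bit values to one-bit values can be sketched as follows. This is a minimal illustration and not the project's actual method: it assumes sign-based binarization with one scale factor per tensor (the mean absolute value), and all function names are hypothetical.

```python
# Illustrative sketch only: sign-based binarization with a per-tensor scale.
# The function names and the scaling rule are assumptions for this example.

def binarize(values):
    """Map float values to signs in {-1, +1} and keep one float scale factor."""
    scale = sum(abs(v) for v in values) / len(values)  # mean absolute value
    signs = [1 if v >= 0 else -1 for v in values]
    return signs, scale

def pack_bits(signs):
    """Store each sign in a single bit of a Python int (+1 -> 1, -1 -> 0)."""
    packed = 0
    for i, s in enumerate(signs):
        if s > 0:
            packed |= 1 << i
    return packed

def binary_dot(packed_a, packed_b, n):
    """Dot product of two {-1, +1} vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    matches = bin(~(packed_a ^ packed_b) & mask).count("1")
    return 2 * matches - n  # matching positions add +1, the others add -1

weights = [0.5, -1.5, 0.25, -0.75]
inputs = [1.0, -2.0, 3.0, -0.5]

w_signs, w_scale = binarize(weights)
x_signs, x_scale = binarize(inputs)

# Approximate the full-precision dot product using only bit operations
# plus one multiplication by the two scale factors.
approx = w_scale * x_scale * binary_dot(
    pack_bits(w_signs), pack_bits(x_signs), len(weights)
)
exact = sum(w * x for w, x in zip(weights, inputs))
print(approx, exact)
```

The gap between `approx` and `exact` is the accuracy loss that binary network designs try to minimize; in exchange, each value needs one bit instead of 32, and multiplications are replaced by XNOR and bit counting.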

Deep Learning for Enterprise NLP Applications

Project partner: SAP Conversational AI team (2017-2020)

In this project we will develop a framework for building generally applicable as well as domain-specific NLP models using state-of-the-art deep learning technology. We will study textual representation learning with the aim of finding the most efficient solution for deep neural network design and system implementation. An evaluation protocol will be defined and developed for qualitative and quantitative evaluation.

Project partner: SAP ICN Machine Learning team (2020-2022)

The recently emerged large-scale pre-trained language models based on the Transformer model, such as GPT-3 (175 billion parameters) and Switch Transformer (1600 billion parameters), have brought about a series of breakthroughs in many Natural Language Processing (NLP) tasks. However, the training of these large-scale models is computationally expensive. Moreover, these models generally have billions of parameters, making it challenging to conduct inference on resource-limited devices. In this project, we will dive into how such large scale models work, study different approaches to decrease their space and time complexity during training and inference, and evaluate them on different Natural Language Understanding (NLU) and Natural Language Generation (NLG) benchmarks.
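To give a rough sense of why such parameter counts are a problem on resource-limited devices, the sketch below estimates the memory needed just to store the weights. The parameter counts are those quoted above; the bit widths (32, 16, 8, 1) are assumptions chosen for illustration, and the estimate ignores activations, optimizer state and any storage overhead.

```python
# Back-of-the-envelope estimate of weight storage only.
# The helper name and the chosen bit widths are assumptions for illustration.

def weight_storage_gb(num_parameters, bits_per_parameter):
    """Return the storage for the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9

models = {
    "GPT-3": 175 * 10**9,                # 175 billion parameters
    "Switch Transformer": 1600 * 10**9,  # 1600 billion parameters
}

for name, params in models.items():
    for bits in (32, 16, 8, 1):
        size = weight_storage_gb(params, bits)
        print(f"{name}: {bits:>2}-bit weights need about {size:,.1f} GB")
```

Under these assumptions, storing GPT-3 in 32-bit precision takes about 700 GB for the weights alone, which is why reducing the number of bits per parameter is one of the approaches studied.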

Binary Neural Networks, Deep Model Compression and Acceleration

Project partner: PyTorch, NICSEFC, MXNet

In recent years, deep learning technologies have achieved excellent performance and many breakthroughs in both academia and industry. However, state-of-the-art deep models are computationally expensive and consume large amounts of storage space. At the same time, deep learning is in strong demand for numerous applications in areas such as mobile platforms, wearable devices, autonomous robots and IoT devices. How to efficiently apply deep models on such low-power devices has become a challenging research problem. In this project we explore several approaches to this problem, such as binarized, quantized and lightweight deep neural networks. The development is based on the well-known open source deep learning libraries PyTorch and Apache MXNet. As an in-progress research result, we have developed two open source frameworks.

Image Analysis in a Large Scale Art Historical Database (2019-2023)

Project partner: Wildenstein Plattner Institute

With increasing digitization and storage capacities, it is becoming more and more viable to undertake massive digitization projects for analogue archives. Digitization allows easy access to, and long-term preservation of, old and sensitive physical material to which access is typically denied. Furthermore, digitization allows the material to be processed more efficiently. In this project, we aim to develop and apply novel automatic processing methods for the digitized archive of the WPI. Since archival material, especially in the art history domain, contains many images and much handwriting, we concentrate on analysing and extracting handwritten information. The challenges to be addressed in this project are the scalability and quality of different approaches to handwriting recognition. The digitization project that the WPI is undertaking covers a document corpus of many millions of pages in different fonts, languages and physical conditions.

Besides handwriting as one important type of semantic information in an archive, a digitized archive also contains many scans of documents that contain images. These images may be photographs, reproductions of works of art, or even sketches. A digitization pipeline would greatly benefit from additional analysis steps that extract metadata from such documents. In this line of work, further analysis steps, such as classification of documents by visual appearance, automatic creation of textual metadata (i.e., descriptions) for images, and recognition of objects depicted in images, shall be added to the resulting digitization pipeline. All of the developed approaches shall be integrated into the cataloguing software of the WPI so that its researchers can use them.

Intelligent Lecture Video Analysis and Retrieval

Project partner: tele-TASK and openHPI team

  • Video Lecture Browser: Lecture video content analysis, automatic video indexing, content-based video search, lecture speech recognition, lecture slides recognition.
  • Automatic E-Lecture Material Enhancement

Research Team

  • Supervisor: Prof. Dr. Christoph Meinel 
  • Group leader: PD Dr. habil. Haojin Yang
  • Joseph Bethge (PhD student, H-1.21)
  • Ting Hu (PhD student, H-1.22)
  • Ziyun Li (PhD student, H-1.22)
  • Nianhui Guo (PhD student, H-1.21)
  • Jona Otholt (PhD student, H-1.22)
  • Gregor Nickel (PhD student, H-1.21)
  • Zi Yang (co-supervised industry PhD student, Huawei Munich Research Center)
  • Weixing Wang (Master thesis/PhD student, H-1.21)
  • Hong Guo (PhD student, H-1.17)
  • Paul Mattes (Scientific coworker)
  • Christopher Aust (Scientific coworker)
  • Prashant Dangwal (Scientific coworker)
  • Maximilian Schulze (Scientific coworker)
  • Philipp Hildebrandt (Scientific coworker)

Former Team Members

  • Dimitri Korsch (former master student, now PhD student at Friedrich-Schiller-Universität Jena)
  • Hannes Rantzsch (former master student, now with nexenio GmbH)
  • Tom Herold (former master student, now with scalable minds)
  • Sheng Luo (former PhD Student, now with Nvidia Shanghai)
  • Dr. Cheng Wang (former PhD student, now with Amazon AI)
  • Martin Fritzsche (former master student)
  • Haofang Lu (former PhD student)
  • Dr. Xiaoyin Che (former PhD student and PostDoc researcher, now with Siemens Research China)
  • Larissa Hoffäller (Scientific coworker)
  • Julian Niedermeier (Scientific coworker)
  • Jonathan Sauder (Intern)
  • Dr. Mina Rezaei (former PhD student, now with LMU)
  • Dr. Goncalo Mordido (former PhD student, now with MILA-Quebec AI Institute)
  • Benedikt Schenkel (former scientific coworker)
  • Dr. Christian Bartz (former PhD student, now with German Aerospace Center)
  • Hendrik Rätz (former PhD student)
  • Axel Stebner (Scientific coworker)

 

 

Current News

  • New research project (01.01.2023): HPI acts as the coordinator for the new BMUV KI-Leuchttürme project (2023-2025) and collaborates with machine learning experts from the Technical University of Munich (TUM) and atmospheric and meteorological experts from the GeoForschungsZentrum Potsdam (GFZ).
  • Media report about our BNext paper (05.12.2022): synced AI review (Chinese)
  • Invited Keynote (16.06.2021): PD Dr. Haojin Yang is invited to give a keynote talk on Edge AI at IWDSC@ECCV 2022.
  • Paper accepted at ICPR 2022: Our paper "Synthesis in Style: Semantic Segmentation of Historical Documents using Synthetic Data" has been accepted at ICPR 2022.
  • Invited Tutorial talk (17.01.2022): PD Dr. Haojin Yang is invited to give a tutorial talk on Low-bit Neural Network Computing at the 27th Asia and South Pacific Design Automation Conference (ASP-DAC 2022).
  • New course on AI campus (12.01.2022): A new MOOC course "Applied Edge AI: Deep Learning Outside of the Cloud" created by PD Dr. Haojin Yang, Joseph Bethge and Dr. Christian Bartz is online.
  • Paper accepted at BMVC 2021: Our paper "One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN" has been accepted at the 32nd British Machine Vision Conference, 2021.
  • New cooperation (04.08.2021) is established! Obtained large-scale computing resource funding from Facebook AI, which will be used to accelerate the research and development of the BITorch project.
  • Research project extended (04.2021) We have successfully extended the two-year research cooperation with WPI. Congratulations!
  • Research project extended (01.2021) We have successfully extended the two-year research cooperation with SAP AG. Congratulations!
  • New cooperation (01.01.2021) is established! We start joint research work in the field of lower-bit AI accelerators with Tsinghua University’s NICSEFC Laboratory.
  • Paper accepted (15.09.2019): our paper "microbatchGAN: Stimulating Diversity with Multi-Adversarial Discrimination" has been accepted by the IEEE Winter Conference on Applications of Computer Vision (WACV’20), Snowmass Village, Colorado, March 2-5, 2020
  • Paper accepted (16.08.2019): our paper "BinaryDenseNet: Developing an Architecture for Binary Neural Networks" has been accepted by International Conference on Computer Vision (ICCV'19), Neural Architects'19, Oct. 27- Nov. 2 2019, Seoul, Korea
  • Paper accepted (16.07.2018): our paper "Dropout-GAN: Learning from a Dynamic Ensemble of Discriminators" has been accepted by ACM KDD'18 Deep Learning Day (KDD DLDay 2018), London UK, 2018 
  • Paper accepted (15.03.2018): our paper "Instance Tumor Segmentation using Multitask Convolutional Neural Network" has been accepted by the International Joint Conference on Neural Networks (IJCNN) 2018
  • Paper accepted (01.03.2018): our paper "Whole Heart and Great Vessel Segmentation with Context-aware of Generative Adversarial Networks" has been accepted by Bildverarbeitung für die Medizin (BVM) 2018
  • Paper accepted (08.11.2017): our paper "SEE: Towards Semi-Supervised End-to-End Scene text Recognition" has been accepted by the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
  • Invited blog post (25.10.2017): BMXNet on Amazon AWS AI Blog
  • Scientific Talk (25.10.2017): Dr. Haojin Yang "BMXNet: an open source binary neural network implementation based on MXNet", ACM Multimedia 2017, Mountain View CA, US (link)
  • Invited Talk (24.08.2017): Dr. Haojin Yang "Multimedia Understanding with Deep Learning", Zhengzhou University, China (link in Chinese)
  • Paper accepted (31.07.2017): our two papers "Language Identification Using Deep Convolutional Recurrent Neural Networks" and "Deep Neural Network with l2-norm Unit for Brain Lesions Detection" have been accepted by ICONIP 2017.
  • Paper accepted (23.07.2017): our paper "BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet" has been accepted by ACM Multimedia 2017.
  • Invited Talk (17.07.2017): Dr. Haojin Yang, Martin Fritzsche "BMXNet: an open source binary neural network based on MXNet", Amazon Development Center, Berlin.
  • Conference Talk (July 3-7, 2017): Xiaoyin Che "Automatic Lecture Subtitle Generation and How It Helps", 17th IEEE International Conference on Advanced Learning Technologies (ICALT 2017), Timisoara, Romania.

Teaching

Concluded PhD Theses

  • Dr. Cheng Wang: "Deep Learning of Multimodal Representations"
  • Dr. Xiaoyin Che: "E-Lecture Material Enhancement Based on Automatic Multimedia Analysis"
  • Dr. Mina Rezaei: "Deep representation Learning from Imbalanced Medical Imaging"
  • Dr. Goncalo Mordido: "Diversification, Compression, and Evaluation Methods for Generative Adversarial Networks"
  • Dr. Christian Bartz: "Reducing the Annotation Burden: Deep Learning for Optical Character Recognition using less Manual Annotations"

Current Master Theses

  • Weixing Wang, "Network Intrusion Detection using pre-trained tabular representation models", co-supervision with Prof. Wolfgang Kellerer from TUM
  • Jonas Krah, "Accelerating Monocular Depth Estimation using Binary Neural Networks"
  • Furkan Simsek, "LTGCD: Long-tailed Generalized Category Discovery"

Concluded Master Theses

  • Tobias Bredow, "Synthetic Data for the Segmentation of Medical Images", 2022
  • Alexander Kromer, "Quantized Ensemble Neural Networks", 2022
  • Erik Ziegler, "Multi-Task and Zero-Shot Learning with Question Answering Transformer Models", 2022
  • Emanuel Metzenthin, "Weakly Supervised Text Localization using Deep Reinforcement Learning", 2022
  • Jona Otholt, "Automatic Categorization of Scanned Documents", 2021
  • Hendrik Rätz, "Handwriting Classification on Archival Documents using Deep Neural Networks", 2020
  • Julian Niedermeier, "Manifold Learning for the Evaluation of Generative Models", 2019
  • Felix Wolff, "Online Activity Prediction with Long-short-term Memory Recurrent Networks" (co-supervision with Prof. Mathias Weske and Dr. Luise Pufahl), 2019
  • Adrian Loy, "Adaptive Precision of Deep Neural Networks", 2019
  • Marvin Bornstein, "Evaluation of Quantized Deep Neural Networks", 2019
  • Thorben Meyer, "Handwriting Detection/Recognition from Art-Historical Documents", 2018
  • Tom Herold, "Language identification in audio files using deep learning", 2017
  • Martin Fritzsche, "Quantized Deep Neural Networks", 2017
  • Hannes Rantzsch, "A deep learning approach to signature verification", 2016
  • Dimitri Korsch, "Perspective rectification of scene text with the help of analytical and deep learning approaches", 2016
  • Christian Bartz, "Scene text recognition using deep learning", 2016

Lecture

Summer term

Winter term

Reviewed Publications

In Journal:

   2019

  • Mina Rezaei, Haojin Yang and Christoph Meinel, "Recurrent Generative Adversarial Network for Learning Imbalanced Medical Image Semantic Segmentation", International Journal of Multimedia Tools and Applications (MTAP), DOI: 10.1007/s11042-016-3380-8, online ISSN:1573-7721, Print ISSN:1380-7501, Special Issue: "Deep Learning for Computer-aided Medical Diagnosis", 2019 online version

   2018

  • Cheng Wang, Haojin Yang and Christoph Meinel, "Image Captioning with Deep Bidirectional LSTMs and Multi-Task Learning", ACM Transactions on Multimedia Computing Communications and Applications (TOMM) 2018 [link][PDF][BibTex]

   2017

  • Xiaoyin Che, Haojin Yang, Christoph Meinel, "Automatic Online Lecture Highlighting Based on Multimedia Analysis", IEEE Transactions on Learning Technologies (TLT), Publisher: IEEE Computer Society and IEEE Education Society 2017, Volume: PP, Issue: 99, DOI: 10.1109/TLT.2017.2716372, Print ISSN: 1939-1382 [citation] [PDF]

   2016

  • Cheng Wang, Haojin Yang and Christoph Meinel, "A Deep Semantic Framework for Multimodal Representation Learning", International Journal of MULTIMEDIA TOOLS AND APPLICATIONS (MTAP), DOI: 10.1007/s11042-016-3380-8, online ISSN:1573-7721, Print ISSN:1380-7501,  Special Issue: "Representation Learning for Multimedia Data Understanding", March 2016 [link] [PDF] [BibTex]

   2014

  • Haojin Yang, Christoph Meinel, "Content Based Lecture Video Retrieval Using Speech and Video Text Information", IEEE Transactions on Learning Technologies (TLT), DOI: 10.1109/TLT.2014.2307305, online ISSN: 1939-1382, pp. 142-154, volume 7, number 2, April-June 2014, Publisher: IEEE Computer Society and IEEE Education Society [citation BibTex] [PDF]

   2012

In Conference, Workshop and arXiv:

2022

  • Guo, N., Bethge, J., Meinel, C., & Yang, H. (2022). Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket. arXiv preprint arXiv:2211.12933. [pdf] [code]
  • Hu, T., Meinel, C., & Yang, H. (2022). Empirical Evaluation of Post-Training Quantization Methods for Language Tasks. arXiv preprint arXiv:2210.16621. [pdf]
  • Li, Z., Wang, X., Meinel, C., Robertson, N. M., Clifton, D. A., & Yang, H. (2022, October). Not all knowledge is created equal: mutual distillation of confident knowledge. In NeurIPS 2022 Workshop on Trustworthy and Socially Responsible Machine Learning
  • Li, Z., Otholt, J., Dai, B., Meinel, C., & Yang, H. (2022). A Closer Look at Novel Class Discovery from the Labeled Set. arXiv preprint arXiv:2209.09120. [pdf] [code coming soon]
  • Bartz, C., Raetz, H., Otholt, J., Meinel, C., & Yang, H. (2022, August). Synthesis in Style: Semantic Segmentation of Historical Documents using Synthetic Data. In 2022 26th International Conference on Pattern Recognition (ICPR) (pp. 3878-3884). IEEE. [code] [pdf]

2021

  • Bartz, C., Bethge, J., Yang, H., & Meinel, C. (2021). One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN. The 32nd British Machine Vision Conference (BMVC), 22nd - 25th November 2021 [pdf][code]
  • Hu, Ting, Haojin Yang, and Christoph Meinel. "Denoising AutoEncoder Based Delete and Generate Approach for Text Style Transfer." International Conference on Artificial Neural Networks. Springer, Cham, 2021. [pdf]
  • N Guo, J Bethge, H Yang, K Zhong, X Ning, C Meinel, Y Wang. (2021). BoolNet: Minimizing The Energy Consumption of Binary Neural Networks. arXiv preprint arXiv:2106.06991 [pdf][code][video]
  • Li, Z., Wang, X., Yang, H., Hu, D., Robertson, N. M., Clifton, D. A., & Meinel, C. (2021). Not All Knowledge Is Created Equal. arXiv preprint arXiv:2106.01489. [pdf][code][video] 
  • H Yang, Z Shen, Y Zhao, AsymmNet: Towards ultralight convolution neural networks using asymmetrical bottlenecks, MAI@CVPR 2021 [pdf][code][video]
  • G Mordido, H Yang, C Meinel, Evaluating Post-Training Compression in GANs using Locality-Sensitive Hashing, arXiv preprint arXiv:2103.11912 [pdf]
  • Bethge, J., Bartz, C., Yang, H., Meinel, C. An Improved Network Architecture for Binary Neural Networks, WACV 2021 [pdf] [code]
  • G. Mordido*, J. Niedermeier*, C. Meinel. Assessing Image and Text Generation with Topological Analysis and Fuzzy Logic. 2021 Winter Conference on Applications of Computer Vision (WACV 2021).

2020

  • Bethge, J., Bartz, C., Yang, H., Meinel, C. (2020). MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?. arXiv preprint arXiv:2001.05936.
  • Bartz, C., Bethge, J., Yang, H., & Meinel, C. (2020). One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN. arXiv preprint arXiv:2010.11113.
  • Bartz, C., Bethge, J., Yang, H., & Meinel, C. (2020). KISS: Keeping It Simple for Scene Text Recognition. arXiv preprint arXiv:1911.08400.
  • G. Mordido, H. Yang and C. Meinel: microbatchGAN: Stimulating Diversity with Multi-Adversarial Discrimination. In IEEE Winter Conference on Applications of Computer Vision (WACV’20), Snowmass Village, Colorado, March 2-5, 2020
  • J Bethge, C Bartz, H Yang, C Meinel, BMXNet 2: An Open Source Framework for Low-bit Networks-Reproducing, Understanding, Designing and Showcasing. In Proceedings of the 28th ACM International Conference on Multimedia, 2020 [PDF]
  • Jonathan Sauder, Ting Hu, Xiaoyin Che, Gonçalo Mordido, Haojin Yang, Christoph Meinel, Best student forcing: A simple training mechanism in adversarial language generation. In Proceedings of The 12th Language Resources and Evaluation Conference, 2020. [PDF]
  • J. Niedermeier*, G. Mordido* and C. Meinel. Improving the Evaluation of Generative Models with Fuzzy Logic. AAAI 2020 Evaluating Evaluation of AI Systems (AAAI 2020 Meta-Eval)
  • G. Mordido and C. Meinel. Mark-Evaluate: Assessing Language Generation using Population Estimation Methods. Conference on Computational Linguistics (COLING 2020)
  • Bartz, C., Seidel, L., Nguyen, D. H., Bethge, J., Yang, H., & Meinel, C. Synthetic Data for the Analysis of Archival Documents: Handwriting Determination. DICTA2020
  • Bartz, C., Jain, N., & Krestel, R. (2020, May). Automatic Matching of Paintings and Descriptions in Art-Historic Archives using Multimodal Analysis. In Proceedings of the 1st International Workshop on Artificial Intelligence for Historical Image Enrichment and Access (pp. 23-28).
  • Hu, T., & Meinel, C. (2020, September). Text Generation in Discrete Space. In International Conference on Artificial Neural Networks (pp. 721-732). Springer, Cham.

2019

  • Bethge, J., Yang, H., Bornstein, M., & Meinel, C. BinaryDenseNet: Developing an Architecture for Binary Neural Networks. International Conference on Computer Vision (ICCV'19), Neural Architects'19, Oct. 27- Nov. 2 2019, Seoul, Korea (to appear)
  • Bethge, J., Yang, H., Bornstein, M., & Meinel, C. Back to Simplicity: How to Train Accurate BNNs from Scratch?. arXiv preprint arXiv:1906.08637. Demo
  • Joseph Bethge, Haojin Yang, Christoph Meinel, Training Accurate Binary Neural Networks From Scratch, In IEEE International Conference on Image Processing (ICIP'19) in Taipei, Taiwan, September 22-25, 2019
  • Mina Rezaei, Haojin Yang, Konstantine Harmuth, Christoph Meinel: Conditional Generative Adversarial Refinement Networks for Unbalanced Medical Image Semantic Segmentation. In IEEE Winter Conference on Applications of Computer Vision (WACV’19), pages 1836-1845, Waikoloa Village, HI, USA, January 7-11, 2019 (code)
  • Mina Rezaei, Haojin Yang, Christoph Meinel: Learning Imbalanced Semantic Segmentation through Cross-Domain Relations of Multi-Agent Generative Adversarial Networks. SPIE Medical Imaging - Computer Aided Diagnosis (SPIE’19), pages 1-6, San Diego, California, United States 16 - 21 February 2019

2018

  • Jonathan Sauder, Xiaoyin Che, Gonçalo Mordido, Haojin Yang and Christoph Meinel. Pseudo-Ground-Truth Training for Adversarial Text Generation with Reinforcement Learning. Deep Reinforcement Learning Workshop at NeurIPS 2018 (Deep RL workshop)
  • Mina Rezaei, Haojin Yang, Christoph Meinel: Recurrent Generative Adversarial Network for Learning Multiple Clinical Tasks. Accepted by Machine Learning for Health Workshop at NeurIPS 2018 (ML4H)
  • Mina Rezaei, Haojin Yang and Christoph Meinel, Generative Adversarial Framework for Learning Multiple Clinical Tasks. Digital Image Computing: Techniques and Applications (DICTA 2018)
  • Christian Bartz, Haojin Yang, Joseph Bethge and Christoph Meinel. LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks. 1st International Workshop on Advanced Machine Vision for Real-life and Industrially Relevant Applications (AMV 2018), in conjunction with the Asian Conference on Computer Vision (ACCV), 2-6 December 2018, Perth, Australia
  • Mina Rezaei, Haojin Yang, Christoph Meinel: voxel-GAN: Adversarial Framework for Learning Imbalanced Brain Tumor Segmentation. Accepted by BrainLes@MICCAI 2018 (code)
  • G. Mordido, H. Yang and C. Meinel. Dropout-GAN: Learning from a Dynamic Ensemble of Discriminators. ACM KDD'18 Deep Learning Day (KDD DLDay 2018), London UK, 2018 [PDF]
  • Mina Rezaei, Haojin Yang and Christoph Meinel "Instance Tumor Segmentation using Multitask Convolutional Neural Network" International Joint Conference on Neural Networks (IJCNN) 2018   
  • Mina Rezaei, Haojin Yang, Christoph Meinel "Whole Heart and Great Vessel Segmentation with Context-aware of Generative Adversarial Networks" Bildverarbeitung für die Medizin (BVM) 2018
  • Mina Rezaei, Haojin Yang, Christoph Meinel, "Automatic Cardiac MRI Segmentation via Context-aware Recurrent Generative Adversarial Neural Network", Computer Assisted Radiology and Surgery (CARS 2018)

2017

  • Christian Bartz, Haojin Yang, Christoph Meinel "SEE: Towards Semi-Supervised End-to-End Scene text Recognition", the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), February 2–7, 2018, New Orleans, Louisiana, USA (PDF) (codes)
  • Christian Bartz, Tom Herold, Haojin Yang and Christoph Meinel "Language Identification Using Deep Convolutional Recurrent Neural Networks", 24th International Conference on Neural Information Processing (ICONIP 2017), November 14-18, 2017, Guangzhou, China 
  • Mina Rezaei, Haojin Yang and Christoph Meinel "Deep Neural Network with l2-norm Unit for Brain Lesions Detection", 24th International Conference on Neural Information Processing (ICONIP 2017), November 14-18, 2017, Guangzhou, China
  • Xiaoyin Che, Nico Ring, Willi Raschkowski, Haojin Yang and Christoph Meinel, "Traversal-Free Word Vector Evaluation in Analogy Space", RepEval workshop at EMNLP 17 (Empirical Methods in Natural Language Processing), September 7–11, 2017, Copenhagen, Denmark. [PDF copy]  
  • Haojin Yang, Martin Fritzsche, Christian Bartz, Christoph Meinel, "BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet" ACM International Conference on Multimedia (ACM MM 2017), Open Source Software Competition, October 23-27, 2017, Mountain View, CA USA. [PDF copy][project]
  • Xiaoyin Che, Nico Ring, Willi Raschkowski, Haojin Yang and Christoph Meinel "Automatic Lecture Subtitle Generation and How It Helps", 17th IEEE International Conference on Advanced Learning Technologies (ICALT 2017), July 3-7, 2017, Timisoara, Romania. [PDF copy][BibTex]

2016

  • Haojin Yang, Cheng Wang, Christian Bartz, Christoph Meinel "SceneTextReg: A Real-Time Video OCR System", ACM international conference on Multimedia (ACM MM 2016), system demonstration session, 15-19 October 2016, Amsterdam, The Netherlands [PDF copy][demo video] [BibTex]
  • Cheng Wang, Haojin Yang, Christian Bartz, Christoph Meinel "Image Captioning with Deep Bidirectional LSTMs", ACM international conference on Multimedia (ACM MM 2016), full paper in the deep learning session of the main conference track, 15-19 October 2016, Amsterdam, The Netherlands [PDF copy] [demo video]
  • Xiaoyin Che, Cheng Wang, Haojin Yang and Christoph Meinel, "Punctuation Prediction for Unsegmented Transcript Based on Word Vector", the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portorož (Slovenia), 23-28 May 2016 [Dataset]
  • Haojin Yang, "Real-Time Video OCR System", system demonstration at 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Show&Tell session, Shanghai China, 20-25 March 2016
  • Cheng Wang, Haojin Yang and Christoph Meinel, "Exploring Multimodal Video Representation for Action Recognition", the annual International Joint Conference on Neural Networks (IJCNN 2016), Vancouver, Canada, July 24-29, 2016
  • Xiaoyin Che, Thomas Staubitz, Haojin Yang and Christoph Meinel, "Pre-Course Key Segment Analysis of Online Lecture Videos", 16th IEEE International Conference on Advanced Learning Technologies (ICALT 2016), Austin, Texas, USA, July 25-28, 2016
  • Xiaoyin Che, Sheng Luo, Haojin Yang and Christoph Meinel, "Sentence Boundary Detection Based on Parallel Lexical and Acoustic Models", INTERSPEECH 2016, San Francisco, California, USA, September 8-12, 2016
  • Sheng Luo, Haojin Yang, Cheng Wang, Xiaoyin Che, and Christoph Meinel, "Action Recognition in Surveillance Video Using ConvNets and Motion History Image", International Conference on Artificial Neural Networks (ICANN 2016), Barcelona Spain, 6th-9th of September 2016 
  • Sheng Luo, Haojin Yang, Cheng Wang, Xiaoyin Che and Christoph Meinel, "Real-time action recognition in surveillance videos using ConvNets", 23rd International Conference on Neural Information Processing (ICONIP 2016), Kyoto, Japan, 16th-21st of October 2016
  • Hannes Rantzsch, Haojin Yang and Christoph Meinel "Signature Embedding: Writer Independent Offline Signature Verification with Deep Metric Learning" in 12th International Symposium on Visual Computing (ISVC'16), Las Vegas USA, December 12-14, 2016. [PDF copy] [Poster]
  • Xiaoyin Che, Sheng Luo, Haojin Yang, Christoph Meinel "Sentence-Level Automatic Lecture Highlighting Based on Acoustic Analysis" 16th IEEE International Conference on Computer and Information Technology (IEEE CIT 2016), Shangri-La's Fijian Resort, Fiji, 7-10 December 2016

 
