Hasso-Plattner-Institut
 
Prof. Dr. Emmanuel Müller
 

Lernen. Wissen. Daten. Analysen. (LWDA 2016)

Keynotes

Argument Mining: Manual and automatic annotation of short user-generated texts

In the last few years, argument mining has emerged as a new field that aims to identify argumentative portions in natural language text and to uncover the structure of the underlying arguments. Domains that have been addressed include legal text, student essays, and customer reviews (as a follow-up step to sentiment analysis). In this talk, I suggest an annotation scheme for argumentation and present results on automatic analysis of our "argumentative microtext corpus", a collection of 115 short texts that students produced in response to a trigger question, which usually takes the form "Should one (not) do X?". I give results from a joint-inference approach to this task, present various extensions, and then discuss how the approach scales up to longer text.

Manfred Stede studied Computer Science and Linguistics at TU Berlin and Edinburgh University, and received an M.Sc. in Computer Science from Purdue University (USA). In 1996, he earned his Ph.D. at the University of Toronto with a thesis on multilingual text generation. From 1995 to 2000, he worked at TU Berlin in the large national "Verbmobil" project, which built a system for translating spoken language between German, English, and Japanese. After a short interlude at a company in Berlin, he became a professor of Applied Computational Linguistics at Potsdam University in 2001. His research mainly revolves around issues of text structure, ranging from theoretical models to automatic analysis, with applications in, e.g., text mining and summarization. A recent focus of his research is on different dimensions of subjectivity in language, where speakers convey their attitudes, opinions, and arguments. The best-known computational application is sentiment analysis, where Stede contributed to a successful system implementing a lexicon-based approach for English. As a follow-up step, he is now interested in argument mining, i.e., the automatic discovery of authors' claims, the reasons supporting them, and possible objections.

Stede has published three monographs, fifteen journal papers, and numerous conference papers and book chapters. He has directed research projects funded by various German national agencies and the European Union, sometimes in collaboration with local companies.

Amazon: A Playground for Machine Learning

Within Amazon, a company with over 200 million active consumers, over 2 million active seller accounts and over 180,000 employees, there are hundreds of problems which can be tackled with Machine Learning. In the first part of this talk, I will give an overview of a number of Machine Learning applications. I will explain how they fit within the Amazon ecosystem, the challenges we are facing and how they help us scale. While Machine Learning is routinely used in recommendation, fraud detection and ad allocation, it also plays a key role in devices such as the Kindle or the Echo, as well as in the automation of Kiva-enabled fulfilment centres, statistical machine translation and automated Fresh produce inspection. In the second part, I will discuss how we democratize Machine Learning within the company. Applying complex predictive systems, such as Machine Learning-based systems, in the wild requires manually tuning and adjusting knobs, broadly referred to as system parameters or hyperparameters. Black-box optimization, and in particular Bayesian optimization, provides a natural framework for addressing this problem by taking the human expert out of the fine-tuning loop. I will introduce Bayesian optimization and discuss open problems in this area.

Cedric Archambeau is a Senior Machine Learning Scientist with Amazon, Berlin. He manages the algorithms team and served as a technical advisor to Sebastian Gunningham, Amazon Senior Vice President Seller Services. Recently, his team delivered the learning algorithms offered in Amazon Machine Learning (aws.amazon.com/machine-learning). He is interested in large-scale probabilistic inference and Bayesian optimization. He holds a visiting position in the Centre for Computational Statistics and Machine Learning at University College London. Prior to joining Amazon, he led the Machine Learning and Mechanism Design area at Xerox Research Centre Europe, Grenoble.

Algorithm Engineering for Graph Clustering

Graph clustering has become a central tool for the analysis of networks in general, with applications in, e.g., data mining, social networks, biology and complex systems. The general aim of graph clustering is to identify dense groups in networks. Countless formalizations of this aim exist, among them the widely used modularity measure. However, the overwhelming majority of algorithms for graph clustering rely on heuristics, e.g., for some NP-hard optimization problem, and do not allow for any structural guarantee on their output. Moreover, networks in the real world are often large, evolve over time or arrive as a data stream.

The talk will discuss algorithmic aspects of graph clustering, especially quality measures and algorithms that are based on the intuition that clusters are dense subgraphs which are loosely connected to one another. We will focus on the algorithm engineering methodology, which consists of a cycle of design, analysis, implementation, and experimental evaluation of algorithms, bridging the gap between algorithm theory and practical applications. Special emphasis will be on clustering large networks.
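
To make the modularity measure named in the abstract concrete, here is a minimal sketch that is not taken from the talk. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph without self-loops: for each cluster, the fraction of edges inside it minus the squared fraction of edge endpoints that fall in it. The function name `modularity` and the toy graph are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman-Girvan modularity of a clustering.

    edges: list of (u, v) pairs of an undirected graph without self-loops.
    cluster_of: dict mapping every vertex to its cluster label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    intra = defaultdict(int)       # edges with both endpoints in the cluster
    degree_sum = defaultdict(int)  # sum of vertex degrees in the cluster

    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Q = sum over clusters of (intra / m) - (degree_sum / 2m)^2
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {v: "a" for v in range(6)}

q_split = modularity(edges, two_triangles)   # 5/14, about 0.357
q_single = modularity(edges, one_cluster)    # 0.0
```

Splitting the toy graph into its two triangles scores higher than placing all vertices in one cluster, which matches the intuition of dense groups that are loosely connected to one another.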

 

Dorothea Wagner is a full professor of Informatics at the Karlsruhe Institute of Technology (KIT). Her research interests include design and analysis of algorithms and algorithm engineering, graph algorithms, computational geometry and discrete optimization, particularly applied to transportation systems, energy systems, network analysis, data mining and visualization.

Among other activities, she is a member of the German Council of Science and Humanities (Wissenschaftsrat). From 2007 to 2014 she was vice president of the DFG (Deutsche Forschungsgemeinschaft - German Research Foundation), and from 2004 to 2013 she was speaker of the scientific advisory board of Dagstuhl - Leibniz Center for Informatics. In 2012 she received a Google Focused Research Award. She is a member of Academia Europaea and a Fellow of the GI (Gesellschaft für Informatik).

Dorothea Wagner obtained her diploma and Ph.D. degrees from RWTH Aachen in 1983 and 1986, respectively, and her Habilitation degree from TU Berlin in 1992. From 1994 to 2003 she was a full professor at the University of Konstanz.

Intelligent Enterprise

Machine Learning describes algorithms that can learn from experience without having to be explicitly programmed. Improved processing power, better algorithms and the availability of big data are the foundation and the reason why Machine Learning is going to take enterprise software to a new level now. We see tremendous potential for our customers. The world’s relevant enterprises rely on SAP. By using Machine Learning we are going to leverage their data for them and let our customers focus on the real job to be done.

Dr. Sebastian Wieczorek is Director at the SAP Innovation Center Network. In this role he is responsible for the development of SAP’s Machine Learning Platform. He also serves as an academic expert and reviewer for the European Commission and the German Ministry of Education and Research. In previous positions at SAP, Sebastian coordinated all startup engagement activities of the SAP Innovation Center Network, managed EU-funded research projects, and was product owner for SAP Web Analytics, team lead for the "Application Engineering Group" at SAP Research and futurist for the Development Experience team, working from Germany, Israel and the US. Besides working for SAP, Sebastian held lectures on Web Engineering at the Technical University of Darmstadt and obtained a PhD from the Technical University of Berlin (Germany) under the supervision of Prof. Ina Schieferdecker. Before joining SAP, Sebastian worked as a Software Developer in Moscow (Russia) and studied computer science at the Technical University of Dresden (Germany) and Northumbria University at Newcastle (United Kingdom).

Web-Scale Domain-Specific Information Extraction

Information Extraction (IE) from unstructured texts is a technology of growing importance in many applications. Three important challenges to IE are achieving high-quality results, scaling methods to very large corpora, and integrating IE results with other data for downstream analysis. In this talk, we will highlight recent advances and open questions in these areas by drawing on extensive experience in developing and applying IE for biomedical research.

Ulf Leser studied computer science at the Technische Universität München and obtained his PhD in Data Integration and Query Planning from Technische Universität Berlin. After positions in research institutes and in the private sector, he became a professor of Knowledge Management in Bioinformatics at Humboldt-Universität zu Berlin in 2002. His research focuses on scientific data management, statistical bioinformatics, biomedical text mining and infrastructures for large-scale biomedical analysis, and is typically carried out in interdisciplinary projects with domain scientists, especially from medicine and biology. He is the speaker of the DFG-funded graduate school "SOAMED - Service-oriented architectures for medical applications", chairman of the coordinated BMBF project "PREDICT - Comprehensive Data Integration for Personalized Ontology", PI of the DFG research unit Stratosphere, and a board member of the DFG-excellence graduate school "BSIO - Berlin School for Integrative Oncology".