Data arising from business transactions, scientific measurements, and other forms of content creation calls for automatic data mining and pattern recognition techniques that allow us to make sense of it efficiently. At the same time, these techniques should be able to handle uncertainty, as data from measurements may be imprecise and user-generated content may be unreliable. This lecture will introduce the main concepts of data mining and machine learning, ranging from basic probability and information theory to popular classification, clustering, and regression algorithms.
Prerequisites
Although we will review basic probability theory and statistics, prior knowledge in these areas is useful. Further, we will make heavy use of linear algebra, so a fundamental understanding of it is necessary.