We try to keep an up to date list of all our publications. If you are interested in a PDF that we have not uploaded yet, feel free to send us an email to get a copy. You can view all publications of the current members of the Artificial Intelligence and Sustainability group. For other listings, please see:
Knowledge about local invariances with respect to given pattern transformations can greatly improve the accuracy of classification. Previous approaches are either based on regularisation or on the generation of virtual (transformed) examples. We develop a new framework for learning linear classifiers under known transformations based on semidefinite programming. We present a new learning algorithm - the Semidefinite Programming Machine (SDPM) - which is able to find a maximum margin hyperplane when the training examples are polynomial trajectories instead of single points. The solution is found to be sparse in dual variables and allows to identify those points on the trajectory with minimal real-valued output as virtual support vectors. Extensions to segments of trajectories, to more than one transformation parameter, and to learning with kernels are discussed. In experiments we use a Taylor expansion to locally approximate rotational invariance in pixel images from USPS and find improvements over known methods.
Semi-Definite Programming by Perceptron Learning. Graepel, Thore; Herbrich, Ralf; Kharechko, Andriy; Shawe-Taylor, John (2003). 457–464.
We present a modified version of the perceptron learning algorithm (PLA) which solves semidefinite programs (SDPs) in polynomial time. The algorithm is based on the following three observations: (i) Semidefinite programs are linear programs with infinitely many (linear) constraints; (ii) every linear program can be solved by a sequence of constraint satisfaction problems with linear constraints; (iii) in general, the perceptron learning algorithm solves a constraint satisfaction problem with linear constraints in finitely many updates. Combining the PLA with a probabilistic rescaling algorithm (which, on average, increases the size of the feasable region) results in a probabilistic algorithm for solving SDPs that runs in polynomial time. We present preliminary results which demonstrate that the algorithm works, but is not competitive with state-of-the-art interior point methods.
The Kernel Mutual Information. Gretton, Arthur; Herbrich, Ralf; Smola, Alexander (2003). 880–883.
We introduce a new contrast function, the kernel mutual information (KMI), to measure the degree of independence of continuous random variables. This contrast function provides an approximate upper bound on the mutual information, as measured near independence, and is based on a kernel density estimate of the mutual information between a discretised approximation of the continuous random variables. We show that Bach and Jordan's kernel generalised variance (KGV) is also an upper bound on the same kernel density estimate, but is looser. Finally, we suggest that the addition of a regularising term in the KGV causes it to approach the KMI, which motivates the introduction of this regularisation.
Online Bayes Point Machines. Harrington, Edward; Herbrich, Ralf; Kivinen, Jyrki; Platt, John C; Williamson, Robert C (2003). 241–252.
We present a new and simple algorithm for learning large margin classifiers that works in a truly online manner. The algorithm generates a linear classifier by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm's performance on online and batch learning settings. The online experiments showed that our algorithm produces a low prediction error on the training sequence and tracks the presence of concept drift. On the batch problems its performance is comparable to the maximum margin algorithm which explicitly maximises the margin.
Introduction to the Special Issue on Learning Theory. Herbrich, Ralf; Graepel, Thore in Journal of Machine Learning Research (2003). 4 755–757.
Our research group investigates both the use of energy in developing artificial intelligence (AI) as well as the use of AI in generating, storing and managing energy. This includes research into energy-efficient algorithms for solving basic AI tasks such as classification, ranking or planning & search, as well as the development and application of AI methods to refined modeling of batteries in order to extend their working lifetime, and the control of domestic energy consumption.