Our paper InferDB: In-Database Machine Learning Inference Using Indexes, by Ricardo Salazar Díaz, Boris Glavic, and Tilmann Rabl was accepted at VLDB '24.
The performance of inference with machine learning (ML) models and its integration with analytical query processing have become critical bottlenecks for data analysis in many organizations. An ML inference pipeline typically consists of a preprocessing workflow followed by prediction with an ML model. Current approaches for in-database inference implement preprocessing operators and ML algorithms in the database either natively, by transpiling code to SQL, or by executing user-defined functions in guest languages such as Python. In this work, we present a radically different approach that approximates an end-to-end inference pipeline (preprocessing plus prediction) using a light-weight embedding that discretizes a carefully selected subset of the input features and an index that maps data points in the embedding space to aggregated predictions of an ML model. We replace a complex preprocessing workflow and model-based inference with a simple feature transformation and an index lookup. Our framework improves inference latency by several orders of magnitude while maintaining similar prediction accuracy compared to the pipeline it approximates.