Data scientists achieve more accurate predictions than ever, even in real time. Their productivity is fueled by the exponential evolution of hardware and the surge of machine learning tools that hide hardware complexity. However, a performance gap has opened between these software tools and modern hardware. In some cases, a modern hardware infrastructure costing 10X as much as the prior generation does not even halve the time to reach an accurate deep learning model. Such poor hardware utilization makes the current surge of data science unsustainable. In this talk, I will present our preliminary work analyzing the behavior of deep learning training on modern CPU-GPU co-processors of different generations and price points, and our direction for fixing this hardware under-utilization.