Today, proteomics is the key technology for the analysis of proteins. The high throughput of mass spectrometers and the growing amount of available data make the development of machine learning methods more appealing every day. With the vast amount of experimental data and increasingly accurate metadata available, complex machine learning models can be trained to simulate many steps of the mass spectrometric workflow. The trained models and their predictions can then be used to increase the sensitivity of peptide and protein identification. A key challenge is the representation of proteins (and peptides) as input for these machine learning tasks. Similarly, when the affinity of protein-ligand interactions needs to be predicted, a key challenge is how to encode both the proteins and the small molecules (ligands) as model input. For the accurate prediction of binding affinities between proteins and their ligands, deep learning has emerged as the method of choice. By experimenting with various pretrained embeddings, Siamese neural networks, and state-of-the-art neural network architectures, we aim to increase the precision of in silico affinity predictions, especially for understudied protein-ligand pairs.
Keywords: Machine Learning, Deep Learning, Proteomics, Protein-Ligand Interactions, Python
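
To make the representation challenge named in the abstract concrete, the following is a minimal sketch of one of the simplest possible encodings: a zero-padded one-hot matrix for a peptide sequence. It is not the method used in this work, which relies on pretrained embeddings; the names `AMINO_ACIDS`, `AA_INDEX`, `one_hot_encode`, and the `max_len` parameter are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: a baseline one-hot encoding of a peptide.
# All names here are hypothetical and not taken from the original work.

# The 20 canonical amino acids, in alphabetical one-letter-code order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide, max_len=30):
    """Encode a peptide as a max_len x 20 matrix (list of lists).

    Each residue becomes a row with a single 1 at its amino acid index.
    Rows beyond the end of the peptide are all-zero padding.
    """
    peptide = peptide.upper()
    if len(peptide) > max_len:
        raise ValueError(f"peptide longer than max_len={max_len}")

    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(peptide):
        if aa not in AA_INDEX:
            raise ValueError(f"non-canonical residue {aa!r} at position {pos}")
        matrix[pos][AA_INDEX[aa]] = 1
    return matrix


# Example: an 8-residue peptide padded to length 10.
encoded = one_hot_encode("PEPTIDEK", max_len=10)
print(len(encoded), len(encoded[0]))
```

Such a fixed-size matrix can be fed directly to a neural network, but it carries no information about the biochemical similarity of residues, which is one motivation for learned or pretrained embeddings instead.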