Azodi, Amir; Gawron, Marian; Sapegin, Andrey; Cheng, Feng; Meinel., Christoph
Proceedings of the International Conference on Mobile, Secure and Programmable Networking (MSPN'15)
Modern machine learning techniques have been applied to many aspects of network analytics in order to discover patterns that can clarify or better demonstrate the behavior of users and systems within a given network. Often the information to be processed has to be converted to a different type in order for machine learning algorithms to be able to process them. To accurately process the information generated by systems within a network, the true intention and meaning behind the information must be observed. In this paper we propose different approaches for mapping network information such as IP addresses to integer values that attempts to keep the relation present in the original format of the information intact. With one exception, all of the proposed mappings result in (at most) 64 bit long outputs in order to allow atomic operations using CPUs with 64 bit registers. The mapping output size is restricted in the interest of performance. Additionally we demonstrate the benefits of the new mappings for one specific machine learning algorithm (k-means) and compare the algorithm's results for datasets with and without the proposed transformations.