A Functional Approach to Scanner Detection


Detecting scanning in Internet traffic is a well-studied problem with no single, definitive solution. Two of the proposed methods are widely accepted despite known limitations: one based on a static fanout ratio, and another based on principal component analysis (PCA). We introduce a two-step procedure based on Functional PCA and k-means clustering, which we argue is significantly more robust and more readily driven by the data. We validate our procedure and compare it with the existing approaches on synthetic datasets with “ground truth” about anomalies in FTP and HTTP port traffic flows; our method identifies all scanners. We also compare the approaches on NTP flow data collected prior to a reflective DDoS attack in 2014, a real-world example that illustrates the deficiencies of the existing approaches and how our functional framework addresses them. Lastly, we discuss insights into the traffic that cannot be obtained with the previous methods.
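The two-step idea can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each source is summarized as an activity curve sampled on a common time grid, approximates Functional PCA by an SVD of the centered, discretized curves, and uses a plain Lloyd-style k-means on the resulting scores. The toy data, the function names, the choice of two components and two clusters, and the rule of flagging the smaller cluster are all assumptions made for illustration.

```python
import numpy as np


def make_synthetic_curves(n_normal=27, n_scanners=3, n_points=48, seed=0):
    """Hypothetical toy data: one activity curve per source on a shared time grid."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    # Assumed "normal" sources: a diurnal-looking pattern plus noise.
    normal = 10 + 5 * np.sin(2 * np.pi * t) + rng.normal(0, 1, (n_normal, n_points))
    # Assumed "scanner" sources: sustained high activity with no daily pattern.
    scanners = 40 + rng.normal(0, 1, (n_scanners, n_points))
    return np.vstack([normal, scanners])


def fpca_scores(curves, n_components=2):
    """Step 1 (discretized FPCA): project centered curves onto leading components."""
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the discretized principal component functions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(points, k=2, n_iter=50, seed=0):
    """Step 2: plain Lloyd-style k-means on the component scores."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def flag_minority_cluster(labels):
    """Assumed decision rule: treat the smallest cluster as candidate scanners."""
    values, counts = np.unique(labels, return_counts=True)
    return labels == values[counts.argmin()]


curves = make_synthetic_curves()
flagged = flag_minority_cluster(kmeans(fpca_scores(curves), k=2))
print(np.flatnonzero(flagged))  # indices of sources flagged as candidate scanners
```

Clustering on the scores rather than thresholding a single ratio is what makes the cut data-driven in this sketch: the boundary between groups comes from the shape of the curves themselves. On real flow data, the number of components and clusters would need to be chosen from the data rather than fixed as above.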

Proceedings of the Asian Internet Engineering Conference (AINTEC ’17), Bangkok, Thailand.