I was nearly finished with a 20 page paper (of sorts) on searching for bots in computer networks when I took a break and scanned the contents of my RSS feeds. This struck me as particularly timely and funny:
As I told my boss last week I was disappointed in the algorithms used in what is considered “state of the art” tools. I actually found a strong inverse correlation in the “scoring” of network traffic of highly suspicious traffic compared to clearly normal traffic. The higher scoring traffic should indicate high probability of the traffic being communication with a Command and Control Server (C2 Server) and lower scores with normal traffic. I easily found instances where just the opposite was true.
When I used synthesized data I could get the expected scoring results but real world data demands new detection algorithms. It looks to me like bot builders also do research. Existing algorithms appear to be essentially garbage.
Great illustration. I’m borrowing it.
Scoring requires a system. It’s a fundamental truth that all systems will be gamed.
There’s hundreds more on XKCD. I don’t care for Randall’s politics, but he does get the math right.
It seems to work if you use. Media = garbage.