Probabilistic Pentesting

Pentesting tools such as Metasploit, Burp, ExploitPack, and BeEF are used by security practitioners to identify possible vulnerability points and to assess compliance with security policies. The process of setting up a pentesting environment can be time-consuming and cumbersome.

We explore the application of partially observable Markov decision processes (POMDPs) to this domain.
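
To make the POMDP framing concrete, here is a minimal sketch of the belief update at its core: the tester cannot observe whether a host is vulnerable, only noisy scan results, and revises a probability over hidden states after each observation. The state names, the observation names, and every probability below are illustrative assumptions, not values from any real tool or dataset.

```python
# Hypothetical two-state model: is the target host vulnerable or patched?
STATES = ("vulnerable", "patched")

# Assumed observation model for a "scan" action: P(observation | state).
OBSERVATION_MODEL = {
    "vulnerable": {"old_banner": 0.8, "new_banner": 0.2},
    "patched": {"old_banner": 0.1, "new_banner": 0.9},
}


def update_belief(belief, observation):
    """Bayes update of the belief over hidden states after one observation.

    The hidden state is assumed not to change between scans, so the
    transition step of the full POMDP belief update is the identity here.
    """
    unnormalized = {
        state: OBSERVATION_MODEL[state][observation] * belief[state]
        for state in STATES
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return {state: p / total for state, p in unnormalized.items()}


# Start undecided, then observe an outdated service banner.
belief = {"vulnerable": 0.5, "patched": 0.5}
belief = update_belief(belief, "old_banner")
print(belief)
```

Under these assumed numbers, a single outdated banner moves the belief that the host is vulnerable from 0.5 to 8/9. A full POMDP planner would additionally choose which action to take next based on this belief; that part is omitted here.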

Honeypot Turing Test

A honeypot is a cybersecurity technique in which a bait (‘honey’) system or network is designed to emulate a real one, diverting malicious attacks away from the actual system or network.

Machine learning holds the promise of realistically simulating protocols in a way that fools the attacker but does not compromise the system.

Detecting Money Laundering

Financial institutions have a regulatory requirement to monitor account activity for anti-money laundering (AML) purposes. Regulators take the monitoring and reporting requirements very seriously, as evidenced by a recent set of FinCEN fines.

One challenge with AML is that money laundering rarely manifests as the activity of a single person, business, account, or transaction. Therefore, detection requires behavioral pattern analysis of transactions that occur over time and involve a set of (not obviously) related real-world entities.
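
As a rough illustration of analyzing groups of related entities rather than single accounts, the sketch below clusters accounts connected by any chain of transfers (using a small union-find) and totals the amount moved inside each cluster over a window of days. The account names, amounts, and the `(sender, receiver, amount, day)` record layout are hypothetical, and a real AML system would use far richer linkage than direct transfers.

```python
from collections import defaultdict

# Hypothetical transactions: (sender, receiver, amount, day)
TRANSACTIONS = [
    ("acct_a", "acct_b", 9500, 1),
    ("acct_b", "acct_c", 9400, 2),
    ("acct_c", "acct_d", 9300, 3),
    ("acct_x", "acct_y", 120, 2),
]


def find(parent, node):
    """Return the representative of node's group, using path halving."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def group_accounts(transactions):
    """Cluster accounts that are connected by any chain of transfers."""
    parent = {}
    for sender, receiver, _amount, _day in transactions:
        parent.setdefault(sender, sender)
        parent.setdefault(receiver, receiver)
        root_s, root_r = find(parent, sender), find(parent, receiver)
        if root_s != root_r:
            parent[root_r] = root_s  # merge the two groups
    groups = defaultdict(set)
    for account in parent:
        groups[find(parent, account)].add(account)
    return list(groups.values())


def group_volume(transactions, group, first_day, last_day):
    """Total amount moved between members of one group within a day range."""
    return sum(
        amount
        for sender, receiver, amount, day in transactions
        if sender in group and receiver in group and first_day <= day <= last_day
    )


for group in group_accounts(TRANSACTIONS):
    print(sorted(group), group_volume(TRANSACTIONS, group, 1, 3))
```

With this sample data, the four accounts in the chain form one group even though no single transfer stands out on its own, which is the kind of pattern a per-transaction rule would miss.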

Deep Learning Approach to Fraud

When creating a feature space for adversarial use cases such as payment fraud, account takeover fraud, and internal fraud, data scientists can rely on domain knowledge, intuition, personal experience, and ultimately, if labeled data is available, variable selection.

Often, the objective of constructing such feature spaces is anomaly/outlier detection: capturing enough attributes and aggregates to distinguish normal from extraordinary user behavior.
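
One simple way to flag outliers on a single hand-built aggregate is a robust (median-based) z-score, sketched below. This is only an illustrative baseline, not the deep learning approach itself; the user names, the counts, and the 3.5 cutoff are assumptions chosen for the example.

```python
import statistics

# Hypothetical per-user aggregate: number of payments in the last 24 hours.
DAILY_PAYMENT_COUNTS = {
    "user_1": 3, "user_2": 4, "user_3": 2, "user_4": 5,
    "user_5": 3, "user_6": 4, "user_7": 41,
}


def robust_outliers(aggregates, threshold=3.5):
    """Return keys whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD) instead of the
    mean and standard deviation, so one extreme value does not hide itself
    by inflating the spread estimate.
    """
    values = list(aggregates.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against; nothing can be flagged
    return [
        key
        for key, value in aggregates.items()
        # 0.6745 rescales the MAD to match a standard deviation for
        # normally distributed data.
        if abs(0.6745 * (value - median) / mad) > threshold
    ]


print(robust_outliers(DAILY_PAYMENT_COUNTS))
```

In practice many such aggregates are combined into one feature space, and the threshold is tuned against whatever labeled fraud cases are available.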

Applying ML to InfoSec

There seems to be very little overlap currently between the worlds of infosec and machine learning. If a data scientist attended Black Hat and a network security expert went to NIPS, they would be equally at a loss. 

This is unfortunate, because infosec can definitely benefit from a probabilistic approach, but significant domain expertise is required to apply ML methods.

Formulation of Adversarial Machine Learning

Machine learning is being used in a variety of domains to restrict or prevent undesirable behaviors by hackers, fraudsters, and even ordinary users. Algorithms deployed for fraud prevention, network security, and anti-money laundering belong to the broad area of adversarial machine learning, where, rather than learning patterns of a benign nature, ML is confronted with a malicious adversary that looks for loopholes and weaknesses to exploit for personal gain.