How machine learning protects from phishing, mobile threats, and plant breakdowns
By Industry Contributor 17 June 2021 | Categories: feature articles
By Amir Kanaan, Managing Director for Middle East, Turkey and Africa at Kaspersky
Since the 1950s, scientists have been actively studying the capabilities of computer intelligence. Over the last 70 years, machine learning (ML) has developed from a theoretical concept to a technology actively used in the wild – from the recommendation engine in Netflix to self-driving Tesla cars and from speech recognition in Google Translate to the sales driver in Salesforce. The key benefit of using ML is that it gives the programme autonomy for decision-making, which reduces the amount of manual work for people.
Machine learning is actively used in cybersecurity too – to boost and automate malware detection, among other things. In this article, we share some of the most interesting ML techniques for cyber protection.
Machine learning against advanced email phishing
A sophisticated, carefully prepared phishing email can be an effective way for attackers to trick a specific organisation or user. Attackers disguise their messages as emails from new online services and exploit popular events, even the coronavirus pandemic. In the first quarter of 2020, many emails circulated with requests to transfer money to help combat COVID-19. Through the business email compromise (BEC) technique, criminals gain employees’ trust via email correspondence. They pose as a third party, a contractor or even a colleague and persuade targets to do what the criminals want.
To protect users from such tricky attacks, a security solution should quickly analyse all parameters of the email, including its content and technical characteristics, to determine whether it is safe to open. Machine learning can take care of this.
In this case, two ML models are needed. One model automatically analyses the technical parameters of emails (such as technical headers). It trains on hundreds of millions of metadata records from real emails and learns to recognise the combinations of technical traces that indicate a malicious email. But this alone is not enough to reach a verdict.
The second model detects the malicious nature of an email based on its content. To achieve the desired emotional effect, attackers use emotive language and a clear call to action (e.g. “your parcel couldn’t be delivered, update your data here”) in their text. The model recognises words and phrases typical of phishing letters.
The results of the two models are then correlated to reach the final verdict. If the letter is judged to be phishing, the user is saved from opening it.
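The two-model idea can be illustrated with a small sketch. This is not Kaspersky’s implementation: the real models are trained on large datasets, whereas the two scoring functions below are hand-written stand-ins, and the phrase list, header checks, weights and threshold are all assumptions chosen for illustration. The point it shows is that neither score alone is enough to cross the threshold.

```python
from dataclasses import dataclass

# Hypothetical phrases, picked only for illustration.
URGENT_PHRASES = ("couldn't be delivered", "update your data", "verify your account")


@dataclass
class Email:
    headers: dict
    body: str


def header_score(email: Email) -> float:
    """Stand-in for the model trained on technical metadata (0.0 to 1.0)."""
    score = 0.0
    sender = email.headers.get("From", "")
    reply_to = email.headers.get("Reply-To", sender)
    # A Reply-To domain that differs from the From domain is one suspicious trace.
    if sender.split("@")[-1] != reply_to.split("@")[-1]:
        score += 0.5
    # A failed SPF check recorded in the Authentication-Results header is another.
    if "spf=fail" in email.headers.get("Authentication-Results", ""):
        score += 0.5
    return score


def content_score(email: Email) -> float:
    """Stand-in for the model trained on message text (0.0 to 1.0)."""
    text = email.body.lower()
    hits = sum(phrase in text for phrase in URGENT_PHRASES)
    return min(1.0, hits / 2)


def verdict(email: Email, threshold: float = 0.6) -> str:
    """Combine both scores; one model alone contributes at most 0.5."""
    combined = 0.5 * header_score(email) + 0.5 * content_score(email)
    return "phishing" if combined >= threshold else "clean"


suspicious = Email(
    headers={
        "From": "support@shop.example",
        "Reply-To": "helpdesk@other.example",
        "Authentication-Results": "mx.example; spf=fail",
    },
    body="Your parcel couldn't be delivered, update your data here",
)
ordinary = Email(headers={"From": "colleague@corp.example"}, body="Lunch at noon?")
print(verdict(suspicious), verdict(ordinary))
```

In a production system each scoring function would be replaced by a trained classifier, and the way the two results are combined would itself be tuned on labelled data.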
Machine learning against mobile threats for Android
In 2020, Kaspersky researchers detected two million more mobile threats than in 2019, bringing the total to more than five million. One of the key tasks in mobile protection is to guard against unknown malicious objects that have only recently appeared in the wild.
On iOS devices, the installation of apps for a wide audience is only possible from the App Store, which is strictly moderated by Apple. On Android devices, users can install apps from a variety of sources and app markets. Unfortunately, cybercriminals sometimes exploit this by distributing malware disguised as games, useful software, porn, and so on. ML is needed to detect these threats effectively and quickly.
An ML agent on a user’s device scans every app as it is being downloaded for specific features, such as the access permissions it requires or the number and size of its internal structures. This metadata is sent to a cloud-based ML model, which decides whether the set of parameters marks the app as malicious. The model then sends back its verdict, and the protection product on the device decides whether to block the app’s download and installation.
This ML analysis requires far more computing resources than a mobile device has available, which is why it is performed in the cloud.
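The division of labour between device and cloud might look roughly like the sketch below. Everything in it is an assumption for illustration: the `AppMetadata` fields, the list of risky permissions, and especially `cloud_classify`, which is a toy scoring rule standing in for a trained cloud model and is called as a local function rather than over the network.

```python
from dataclasses import dataclass

# Illustrative subset of Android permissions treated as risky in this sketch.
RISKY_PERMISSIONS = frozenset(
    {"SEND_SMS", "READ_SMS", "BIND_ACCESSIBILITY_SERVICE", "SYSTEM_ALERT_WINDOW"}
)


@dataclass
class AppMetadata:
    permissions: frozenset
    dex_size_kb: int
    num_classes: int


def extract_features(meta: AppMetadata) -> list:
    """Runs on the device: cheap to compute, small to transmit."""
    return [
        len(meta.permissions & RISKY_PERMISSIONS),  # risky permissions requested
        len(meta.permissions),                      # total permissions requested
        meta.dex_size_kb,                           # size of an internal structure
        meta.num_classes,                           # count of internal structures
    ]


def cloud_classify(features: list, threshold: float = 0.5) -> dict:
    """Stand-in for the cloud-side model; a real one would be trained, not hand-written."""
    risky, total, size_kb, _classes = features
    score = 0.3 * risky
    # A tiny app that asks for many permissions is treated as suspicious here.
    if size_kb < 100 and total > 10:
        score += 0.3
    return {"malicious": score >= threshold, "score": score}


def on_download(meta: AppMetadata) -> str:
    """Device-side decision based on the cloud response."""
    response = cloud_classify(extract_features(meta))
    return "block" if response["malicious"] else "allow"


sms_trojan = AppMetadata(frozenset({"SEND_SMS", "READ_SMS", "INTERNET"}), 50, 20)
puzzle_game = AppMetadata(frozenset({"INTERNET"}), 4000, 900)
print(on_download(sms_trojan), on_download(puzzle_game))
```

Only a handful of numbers leave the device, which keeps the on-device agent light while the expensive classification runs server-side.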
Machine learning to prevent plant breakdowns
Equipment malfunctions, misconfigurations, human error, or hacker attacks can all cause the breakdown of industrial machinery. If any of those happen, it’s better to detect the deviation in production processes as soon as possible. Otherwise, an incident can get out of control, leading to, at best, downtime, or at worst, an accident.
The problem is that the early symptoms of an incident are virtually impossible to detect through threshold monitoring or by human operators. When thousands of telemetry readings come in every second, even an experienced operator can focus on only a few patterns and will overlook the rest.
This is where machine learning for anomaly detection (MLAD) comes in. The neural network is able to analyse a massive amount of telemetry data, absorb all aspects of the machine’s operation and thoroughly learn how the machine behaves under normal conditions – such as how the signals change over time and how they correlate with each other.
When training of the ML model is complete, the model switches to anomaly detection mode. It then receives telemetry in real time and, if the divergence between the model and the observation rises above a certain threshold, the machine’s behaviour is deemed anomalous, and an alarm is raised. The model gives an early warning of attacks, malfunctions, or mismanagement before any other instrument can spot the problem. This way, it helps to minimise damage and prevent a plant’s breakdown.
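The train-then-detect cycle can be sketched in a few lines. Note the substitution: instead of a neural network, this example uses a simple moving-average forecaster for a single signal, purely to keep the sketch short. The class name, window size and the multiplier `k` are assumptions; what carries over from the description above is the principle of learning normal behaviour first and then alarming when the divergence between prediction and observation exceeds a threshold.

```python
from statistics import mean, pstdev


class SimpleAnomalyDetector:
    """Toy detector: predicts the next reading as the mean of the last few."""

    def __init__(self, window: int = 3, k: float = 4.0):
        self.window = window
        self.k = k
        self.threshold = None

    def _predict(self, history: list) -> float:
        return mean(history[-self.window:])

    def _divergence(self, series: list, i: int) -> float:
        # Gap between what the model expected and what was observed.
        return abs(series[i] - self._predict(series[:i]))

    def fit(self, normal_series: list) -> "SimpleAnomalyDetector":
        """Training mode: learn how large the divergence is under normal operation."""
        errors = [self._divergence(normal_series, i)
                  for i in range(self.window, len(normal_series))]
        self.threshold = mean(errors) + self.k * pstdev(errors)
        return self

    def detect(self, series: list) -> list:
        """Detection mode: indices of readings whose divergence exceeds the threshold."""
        return [i for i in range(self.window, len(series))
                if self._divergence(series, i) > self.threshold]


normal = [20.0, 20.1, 19.9, 20.0, 20.2, 19.8, 20.0, 20.1, 19.9, 20.0]
live = [20.0, 20.1, 19.9, 20.0, 27.0, 20.1]  # the reading at index 4 is a spike

detector = SimpleAnomalyDetector().fit(normal)
alarms = detector.detect(live)
print(alarms)
```

Because the spike also distorts the moving average, the reading right after it can be flagged as well. A model that learns correlations across many signals, as described above, would be far more selective than this single-signal stand-in.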
Machine learning against advanced cyberattacks
In some cases, machine learning techniques can be used to complement human intelligence against advanced threats – such as in managed detection and response (MDR) services.
Within an MDR service, an external security operations centre (SOC) helps business customers respond to advanced cyberattacks. It receives alerts from the customer’s endpoints and investigates them to find traces of attacks, which it then reports back to the customer with response actions. SOC experts analyse some threat samples manually, but given the scale, they physically cannot look at each and every alert.
Machine learning can take on this burden. It automatically filters out alerts that are of no interest to SOC analysts, sets alert importance levels and gives hints for analysis. This frees up analysts’ capacity and minimises the mean time to respond.
During the training mode, the model analyses alerts and scores them. The higher the score, the greater the probability that the alert should be reviewed by experts. Alerts with scores above a certain threshold are sent to SOC analysts who label them manually and enrich training data for the ML model.
In combat mode, the model resolves some alerts and prioritises the rest for manual processing: the most important, those with the highest scores, are put at the head of the queue. This queue strategy reduces the average processing time of alerts and allows the service to deliver the best possible service-level agreement (SLA).
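The queue strategy can be sketched with a priority queue. This is an illustration under assumptions rather than the MDR service’s actual logic: the `triage` and `next_alert` helpers, the auto-resolve cut-off and the alert fields are hypothetical, and the scoring function is passed in so that a trained model could take its place.

```python
import heapq


def triage(alerts: list, score_fn, auto_resolve_below: float = 0.2):
    """Split alerts into an analyst queue (highest score first) and auto-resolved ones."""
    queue, resolved = [], []
    for order, alert in enumerate(alerts):
        score = score_fn(alert)
        if score < auto_resolve_below:
            resolved.append(alert)  # of no interest to analysts
        else:
            # heapq is a min-heap, so the score is negated to pop the highest first.
            # The arrival order breaks ties, so the alert dicts are never compared.
            heapq.heappush(queue, (-score, order, alert))
    return queue, resolved


def next_alert(queue: list):
    """Hand the analyst the most important remaining alert and its score."""
    neg_score, _order, alert = heapq.heappop(queue)
    return alert, -neg_score


alerts = [
    {"id": "a", "score": 0.1},
    {"id": "b", "score": 0.9},
    {"id": "c", "score": 0.5},
]
# In this toy example each alert already carries the score a model would produce.
queue, resolved = triage(alerts, score_fn=lambda alert: alert["score"])
first, first_score = next_alert(queue)
second, second_score = next_alert(queue)
print(first["id"], second["id"], [alert["id"] for alert in resolved])
```

Analysts’ labels on the alerts they review would then be fed back as training data, as described for the training mode above.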
These are just a few interesting cases of how machine learning serves cybersecurity goals. But we at Kaspersky believe that the range of uses will continue to expand. Developing ML techniques in products is one of the priorities for our R&D team because this can make cyber protection more intelligent, faster and more efficient.