By Neelesh Kripalani
Artificial Intelligence (AI) could be one of the most disruptive technologies the world has seen in decades. Virtually every industry can benefit from AI, and its adoption rate reflects widespread confidence in its potential. As preventing ransomware has become a priority for many organisations, they are turning to AI as a defence mechanism. However, like any other technology, AI is a two-sided coin. Threat actors are also turning to AI and machine learning (ML) to launch their attacks. One problem in particular threatens to undermine the application of AI and lets adversaries slip past digital defences undetected. It is called poisoned AI, or data poisoning.
What is data poisoning?
Machine learning (ML) is a subset of artificial intelligence. Data poisoning targets the ML part of the process. It is a form of manipulation that involves corrupting the information used to train machines. Simply put, data poisoning exploits training data to mislead ML algorithms.
How does it work?
Computers can be trained to correctly categorise information from reams of data. For instance, a computer might be fed 1,000 images of various animals, labelled by species and breed, before it is tasked with recognising an image as a dog. The system may never have seen that particular picture of a dog, but given enough labelled examples of different animals, it should be able to recognise it. In cybersecurity, the same approach is used. An accurate prediction requires a huge number of correctly labelled samples. Because even the biggest cybersecurity firms can collect only a limited amount of data on their own, many reportedly crowdsource it. This increases the diversity of the samples and the chances of detecting malware. But the approach carries a risk: professional hackers can manipulate crowdsourced data by labelling it incorrectly.
Threat actors carefully craft bad samples that are labelled as good ones, and then add these samples to a larger batch of data. This tricks the AI/ML model into concluding that a snippet of software resembling a bad example is harmless. Such tampering with the data used to train machines provides a virtually untraceable way to circumvent AI-powered defences.
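The mislabelling attack described above can be illustrated with a toy example. The sketch below is only an assumption-laden illustration, not how any real security product works: the two-number feature vectors, the sample values and the simple nearest-neighbour classifier (`knn_predict`) are all hypothetical and chosen for readability.

```python
from collections import Counter
from math import dist


def knn_predict(train, point, k=3):
    """Majority vote among the k training samples closest to `point`."""
    nearest = sorted(train, key=lambda sample: dist(sample[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical feature vectors: benign software sits near (0, 0),
# known malware sits near (5, 5).
clean_data = [
    ((0.0, 0.0), "benign"), ((0.5, 0.2), "benign"),
    ((0.1, 0.6), "benign"), ((0.4, 0.4), "benign"),
    ((5.0, 5.0), "malicious"), ((5.3, 4.8), "malicious"),
    ((4.7, 5.2), "malicious"), ((5.1, 5.4), "malicious"),
]

# The malware the attacker eventually wants to slip past the model.
target = (4.0, 4.0)

# Poisoned samples: they resemble the target but carry a "benign" label.
poison = [
    ((4.0, 4.1), "benign"), ((4.1, 4.0), "benign"), ((3.9, 3.9), "benign"),
]

before = knn_predict(clean_data, target)
after = knn_predict(clean_data + poison, target)
print(before, after)  # malicious benign
```

With clean data, the target's closest neighbours are all known malware, so it is flagged. After only three mislabelled samples are mixed in, the closest neighbours are the attacker's own, and the same file is waved through.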
How to combat poisoned AI?
To stay safe, organisations need to ensure that their data is as clean as possible. That means the scientists who develop AI models should regularly check that all the labels in their training data are accurate. Some cybersecurity experts have also suggested adding a second layer of AI and ML algorithms to pinpoint errors in the training data. Sample size matters a great deal in AI, but companies may need to train their systems on fewer samples to make sure all the data is clean.
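One simple way to picture such a second layer is a check that compares each sample's label with the labels of the samples most similar to it. The sketch below is a minimal illustration under stated assumptions: the function name `flag_suspicious_labels`, the two-number feature vectors and the sample values are hypothetical, and a neighbour vote is only one of many possible checks.

```python
from collections import Counter
from math import dist


def flag_suspicious_labels(samples, k=5):
    """Return samples whose label disagrees with most of their k nearest neighbours."""
    flagged = []
    for i, (features, label) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]  # leave the sample itself out
        nearest = sorted(others, key=lambda s: dist(s[0], features))[:k]
        majority = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        if majority != label:
            flagged.append((features, label))
    return flagged


# Hypothetical training set: the last two entries look like malware
# but were submitted with a "benign" label.
samples = [
    ((0.0, 0.0), "benign"), ((0.5, 0.2), "benign"), ((0.1, 0.6), "benign"),
    ((0.4, 0.4), "benign"), ((0.2, 0.3), "benign"),
    ((5.0, 5.0), "malicious"), ((5.3, 4.8), "malicious"), ((4.7, 5.2), "malicious"),
    ((5.1, 5.4), "malicious"), ((4.8, 4.7), "malicious"),
    ((4.9, 5.1), "benign"), ((5.2, 5.1), "benign"),
]

flagged = flag_suspicious_labels(samples)
cleaned = [s for s in samples if s not in flagged]
print(len(samples), len(flagged), len(cleaned))
```

Here the two mislabelled entries are set aside for review and the model is trained on the smaller, cleaner set. A check like this is not foolproof: an attacker who submits enough mislabelled samples close together can outvote the honest ones, so it complements regular manual review rather than replacing it.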
The global market for AI cybersecurity is expected to triple by 2028 to $35 bn. But AI is not omnipotent. Hackers are always looking for their next chance. Thus, one should always be proactive in detecting such cyber risks.
(The writer is chief technology officer, Clover Infotech.)