Big Data Machine Learning: Telco Fraud Detection Points the Way

In this special guest feature, Padraig Stapleton, VP of Engineering at Argyle Data, discusses how mobile operators like Verizon, BT, and others can use big data and machine learning technologies to detect and stop cyber gangs, in real time, from stealing billions of dollars via global mobile fraud. Padraig Stapleton brings years of industry-leading management and technical expertise across a number of areas including mobile telecommunications and big data. Most recently he was VP of Engineering and Operations for the Big Data group at AT&T, responsible for development of its big data platform. Before that he was involved in a number of successful startups as VP of Engineering, building development teams and delivering innovative products to the marketplace. Padraig has held senior leadership roles in various companies including Telephia, which was acquired by Nielsen, and InterWave Communications.

Invisible and instant:  those are the two key characteristics of mobile fraud today. International crime rings are making millions of dollars by using highly sophisticated scams across multiple geographies, and disappearing before the operator knows the attack is happening. Most major attacks today are ‘fraud cocktails’: unpredictable mixtures of several fraud types, striking with unprecedented volume and velocity.

Mobile operators responding to the Communications Fraud Control Association’s 2015 survey estimated that they lost at least US $38 billion to fraud. More worryingly, the majority said they believed even more fraud was getting past their defenses but could not pinpoint what it was or how it happened.

The mobile industry is beginning to acknowledge a need for detection methods that are able to adapt to the fast pace of evolving network crime and usage patterns. It has become increasingly evident that traditional, rules-based systems with pre-set thresholds are no longer the answer. Mobile crimes morph far more rapidly than analysts can write rules. To write a rule, a fraud analyst needs to know about the fraud type. It can take days or weeks to analyze a previously unknown attack, during which time huge revenue drains continue to occur.

A combined big data and machine learning approach is proving one of the most promising of the new wave of solutions. Already in use in major service provider networks, the machine learning strategy is proving far more effective in fighting fraud, delivering 350% better results than rules-based systems and allowing analysts to shut down attacks in moments rather than hours or days. The key is to catch attacks early and stop them fast.

It works by applying unsupervised machine learning at massive scale to huge lakes of operator and customer data, to determine the characteristics of normal traffic. Anomalies are identified instantly and the results are presented to analysts in real time, both as alerts and as visual representations. Fraud specialists can then easily determine whether an attack is happening. Not all anomalies are fraud, but ALL fraud is anomalous. Fraud is usually associated with high bursts of activity and/or long call times.
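As a rough illustration of the idea of learning "normal" from the data itself and flagging what deviates from it, here is a minimal Python sketch. It is not the method used in any operator's system: the robust z-score (median and median absolute deviation), the function names, the threshold of 3.5, and the sample figures are all assumptions made for this example.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, in scaled MAD units.

    The median and the median absolute deviation (MAD) are estimated from
    the data itself, so no hand-written rule or labelled fraud case is needed.
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical, so the MAD carries no
        # information about spread; this sketch then flags nothing.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(feature_by_subscriber, threshold=3.5):
    """Return subscribers whose feature value is unusually high.

    `threshold` is an assumed cut-off, not a recommended setting.
    """
    subscribers = list(feature_by_subscriber)
    scores = robust_z_scores([feature_by_subscriber[s] for s in subscribers])
    return [s for s, score in zip(subscribers, scores) if score > threshold]


# Made-up calls-per-hour figures for six hypothetical subscribers.
calls_per_hour = {"a": 4, "b": 5, "c": 6, "d": 5, "e": 4, "f": 240}
suspects = flag_anomalies(calls_per_hour)
```

The same function could be run on average call duration to cover the "long call times" signal. Either way the output is only a list of candidates for an analyst to review, since not every anomaly is fraud.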

A cyber gang can set up, go to work, and disappear in 24 hours or less, before an operator knows the attack is happening. If thousands of users in Mexico start calling a number in Cuba or Latvia, it’s very likely that a Wangiri, or call-back premium-rate, scam is happening. Gangs also take advantage of the incentive plans that operators use to generate revenue from business travelers, inflating traffic, dramatically overusing the plans, or driving traffic to other international revenue share scams.
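The Wangiri signal described above, many different subscribers suddenly calling the same number, can be sketched in a few lines of Python. This is a simplified illustration only: the record layout, the function name, the one-hour window, and the fixed caller count are assumptions for the example, and in a machine learning system the cut-off would come from a learned baseline of normal traffic rather than a hard-coded value.

```python
from collections import defaultdict


def destinations_with_caller_bursts(records, window_seconds=3600, min_callers=1000):
    """Find destination numbers called by many distinct callers in one window.

    `records` is an iterable of (timestamp_seconds, caller, destination)
    tuples. Calls are grouped into fixed, non-overlapping time windows.
    Both default values are placeholders, not recommended settings.
    """
    callers = defaultdict(set)
    for timestamp, caller, destination in records:
        window = int(timestamp // window_seconds)
        callers[(destination, window)].add(caller)
    flagged = {
        destination
        for (destination, _window), who in callers.items()
        if len(who) >= min_callers
    }
    return sorted(flagged)


# Made-up records: five different callers ring the same number within an hour,
# while another number receives a single call.
records = [(i * 60, f"subscriber-{i}", "dest-A") for i in range(5)]
records.append((600, "subscriber-1", "dest-B"))
bursts = destinations_with_caller_bursts(records, min_callers=5)
```

Counting distinct callers rather than total calls keeps one subscriber who redials many times from looking like a mass call-back event.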

Lately, crime rings have started launching data-based attacks on services like WhatsApp. Even traditional Wangiri attacks are morphing. Instead of a simple missed call tempting the user to call back, at cost, criminals are using social engineering messages encouraging victims to call a premium rate number – for example: “Trying to deliver flowers to your wife. Call to confirm delivery time.”  Analysts need tools that help them detect emerging, morphing fraud.  Machine learning does this.

Machine learning and visualizations are highly complementary. Humans often want to see visualizations that convey data for extra understanding and insight into the fraud technique or method. Anomaly detection on its own shows outliers but loses useful context. When you combine the two you have a very powerful detection and analysis approach.

This can be extended to any area where data analytics are used. Fraud is a natural first deployment for big data machine learning applications, since the ROI is easily and immediately proven, providing a model for other usage scenarios. More and more fraud will occur on the data network in the future, so gaining visibility into the characteristics of data usage will be paramount. Because of the vast amount of data flowing across telecoms networks, big data analytics capabilities are essential. Key to operators’ success in this area will be the ability to tap into an enterprise data lake.

Machine learning analytics can be applied across any industry or organization to detect issues ranging from mechanical breakdown to cyber threats. This will be vital as IoT devices proliferate. Beyond IoT, data-driven applications will change the face of most industries on the planet. Telecommunications companies are the foremost users of big data, and their experience with machine intelligence and machine learning will forge a path for other industries attempting to harness information to protect and grow their businesses.

