4 Strategies for Finding and Interpreting Data Anomalies


In this special guest feature, Marius Moscovici, founder and CEO of Metric Insights, explores data anomalies and what they mean for business intelligence. He also provides business leaders with four strategies for how to find and interpret their data anomalies. Marius founded the company in 2010 to transform the way business intelligence is performed so organizations of any size can quickly and easily deploy powerful analytics. Marius has more than 20 years of experience in analytics and data warehousing and was previously the co-founder and CEO of Integral Results, a leading business intelligence consulting company that was acquired by Idea Integration.

When people think of a data anomaly, they often think of an error: a random blip outside the normal range that can be noted and then discarded. A data anomaly, to many, is little more than a data defect.

In the world of business data intelligence, however, this view is not only usually wrong, but in many cases, it can also be damaging. A data anomaly is often much more than a blip — it’s a signal.

Take, for instance, a typical computer network. A data anomaly in this case would be a flurry of activity that falls well outside what is considered normal. For example, a contractor logging in repeatedly at 3 a.m. and attempting to reach company web properties that he or she isn’t authorized to access is more than odd. Noticing it could lead to early detection of a possible security breach, preventing some serious headaches.

Or, concerning sales, if a company sells 500 widgets a month from February to November, 5,000 in December, and 50 in January, then that company can use those differences to understand why it sells more or less in each month.

A data anomaly in these cases is not a data defect, but a path to better understanding behavior — either of employees or of key performance indicators. An anomaly may seem infinitesimal, but if an important one goes unnoticed, it could have serious ramifications.
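The widget example can be made concrete with a few lines of Python. This is only a minimal sketch: it assumes the figures mean 500 widgets per month from February to November, and the function name `flag_unusual_months` and the `factor` of 2 are hypothetical choices for illustration, not anything a particular tool prescribes. It treats the median month as "normal" and reports any month that sells more than twice, or less than half, that amount.

```python
from statistics import median

# Monthly widget sales, assuming 500 a month from February to November,
# 5,000 in December, and 50 in January.
monthly_sales = {
    "Jan": 50, "Feb": 500, "Mar": 500, "Apr": 500,
    "May": 500, "Jun": 500, "Jul": 500, "Aug": 500,
    "Sep": 500, "Oct": 500, "Nov": 500, "Dec": 5000,
}

def flag_unusual_months(sales, factor=2.0):
    """Return {month: ratio to the median} for months far from the median.

    `factor` is an illustrative tolerance: a month is flagged when it sells
    more than `factor` times the median, or less than 1/`factor` of it.
    """
    baseline = median(sales.values())
    flagged = {}
    for month, units in sales.items():
        if units > baseline * factor or units < baseline / factor:
            flagged[month] = units / baseline
    return flagged

unusual = flag_unusual_months(monthly_sales)
for month, ratio in unusual.items():
    print(f"{month}: {ratio:.1f}x the typical month")
```

With these numbers, only January and December are flagged. The flags themselves explain nothing; they simply point to the months that are worth asking "why?" about.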

Separating the Anomalies

Unfortunately, even in companies with data intelligence solutions, anomalies don’t always make themselves known. Or, if they do, they do so at such a constant rate that alert fatigue begins to set in, and the alerts to anomalies become numbing instead of edifying.

To begin analyzing and interpreting anomalies in a way that’s actually useful, you first need to have the software that can identify anomalies, recognize what is a useful anomaly and what isn’t, and point them out at the right time. Not all software is that smart.

Once you do have the software with the right capabilities in place, you can begin to go about identifying useful data anomalies. Here are four ways to recognize them:

  1. Create alerts. Combine an alert system with data analytics. This adds a layer of algorithmic learning so you’re not just analyzing normal trends, but you’re also informed when important things change.
  2. Profile normal behavior. It stands to reason that to know which behavior is abnormal, you need to know what is normal. Otherwise, you end up with a lot of noise and no way to find the signal. You should have data scientists do this with data, of course, but you should also have a sense of what specifically defines “normal” in your own working life.
  3. Include anomaly detection in every aspect of research. This shouldn’t just be restricted to areas you deem important. By researching anomalies across the board, your data scientists can better determine which anomalies are useful and which are just blips.
  4. Make sure your data scientists are on the lookout. Good data intelligence software can do a lot of the work, but it still has a ways to go before the process can be fully automated. Ensure that your data scientists are vigilant when it comes to data anomalies and that their research methods are sound.
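The first two strategies, profiling normal behavior and alerting on departures from it, can be sketched together. The Python below is one simple way to do it, not a description of how any particular product works: it takes the mean and standard deviation of a trailing window as the "normal" profile and raises an alert when a new value falls too far outside it. The function name `detect_alerts`, the sample login counts, and the `window`, `threshold`, and `cooldown` values are all assumptions made for illustration. The cooldown is a crude guard against alert fatigue, suppressing repeat alerts that arrive right after one has already fired.

```python
from statistics import mean, stdev

def detect_alerts(values, window=7, threshold=3.0, cooldown=3):
    """Return the indices of values that depart from the trailing baseline.

    window:    how many prior points define "normal" (illustrative default)
    threshold: how many standard deviations count as abnormal
    cooldown:  how many points after an alert to stay quiet
    """
    alerts = []
    last_alert = None
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: no spread to measure against
        score = abs(values[i] - mu) / sigma
        in_cooldown = last_alert is not None and i - last_alert <= cooldown
        if score > threshold and not in_cooldown:
            alerts.append(i)
            last_alert = i
    return alerts

# Hypothetical daily login counts with a two-day spike.
daily_logins = [20, 22, 19, 21, 20, 23, 21, 20, 95, 97, 22, 21]
print(detect_alerts(daily_logins))
```

For this sample data, the sketch flags only index 8, the first day of the spike. Real data usually has seasonality and trends that a plain trailing average cannot capture, which is one reason the fourth strategy, keeping data scientists in the loop, still matters.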

In business, a data anomaly is often more than just an accident. It can inform you about important changes or new trends on the horizon, or it could even act as an early warning sign. As long as you know how to detect it, and how to separate the signal from the noise, you can put these little pieces of data to big use.

 
