If You Are Relying Mostly On Hadoop to Meet Your Regulatory Compliance Needs, Your Company Could Be in Trouble


In this special guest feature, Stefan Pracht, Senior Vice President, Product Marketing, Axellio, says that although Hadoop may be a strong solution for the big-and-fast data issue, it is becoming constrained by conventional server designs in a world in which institutions are having to deal with billions of records per day. Stefan is a marketing executive with over 20 years’ experience in managing and marketing products and services to Enterprise IT and Service Provider markets globally. He has an extensive networking background, having led product management and marketing teams responsible for network and application monitoring and analysis solutions at Agilent, Fluke Networks, NETSCOUT Systems, and NetAlly, working in Germany, Canada, the UK, and the USA.

Regulators in the US and Europe over the past decade have imposed massive fines on financial institutions for various regulatory-related violations. The importance of implementing big data-driven approaches to improve regulatory compliance is becoming more apparent.  Existing rules demand an ever-increasing quantity of data in order to handle the complexity of today’s financial system.

Big data has become a critical component of ensuring that laws are followed, assets are protected, and consumers are served in ways that they may not have expected. Typically, tools based on Apache Hadoop are used to collect and analyze financial and regulatory data. However, when the data becomes too unwieldy, a common occurrence in today’s data-driven environment, this approach can result in server farms that were never designed to deliver the immense processing power required.

Hadoop’s strength has been its ability to use low-cost servers by partitioning data and distributing it over an almost infinite number of devices for backup and parallel processing.

While this method works well for data with modest input levels, when a torrent of data streams into the data lake, relying on modest, low-speed, high-capacity nodes compromises the financial benefit that Hadoop was intended to provide in the first place. This may significantly increase the cost of compliance even for large financial institutions, let alone smaller firms.
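To make the cost argument concrete, here is a rough sizing sketch. It assumes a cluster must satisfy both an ingest-rate requirement and a storage requirement, and that every block is written three times (the HDFS default replication factor). The function name and all input figures are hypothetical and chosen only for illustration; they are not taken from the testing described in this article.

```python
import math


def nodes_required(ingest_mb_per_s, retained_tb,
                   node_ingest_mb_per_s, node_capacity_tb,
                   replication=3):
    """Estimate cluster size from two independent constraints.

    Returns (nodes_for_throughput, nodes_for_capacity, nodes_needed).
    All inputs are illustrative assumptions, not measured values.
    """
    # Each incoming byte is written `replication` times, so the cluster
    # must absorb that multiple of the raw ingest rate.
    by_throughput = math.ceil(ingest_mb_per_s * replication / node_ingest_mb_per_s)
    # The retained data set is also stored `replication` times.
    by_capacity = math.ceil(retained_tb * replication / node_capacity_tb)
    # The cluster has to satisfy whichever constraint is larger.
    return by_throughput, by_capacity, max(by_throughput, by_capacity)


# Hypothetical workload: 20,000 MB/s sustained ingest, 1,000 TB retained,
# on nodes that each sustain 200 MB/s of writes and hold 48 TB.
print(nodes_required(20_000, 1_000, 200, 48))
```

With these made-up numbers, storage alone would call for 63 nodes, but sustaining the ingest rate calls for 300. When ingest speed rather than capacity sets the node count, most of the purchased capacity sits idle, which is the effect described above.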

An Alternative

Compared to possible high costs and fines for mistakes and oversights, big data analytics are clearly more financially advantageous in the long term. The idea is to build the Hadoop solution on a platform that is unrestricted by the constraints imposed by conventional Hadoop methods, significantly decreasing the number of nodes needed. 

One alternative: consider a server platform that utilizes high-density, all-NVMe SSD-based appliances that can outperform conventional server designs. When correctly developed and deployed, such a setup can enable internal data transfer speeds of up to 60 GBytes/sec. When combined with Hadoop, it becomes a platform capable of consuming high-velocity data using a fraction of the servers needed for an equivalent conventional Hadoop installation, reducing operating costs and complexity.

A Proof of Concept Lab in Colorado Springs, CO, in collaboration with a large financial institution, tested such a configuration, and the results were extremely positive. Through this testing, the team was able to run many of the general Hadoop benchmarks provided in the Hortonworks Hadoop distribution. When compared to a typical reference implementation, this novel Hadoop solution obtained superior results.

In this specific Hadoop use case, the capacity to ingest high-velocity data into the Hadoop data lake and analyze it there was critical for significantly lowering the number of servers required to meet regulatory standards.

Compliance with regulatory requirements in today’s data-driven world will only grow more difficult and costly with conventional server methods.  Hadoop systems have traditionally been used to facilitate searches over huge amounts of data in order to generate compliance reports, conduct regulatory stress testing, and identify fraud.  Since laws have changed, though, it is no longer sufficient to do a simple search of relatively unchanging data.

Modern financial institutions now have enormous amounts of data that must flow into the data lake while doing predictive analytics for a variety of purposes, including compliance, cash management, obtaining a 360-degree picture of the client, and trading analytics.

Although Hadoop may be a strong solution for the big-and-fast data issue, it is becoming constrained by conventional server designs in a world in which institutions are having to deal with billions of records per day.

Organizations may significantly reduce the expenses and equipment required to conduct these critical activities by using a server platform that utilizes a high-density, all-NVMe SSD-based appliance as the basis of the solution.  It should be considered as a viable alternative to a Hadoop-only solution architecture.
