Credit Scoring and Back Trading/Testing


This article is the third in an editorial series that aims to give enterprise thought leaders direction on using big data technologies to support analytics capabilities that work more independently and effectively, at a time when companies are working to increase the value of their corporate data assets.

Last week’s article explored the benefits that the retail banking industry can achieve by adopting big data technologies.

Credit Scoring

Historically, the loan and credit scoring methodology employed by credit bureaus and used by banks and other financial institutions has been based on a five-component composite score comprising (i) past loan and credit applications, (ii) on-time payments, (iii) types of loan and credit used, (iv) length of loan and credit history, and (v) credit capacity used. Until the big data revolution, this approach saw little innovation, which made scoring a commodity.
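To make the idea of a composite score concrete, here is a minimal sketch that combines the five components into a single number as a weighted sum. The component names, the 0-to-1 scale, and the equal weights are all assumptions for illustration; the article does not say how credit bureaus actually weight each component.

```python
# Hypothetical weights: the five components come from the text above, but
# their relative importance is NOT given there. Equal weights are assumed
# purely for illustration.
WEIGHTS = {
    "past_applications": 0.2,
    "on_time_payments": 0.2,
    "credit_types": 0.2,
    "history_length": 0.2,
    "capacity_used": 0.2,
}


def composite_score(components):
    """Return the weighted sum of five sub-scores, each expected in [0, 1]."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Example applicant with made-up sub-scores.
applicant = {
    "past_applications": 0.5,
    "on_time_payments": 1.0,
    "credit_types": 0.5,
    "history_length": 0.5,
    "capacity_used": 0.0,
}
score = composite_score(applicant)
```

Because the formula is fixed and uses only a handful of inputs, any two providers applying it would produce much the same result, which is one way to read the "commodity" point above.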

Today, new technology platforms have opened the door to change in credit scoring, and big data scoring services are becoming available. Loan and credit decisions are made in seconds by automated processes based on machine learning algorithms. The breadth of data that can be used for credit scoring has expanded considerably. For each scoring decision, big data applications collect data from a broad range of external sources, including social networks, e-commerce data, economic databases, and micro-geographical statistics. In some cases, big data scoring technology can use upward of 10,000 data points in real time to assess a customer’s creditworthiness.

As an extension to traditional scoring services, new technology companies using big data scoring are providing scoring-as-a-service options for online loan and credit decisions. This type of solution is provided to banks, debt collectors, e-commerce sites, leasing and other financial companies. These systems can integrate into the customer’s existing systems and/or website.

For a use case showing how big data has transformed credit scoring, see “Credit Scoring at Novum Bank.” To assess credit applications, Novum Bank in the Netherlands recently started using Dell STATISTICA, an analytical software solution. In the interview, Chief Credit Risk Officer Joop Bruinzeel talks about micro-credit, the importance of credit scoring, and the use of analytical software.

Back Trading & Testing 

Another area of opportunity for big data technology is building back-testing software. Back-testing refers to the process of testing a trading strategy, investment strategy, or predictive model using existing historical data. Back-testing is considered a special type of cross-validation applied to time series data. The goal of back-testing is to estimate the performance of a strategy as if it had been employed during a prior period. This requires simulating past conditions in sufficient detail, so one limitation of back-testing is the need for detailed historical data.
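The sketch below illustrates what "cross-validation applied to time series" can look like in practice, using a walk-forward scheme: the model or strategy is fitted on one window of history and evaluated only on the period that immediately follows. The function name and window sizes are illustrative assumptions, not part of any particular back-testing product.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test window starts right after its training window, so a strategy
    is never evaluated on data older than the data it was fitted on.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide the whole window forward by one test period


# Example: 10 observations, fit on 4, evaluate on the next 2.
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
```

Unlike ordinary k-fold cross-validation, the folds are never shuffled, which avoids leaking future information into the training window.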

Major markets produce massive numbers of messages per day; US financial markets, for example, produce around 50 billion data points per day, so processing them is an extremely computationally intensive task. Although back-testing is computationally intensive, it is also easy to parallelize. Multiple trading days can be back-tested simultaneously, making it an ideal candidate for big data techniques such as MapReduce. To complete back-testing within a reasonable time in an environment like the US markets, big data architectures like Hadoop are an invaluable tool.
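As a rough illustration of why this workload parallelizes well, the sketch below expresses a back-test in map and reduce steps: each trading day is processed independently (map), and the per-day results are then summed (reduce). It runs in a single process with Python's built-ins and is not Hadoop code; the toy strategy, the function names, and the sample prices are assumptions made for the example.

```python
from functools import reduce


def backtest_day(day_prices):
    """Map step: run a toy strategy on one day's prices.

    Placeholder strategy (an assumption for illustration): buy one unit at
    the first price of the day and sell at the last. Returns (pnl, trades).
    """
    if len(day_prices) < 2:
        return (0.0, 0)
    return (day_prices[-1] - day_prices[0], 1)


def combine(a, b):
    """Reduce step: merge two partial results by summing P&L and trade counts."""
    return (a[0] + b[0], a[1] + b[1])


def backtest(days, map_fn=map):
    """Apply the map step to every day, then fold the results together.

    map_fn defaults to the built-in map. Because no day depends on another,
    a parallel map with the same call shape could be passed in instead.
    """
    per_day = map_fn(backtest_day, days)
    return reduce(combine, per_day, (0.0, 0))


# Three made-up trading days of intraday prices.
days = [[100.0, 101.0, 103.0], [103.0, 102.0], [102.0]]
total_pnl, total_trades = backtest(days)
```

The key property is that `backtest_day` needs nothing but its own day's data, so the days can be spread across as many workers as are available, with only the small per-day summaries brought back together at the end.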

The Dell™ | Cloudera™ Solution powered by Intel is a good option for addressing the needs of back-testing software. Cloudera Apache Hadoop Distribution (CDH) Enterprise consists of tools for institutions that gather insights from vast data volumes and varied data types and find that managing large volumes of unstructured data exceeds the capacity and capabilities of traditional data intelligence systems. Aside from MapReduce, financial industry firms can perform interactive analysis on any data stored in Hadoop HDFS and HBase with Cloudera Impala.

Intel has a long history with Hadoop: it released version 2.0 of its own Hadoop distribution in 2012. By aligning the Cloudera and Intel roadmaps in 2014, Intel created the platform of choice for big data analytics, with the aim of accelerating industry adoption of the Hadoop data platform and enabling companies to mine their data for insights that inform the business.


Next week’s article will look at Adopting Big Data for Finance. If you prefer, the complete insideBIGDATA Guide to Big Data and Finance is available for download as a PDF from the insideBIGDATA White Paper Library, courtesy of Dell and Intel.



