The Big Data Revolution: All About Big Data, Why and How It's Relevant for Your Business in 2015


In this special guest feature, Jenny Richards of RemoteDBA.com provides a high-level view of the big data industry and gives a few predictions for 2015. Jenny Richards is a content marketer for RemoteDBA.com, one of the country's leading providers of remote DBA support services.

What is Big Data?

In a nutshell, big data is a term coined to describe the rapid growth in the availability of structured as well as unstructured data, usually within a business. Big data has come under a great deal of scrutiny lately, probably because it is becoming as essential to a business – and its surrounding community – as the Internet now is.

The reason for this is simple: when you have more data, your analysis can be more accurate. This increase in accuracy in turn leads to better, more confident executive decision-making, which leads to higher operational efficiency, reduced risk, greater cost-effectiveness and, ultimately, increased profitability. Where the rubber meets the road, that's what every organization wants.

Big Data Characteristics

As far back as 2001, big data was already under the spotlight and was characterized according to the following attributes – often called the 'three Vs' – which are still applicable today:

Volume

The volume of data in an organization increases for many reasons: data from consumer transactions, unstructured data from social platforms, and machine-to-machine or sensor data. Given the decreasing cost of storage, the most essential challenge is no longer storing such large volumes but determining their relevance through germane analytical methods.

Velocity

Large organizations especially have data flowing in at extremely high speeds, and this data must be processed accordingly to prevent systemic clogs.

Variety

Data comes in formats of all kinds – structured and unstructured – all of which must be effectively managed, merged and governed as needed for operational efficiency.

Why Big Data is a Big Deal

Large amounts of data are now flowing in from various sources, but that in itself is not the good news. The good news is that this data can now be made useful, thanks to improvements in computational and statistical methods and the development of algorithms for a range of applications. In addition, new ways of linking data sets have been developed, as well as creative data visualization techniques, all of which are essential to analyzing this data for various uses.

Big data analysis is spreading into virtually every field: science and academia, law, industry, government and even non-profits. Modern data analysis is revolutionizing schools of thought and previously held notions, providing useful insights from the volumes of data now available in every field. The bottom line is this: given sufficient, quantifiable information on any subject, modern statistical techniques will consistently outperform individual judgment.

Big data is no stranger to marketing, where it is used to build recommendation engines like those Amazon and Netflix use to suggest purchases a consumer might be interested in based on previous purchases and interests. There are also many applications of big data in the public sector: identifying crime hot zones, genomic analysis to improve the drought resistance of crops, identification of evolutionary patterns and disease resistance, and so on.
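As a rough illustration of the idea behind such recommendation engines – not Amazon's or Netflix's actual systems, which are far more sophisticated – here is a minimal co-occurrence sketch in Python. The purchase data and function names are hypothetical:

```python
from collections import Counter

# Hypothetical purchase history; users and items are illustrative only.
purchases = {
    "alice": {"book", "lamp", "pen"},
    "bob":   {"book", "pen"},
    "carol": {"lamp", "mug"},
}

def recommend(user, history=purchases, top_n=3):
    """Score unseen items by how often they co-occur with the user's items."""
    owned = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(owned & items)      # shared purchases = similarity
        if overlap == 0:
            continue
        for item in items - owned:        # only score items the user lacks
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("bob"))  # ['lamp'] -- via the 'book' and 'pen' overlap with alice
```

Real systems layer matrix factorization, implicit feedback and heavy engineering on top, but the core signal – "users like you also bought" – is the same.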

Even social media platforms rely on big data analysis – to determine which ads to show to which users based on their interests and likes, and to power the 'people you may know' suggestions. Statistical methods have been used to derive trends and make sense of what a billion people are saying every two days. That's how powerful big data analysis has become.

The potential benefits of big data to society extend further than anything we've seen until now, as does the potential for doing good with the volumes of data in the 'hands' of businesses today.

This, of course, does not depend on having the data alone; the data's owners and analysts must also apply it wisely: asking the right questions, designing tests and then drawing conclusions from the data collected. This can help uncover causal and correlational relationships, which would be all the more beneficial for socially conscious projects in the long run.

What Makes Big Data Interesting

Big data is not made interesting by the fact that it is, in fact, big. We’ve had exponentially growing volumes of data within reach for decades. There are two reasons big data is interesting, and they are discussed below:

From ETL to post-query analysis

Say you are offered some data source. The conventional data warehouse folks will turn to their trusty ETL (extract, transform, and load) tools to create relational tables out of the data. This works if the data source is structured in nature, like a banking or billing system. Apply those same techniques to big data – diverse formats from diverse sources – and they crumble.

Conventional data analysis involves running predefined queries against a fixed relational data set through a pre-programmed schema. Should someone want to track a new metric, the entire schema has to be changed, and that's not an easy task.

This is where big data becomes interesting. You design analytical and statistical schemes for the analysis of large volumes of data without having to change the schema each time a new metric needs to be established or a new variable is introduced into the equation.

From the conventional ETL paradigm, we must shift to late binding – defining schemas not at the outset, but at the point where a query is raised, so that an answer meaningful to decision-making is retrieved. With big data, it's impossible to predict all the questions that might be asked of the data, which makes up-front schema design impossible. Implementing this paradigm shift with modern business intelligence tools, however, presents an additional challenge.
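To make the contrast concrete, here is a minimal Python sketch of late binding over raw JSON events. The records, field names and filter are invented for illustration; the point is that the 'schema' – which fields to project, which condition to apply – is supplied only when the question is asked:

```python
import io
import json

# Raw events kept as-is, with no upfront relational schema. Older records
# may have fewer fields; nothing breaks. All values here are made up.
raw_events = io.StringIO(
    '{"user": "alice", "status": "ok", "latency_ms": 12}\n'
    '{"user": "bob", "status": "error", "latency_ms": 480, "cart_value": 59.0}\n'
    '{"user": "carol", "status": "error"}\n'
)

def query(lines, fields, where=lambda rec: True):
    """Apply a schema at query time: filter raw records, project fields."""
    for line in lines:
        rec = json.loads(line)
        if where(rec):
            # Missing fields become None instead of forcing a migration.
            yield {k: rec.get(k) for k in fields}

# Today's question, posed with no schema defined in advance:
for row in query(raw_events, ["user", "latency_ms"],
                 where=lambda r: r.get("status") == "error"):
    print(row)
# {'user': 'bob', 'latency_ms': 480}
# {'user': 'carol', 'latency_ms': None}

# Tomorrow's new metric is just another projection over the same raw data,
# e.g. query(raw_events, ["user", "cart_value"]) -- no schema change needed.
```

This is the essence of the schema-on-read approach popularized by big data platforms: store first, bind structure late.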

Data storage – What goes where?

Big data requires big storage space, which introduces another challenge: striking the balance between performance and cost in the platforms deployed to handle big data. The prevalent school of thought is that all data is useful – possibly later, if not now – and therefore must be stored. This is good news for storage vendors, but CFOs don't like it as much.

Sure, you can use high-density disk drives for cheap data storage, but this limits I/O performance. You can keep your data in memory, which is great performance-wise but rather cost-prohibitive, even with prices going down by about 60% every three years.

Luckily, the 80-20 rule helps here. Since roughly 80% of I/O queries are directed towards 20% of the data, that hot 20% can be stored in memory and the remaining 80% in whatever storage offers the cheapest price per unit of capacity. This makes everybody happy.

Configuring the hardware to do what needs doing is easy; it's the software that poses the real challenge, since data that is hot today could be totally frozen tomorrow. And the cost of human data analysts to determine what goes where is prohibitive, which brings the second challenge: programming machines to figure out which data goes where.
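As a rough illustration – not a production policy – here is a minimal Python sketch of such a placement heuristic. It ranks objects by access frequency and pins the hottest 20% in memory; the object names, counts and threshold are assumptions:

```python
def assign_tiers(access_counts, hot_fraction=0.2):
    """Map each object to 'memory' or 'disk' based on access frequency."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    hot_cutoff = max(1, int(len(ranked) * hot_fraction))  # top 20% by default
    return {obj: ("memory" if rank < hot_cutoff else "disk")
            for rank, obj in enumerate(ranked)}

# Hypothetical access counts gathered over, say, the last 24 hours.
counts = {"orders": 950, "sessions": 720, "invoices": 88,
          "logs_2013": 4, "archive": 1}
print(assign_tiers(counts))
# {'orders': 'memory', 'sessions': 'disk', 'invoices': 'disk',
#  'logs_2013': 'disk', 'archive': 'disk'}
```

In practice such a policy must be re-run continuously and weigh recency as well as frequency, precisely because today's hot data may be frozen tomorrow.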

3 Big Data Predictions 

Big data has opened many doors for businesses to retrieve and analyze large volumes of data for their benefit. However, internally collected data offers only a limited picture, and a shift will be needed in the days to come if business processes are to be fully transformed. The following are some predictions about how this might occur:

  1. 80% of last decade’s enterprise processes and products will be digitized and/or eliminated by 2020

With the growth of the IoT, the ability to generate new types of real-time information will increase, as will the active participation of connected devices in adding value to industry data streams. Digital means will be used to engage employees, citizens and customers. Operational processes will be automated and digitized, along with any traditional, analog or manual processes.

Decision-making will be based on algorithms (automated judgment), so that 'things' act as agents for themselves, for businesses and for the people involved. Improved intelligence, communications and connectivity will make things ideal agents for services currently delivered by people.

  2. Over 30% of the data accessible to businesses will be supplied by data brokerage services to enable better contextual decision-making by 2017

Digital businesses need real-time awareness of goings-on both within and outside the organization. The enterprise data now stored within organizations' storage vaults is insufficient to provide this kind of contextual awareness, requiring mechanisms that enable businesses to obtain additional data from external sources to augment internal stores for better decision-making.

However, exogenous data from different sources is rather fragmented, unstructured and generally 'noisy'. There will therefore be a need for additional services that deliver contextual data to organizations for human and automated decision-making. These information service providers will become pivotal to BI operations.

  3. Over 20% of consumer-centric analytics deployments will offer product tracking to strengthen the IoT

Driven by the Nexus of Forces, consumers will demand more information from their vendors. New styles of customer-centric analytics will emerge – product tracking, for example – that not only provide geospatial locations for products but also report performance data. This will offer businesses an opportunity to improve partner and consumer relations as well as to create transparency, and it will become a central part of organizations' business models.

All of this will, of course, only be possible if businesses strengthen their capabilities for collecting, managing and leveraging big data, social media, IoT data, local and central government data, and data from partners, customers and suppliers, among other exogenous sources.

Areas Which Will Be Impacted by Big Data in 2015

  1. Democratization of big data analytics

The growth of cloud-based data analysis and services over the last year has improved their cost-effectiveness, and this trend is unlikely to slow this year. As such, organizations that previously could not afford advanced big data analytics will begin to embrace management of their structured and unstructured data sources. The cloud will offer organizations a wider range of options to obtain such services affordably.

  2. Growth of unstructured data

Unstructured data sources – including social media, IoT data, video, audio, images and machine sensor data, among others – will grow exponentially this year. Organizations will therefore look for solutions that merge this data with their structured data sources to increase the contextual meaning of the latter.

  3. Predictive analytics makes it into the mainstream

Predictive analytics will continue to evolve from a cool extra into must-have functionality. Organizations will have to grow their capacity to act predictively and proactively on big data at the speed it arrives, for better business process management. Otherwise, there would be no point in having such data in the first place.

  4. Changes in IT operations

Big data will revolutionize how companies handle their internal IT departments even before moving on to sales and marketing. From identification of security threats to intelligent IT operations, the conventional service desk will be transformed into a powerful tool for service delivery.

  5. All-inclusivity of big data

Every business will demand data science and analysis, including demand for revolutionary minds and a paradigm shift from traditional BI and analysis to innovative techniques for data handling.

How Remote Database Companies will Benefit from Big Data

The role of the DBA is no doubt changing in light of the big data explosion. Exactly what that change looks like will depend on the specific industry and company. In some circles, database administration is deemed less significant given the advancements in automated data handling, while others consider the role even more important now than before.

Given that most of an organization's data can now be available in real time, the DBA or remote DBA has a more difficult job. Compare this to the past, when organizations could only store so much data and the rest was automatically redirected to secondary storage, so the DBA handled a finite dataset at any given time.

Data volumes have increased without a corresponding increase in the number of data managers or DBAs, leaving them increasingly overwhelmed. This necessitates engaging additional services to manage the new volume, which is where remote DBAs come in. Now that cloud storage is here, remote access and management is much easier, with some vendors going further to offer DBA services along with their cloud storage facilities.

However, the traditional role of the DBA has not become obsolete just yet. If anything, it has grown broader, encompassing many other areas that can collectively be classified as data management. DBAs who acquire these skills will be in high demand in this revolution. Nonetheless, the original skills of database administration, whether schema design and modification or instance performance tuning, will remain relevant over the long term.

Conclusion

It is clear that the world of big data carries massive potential that will alter the course of life and business as we know it. If even a fraction of that potential is realized, millions of lives across the world will be changed for the better, as will businesses and enterprises in every field of operation. It is introducing a new and exciting realm to everything, and that's just the tip of the big data iceberg.

 
