Data Centric Approach to Big Data Storage

In this special guest feature, Shahbaz Ali of Tarmin highlights how data centric approaches to big data storage enable enterprises to make better sense of the world. Shahbaz Ali is the President, CEO and Co-founder of Tarmin. Shahbaz is a seasoned executive and visionary entrepreneur with more than 20 years of experience creating dynamic solutions for enterprises. He is a data management visionary who has successfully co-developed Tarmin’s Data Defined Storage solutions and its award-winning GridBank Data Management Platform. Shahbaz holds a BSc (Hons) in Software Engineering from London Southbank University and has completed a PhD course of study in Software Requirements Engineering from the Open University.

Big data offers answers to questions that were previously unanswerable and insights into behaviors that were previously unknown. Big data is pervading organizations, revolutionizing industries and creating a dynamic environment in which executives must re-evaluate how to store and use data to strategically add value to their bottom line. Big data and analytics capabilities are disruptive innovations that are reconfiguring how we store data and, more importantly, why we store it. Organizations are no longer storing data merely to look back in history; they are analyzing it to predict the future and make real-time business decisions that will grant them a sustainable competitive advantage.

Big data is valuable to an organization in a variety of ways. The primary value of data lies in the first few minutes or hours after the data is created and flows through the system. But organizations are now starting to realize that data also has a secondary value: the value of data collected and analyzed over time. For example, an oil and gas company spends exorbitant amounts of money launching a seismic survey, which produces high resolution images that range from hundreds of gigabytes to terabytes in size and drive up storage costs. At that particular time it may not be economically viable to exploit the implied natural resources, but that data and those reserves may be extremely valuable down the road. As time passes, the organization with the best array of consolidated data will be the first mover, exploiting the opportunity in a zero-sum game where the winner takes all. An individual data point is not enough on its own, but when data is leveraged in a strategic and holistic fashion it becomes invaluable.

The availability of big data, coupled with the desire for data analytics capabilities, has put a strain on traditional storage infrastructures and methodologies. New forms of data creation have exhausted customary storage architectures, creating unprecedented challenges for organizations. The rapid growth of big data, and the costs and difficulties of generating, processing and analyzing it, have often led to failed initiatives and unsuccessful deployments.

A number of technological developments, such as BYOD, mobility initiatives and social streaming, have brought the storage industry to a tipping point. Conventional storage infrastructures simply cannot cope with the abundance, variety and velocity of big data. The explosion of this data growth reframes key questions about the components of knowledge, how organizations should engage with their data, and the nature of data-driven decision making. One byproduct of this shift has been the evolution of the data centric approach: the practice of building data management strategies based on the value of the data, which yields an enterprise storage architecture whose benefits go beyond savings on capital hardware costs.

With respect to knowledge production, data centric approaches to infrastructure, such as Data Defined Storage, are transforming the way organizations manage, protect and gain value from big data by uniting application, information and storage tiers into a single integrated architecture. Data centric approaches no longer focus on the media type, size and location of data, but instead view data based on its inherent value to the organization.

A data centric approach enables the unification of data stores, creating virtualized storage pools. Data resides on storage chosen according to its value to the organization and to the performance, protection and access frequency it requires during its lifecycle. Data can be automatically migrated to low cost storage pools when required. Unnecessary copies of data can be removed through deduplication and defensible disposal policies, and further optimization can be achieved through compression. A single global namespace, exposing data through multiple protocols, eliminates data silos by providing a single unified view of corporate data, further reducing infrastructure complexity.
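The placement, deduplication and compression ideas in the preceding paragraph can be illustrated with a small Python sketch. It is not Tarmin's implementation or any product's API: the `DataObject` fields, the pool names and the thresholds in `choose_pool` are all hypothetical, chosen only to show how a value-based policy might drive placement.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class DataObject:
    name: str
    content: bytes
    business_value: int     # hypothetical score: 1 (low) to 5 (high)
    days_since_access: int  # hypothetical access-frequency signal


def choose_pool(obj: DataObject) -> str:
    """Pick a storage pool from value and access recency (illustrative thresholds)."""
    if obj.business_value >= 4 and obj.days_since_access <= 30:
        return "performance"
    if obj.business_value >= 2 and obj.days_since_access <= 365:
        return "capacity"
    return "archive"


def store(objects):
    """Deduplicate by content hash, compress, and place each unique object in a pool."""
    pools = {"performance": {}, "capacity": {}, "archive": {}}
    namespace = {}   # one global view: object name -> content hash
    placed = set()   # content hashes that already have a stored copy
    for obj in objects:
        digest = hashlib.sha256(obj.content).hexdigest()
        namespace[obj.name] = digest
        if digest in placed:
            continue  # duplicate content: keep the name, skip the second copy
        pools[choose_pool(obj)][digest] = zlib.compress(obj.content)
        placed.add(digest)
    return pools, namespace
```

In this sketch a duplicate keeps its entry in the namespace but shares the single stored copy, and the first copy seen decides the pool. A real system would also need to re-evaluate placement as value and access patterns change over the data lifecycle.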

This modern data centric approach employs technology that extracts full content and indexes metadata at the time of ingestion and stores both in a distributed repository. The repository crosses data type and location boundaries to provide global access for on-demand enterprise search and discovery, maximizing the primary and secondary value of the data collected. Enterprises are able to create data retention, preservation, access and security policies across the entire data estate, simplifying data governance, reducing risk, improving productivity, and decreasing costs.
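As a rough illustration of indexing at ingestion time, the following Python sketch builds a small in-memory inverted index over content and stores metadata alongside it. It is a toy under stated assumptions, not a description of any vendor's repository: the `IngestIndex` class, the object identifiers and the metadata keys are hypothetical, and a real distributed repository would shard and persist this state.

```python
import re
from collections import defaultdict


class IngestIndex:
    """Toy index: content terms and metadata are captured when data is ingested."""

    def __init__(self):
        self.metadata = {}             # object id -> metadata dict
        self.terms = defaultdict(set)  # term -> ids of objects containing it

    def ingest(self, object_id, text, **metadata):
        """Record metadata and index every distinct term in the content."""
        self.metadata[object_id] = metadata
        for token in set(re.findall(r"[a-z0-9]+", text.lower())):
            self.terms[token].add(object_id)

    def search(self, query, **filters):
        """Return ids matching all query terms and all metadata filters."""
        tokens = re.findall(r"[a-z0-9]+", query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self.terms.get(t, set()) for t in tokens))
        return sorted(
            oid for oid in hits
            if all(self.metadata[oid].get(k) == v for k, v in filters.items())
        )
```

Because the identifiers carry no assumption about where an object lives, a single query can span objects on different systems, and the same metadata could drive retention or access policies.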

Data centric approaches to storage enable enterprises to make better sense of the world. Through the use of analytics tools such as Hadoop, organizations are able to remove data silos and provide real-time insight without having to migrate data to a separate analytics platform. Knowledge oriented businesses that employ a data centric approach to big data storage can gain substantial value and a competitive edge. Having a centralized repository of information at your fingertips allows for increased productivity and the ability to make good, timely decisions based on factual information. There is an urgent need for enterprises to widely adopt data centric approaches that address the implications of big data and data analytics, a phenomenon whose surface has barely been scratched despite the speed of change in the data landscape.

