Tarmin steps away from media-dependent storage and reliance on costly physical infrastructure to a more storage-centric approach to gain value from massive data flows. We caught up with Shahbaz Ali, CEO/Founder of Tarmin, to get a better understanding of what his company does and what data defined storage is all about.
insideBIGDATA: I understand that Tarmin is all about Data Defined Storage. Can you please tell the uninitiated among us what that is exactly? And can you tell us a little about the origins of the technology?
Shahbaz Ali: Data Defined Storage is a next-generation storage and data management platform built around the notion of data centricity, which differs from traditional storage architectures that tend to have a ‘media centric’ focus. A media centric focus generally relies on throwing more infrastructure elements (disks, etc.) at the problems associated with massive data growth. Today’s organizations need to do more than just store data; they need to find ways to gain value from it. Data Defined Storage focuses on metadata (the data about data), with an emphasis on the content, meaning and value of information instead of the media, type and location of data. This approach enables organizations to take a single scale-out, unified approach to managing unstructured data across large, distributed locations, which erases the legacy information storage silos, reduces risk, and provides greater business agility, enhanced decision making and, ultimately, a more productive workforce.
insideBIGDATA: How does Tarmin uniquely approach this?
Shahbaz Ali: Tarmin delivers the three pillars of Data Defined Storage through its GridBank Data Management Platform.
- Media independent data storage. GridBank provides high-performance, random-access enterprise object storage that optimizes data volumes and aggregates and consolidates existing storage investments by creating virtualized storage pools based on cost/performance and capacity characteristics. Data is stored in the appropriate pool, according to its value. Storage pools can use fast disk, nearline disk, tape or cloud from any manufacturer, exploiting the strengths of each through intelligent tiering and deduplication/compression policies, which reduces TCO while presenting consolidated information in a single unified view of data.
- Data Security and Identity Management. This delivers a complete information governance framework designed to mitigate data-related risk. It gives organizations policy-based retention management and disposal, granular legal hold, and automated data migration for archiving and tiering, along with end-to-end, identity-centric data protection down to the individual user and device level for greater security.
- Distributed Metadata Repository. This captures the value of data and provides global enterprise search for enhanced business agility, along with big data analytics integration to gain insights from critical data, delivering improved competitive advantage.
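To make the first pillar concrete, the idea of storing data "in the appropriate pool, according to its value" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GridBank's implementation: the pool names follow the media types mentioned above, while the `value_score` field, the thresholds, and the `choose_pool` and `place` helpers are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical pools, ordered from most to least expensive. The names and
# minimum-score thresholds are illustrative assumptions, not product settings.
POOLS = [
    ("fast_disk", 0.75),
    ("nearline_disk", 0.40),
    ("cloud", 0.15),
    ("tape", 0.0),
]


@dataclass
class DataObject:
    name: str
    value_score: float  # hypothetical business-value score between 0.0 and 1.0


def choose_pool(obj: DataObject) -> str:
    """Return the first (most expensive) pool whose threshold the object meets."""
    for pool, min_score in POOLS:
        if obj.value_score >= min_score:
            return pool
    # Scores below every threshold fall back to the cheapest pool.
    return POOLS[-1][0]


def place(objects):
    """Group object names by the pool chosen for each of them."""
    placement = {}
    for obj in objects:
        placement.setdefault(choose_pool(obj), []).append(obj.name)
    return placement


if __name__ == "__main__":
    sample = [
        DataObject("trading-feed.csv", 0.9),
        DataObject("quarterly-report.pdf", 0.5),
        DataObject("old-seismic-survey.segy", 0.05),
    ]
    print(place(sample))
```

In a real system the score would come from policy and metadata rather than a hand-set number; the sketch only shows that placement is driven by the data's value, not by where it happens to be written.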
insideBIGDATA: For what organizations might this be useful?
Shahbaz Ali: New data types and sources driven by volume, variety and velocity occur in almost every industry and organization. Verticals that are data-intensive and that continuously strive to gain better insights from data, using big data analytics tools such as Hadoop, gain tremendous competitive advantage by using a Data Defined Storage approach. Tarmin GridBank supports a wide variety of data-intensive industries such as Financial Services, Healthcare, Education, Oil and Gas, and Life Sciences, all of which face challenges with digitized unstructured data, making them ideal candidates for a Data Defined Storage solution.
insideBIGDATA: Specifically in the enterprise realm, what might these companies glean from your suite of products?
Shahbaz Ali: Financial Service Organizations (FSOs) create huge volumes of unstructured data and face commercial risks, legal risks and compliance mandates associated with long-term data storage, security and access. FSOs are turning to Big Data, using insights taken from daily transactions, market feeds, customer service records, location data, and click streams to carve out new business models and services and to transform their go-to-market strategy. By implementing a Data Defined Storage solution using Tarmin GridBank, the finance industry is able to reduce infrastructure cost, satisfy risk management and industry-wide regulatory compliance requirements for data retention and e-Discovery, and make data accessible from all devices, including mobile devices. In addition, FSOs will be able to mine the net worth of their data and manage it through data-in-place dashboarding and analytics. This not only creates potential cost savings but also adds to the company’s competitive advantage.
Data is the lifeblood of the Oil and Gas industry; GridBank enables the most data and compute intensive applications consisting of seismic imaging data and other geophysical information to be optimized for Oil and Gas exploration. Data must be kept for decades or until technology has reached a point to exploit the natural resource. GridBank indexes and catalogues all data, and stores it securely on tape for the long-term, ready for future search and analytics. Delivering on all Oil and Gas data management requirements, reducing costs and increasing storage utilization, GridBank provides a single view of data across the enterprise to accelerate the monetization of Oil and Gas digital assets.
Universities, research institutions and other higher education facilities have a diverse set of applications to support, including HPC-based research and engineering applications, and typically battle budget constraints. Ideally, the research they produce and publish should be easily searchable and accessible by authorized personnel through a Cloud-type interface. Tarmin GridBank consolidates the storage required to support all this data, simplifying the creation of storage pools of varying cost and performance by using a scale-out infrastructure that supports the HPC multi-petabyte world.
insideBIGDATA: I understand that one of your goals is to “monetize data”. That probably gets a lot of attention for obvious reasons. How do you go about this?
Shahbaz Ali: Information is knowledge, and knowledge is power; without these, organizations are at a significant competitive disadvantage. Complete understanding of an organization’s landscape, competitors, customer requirements, and the market landscape directly translates to actionable insights. Tarmin makes data monetization possible through our unique distributed metadata repository. GridBank can support the ingestion of standard files and over 550 industry specific file types, as well as email, SharePoint, and social streams. GridBank conducts full indexing across multiple file types within an organization. In addition to indexing basic file metadata, GridBank also indexes content and custom metadata. All ‘data about data’ is stored in a distributed metadata repository giving global access for enterprise search and discovery, and large scale analytics.
GridBank’s analytics integration framework provides content-based indexing and filtering across all unstructured data sources. Unlike with ETL tools, there is no need to copy data out into a separate analytics pool. The metadata repository efficiently exposes content throughout the grid, allowing GridBank to point analytics tools at data-in-place while maintaining high performance.
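The data-in-place idea can be illustrated with a small Python sketch: metadata is indexed centrally, and a search returns the locations of matching objects rather than copies of the data. This is a toy model under stated assumptions, not Tarmin's repository design; the `MetadataRepository` class, its `ingest` and `search` methods, and the example storage URIs are all hypothetical.

```python
from collections import defaultdict


class MetadataRepository:
    """Toy in-memory index of 'data about data' (a hypothetical sketch)."""

    def __init__(self):
        self._records = {}               # object id -> metadata, including location
        self._index = defaultdict(set)   # (field, value) -> set of object ids

    def ingest(self, object_id, location, **metadata):
        """Record where an object lives and index its metadata fields.

        Metadata values must be hashable (strings, numbers, tuples).
        """
        self._records[object_id] = {"location": location, **metadata}
        for field, value in metadata.items():
            self._index[(field, value)].add(object_id)

    def search(self, **criteria):
        """Return sorted locations of objects matching every criterion.

        Only references are returned; the underlying data is never copied.
        """
        if not criteria:
            return []
        matches = None
        for field, value in criteria.items():
            ids = self._index.get((field, value), set())
            matches = ids if matches is None else matches & ids
        return sorted(self._records[i]["location"] for i in matches)


if __name__ == "__main__":
    repo = MetadataRepository()
    # The URIs below are made-up examples of where data might live.
    repo.ingest("o1", "tape://vault/seismic-001", kind="seismic", region="north")
    repo.ingest("o2", "disk://pool-a/report.pdf", kind="report", region="north")
    print(repo.search(region="north"))
```

An analytics tool handed these locations can read each object where it already sits, which is the contrast with an ETL step that first moves everything into a dedicated pool.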
Making decisions based on data driven outcomes results in increased customer satisfaction, retention and loyalty, better targeted products and services, and competitive advantage.
insideBIGDATA: This seems to be pretty cutting edge stuff. What can we look forward to from Tarmin?
Shahbaz Ali: Tarmin knows that as businesses become more data driven and mobile, they will continue to look for better ways to cost-effectively consolidate, store and protect data from risk. There is also a strong trend within organizations to better leverage big data via analytics to drive business and market growth. As these needs continue to grow, we are looking forward to an industry-wide shift away from media centric storage toward a focus on the content of data above all else.
We are continuing to align GridBank with the growing demands of the information economy to deliver a future-proof, data centric solution for organizations. With GridBank 4.0, we are adding a Linux version to our existing Windows-based product to serve petabyte-scale Linux customers. In addition, we continue to expand data capture and data accessibility features to include social streams and all mobile devices. All data will be streamed and indexed into the GridBank distributed metadata repository to add significant ROI, reduce overall TCO and deliver greater insights.