
A Modern Data Storage Paradigm: Reducing the High Cost of Data Management

In this special guest feature, Matt Starr, Chief Technology Officer at Spectra Logic, discusses a new storage paradigm and highlights how organizations can architect a highly reliable and affordable storage ecosystem that saves time, reduces data-loss risk and lowers storage costs. Matt brings more than 24 years of technology experience to his role. As CTO, Matt is responsible for helping to define Spectra’s technology roadmap and execute on Spectra’s technology strategy. As the company’s executive voice of the market, Matt leads Spectra’s efforts in high-performance computing, private cloud and other vertical markets, and currently directs Spectra’s Federal and APAC sales efforts. His work experience includes management roles in service, hardware design, software development, operating systems and electronic design. Matt holds a BS in electrical engineering from the University of Colorado at Colorado Springs.

With data creation increasing exponentially, the efficient storage and management of digital assets is critical to organizations. The challenge is to support this data growth without straining the data storage budget. By some estimates, more than 80 percent of data is stored on the wrong tier of storage, costing organizations millions of dollars a year.

Traditionally, data storage has been defined by the technology leveraged to protect data using a hierarchical pyramid structure, with the top of the pyramid designated for solid-state disk to store ‘hot’ data, SATA HDDs used to store ‘warm’ data and tape used for the base of the pyramid to archive ‘cold’ data. But as data usage becomes more complex, with increasing scale, levels of collaboration, and diverse workflows, users are being driven toward a new model for data storage.

A New Storage Model

The two-tier storage model is rapidly becoming accepted as the prime methodology to reduce data storage costs and improve storage efficiency. Its objective is to minimize storage costs by optimizing data management and storage according to business value (or need), while at the same time achieving a balance of performance, functionality and capacity.

The new paradigm combines a file-based Primary Tier and an object-based Perpetual Tier. The Primary Tier (or Project Tier) holds all in-progress data and active projects. It is made up of flash, DRAM, and high-performance disk drives to meet the requirements of critical data workflows dependent on response time. The Perpetual Tier can accommodate multiple storage media types – including any combination of cloud storage, object storage, network-attached storage (NAS) and tape – to address data protection, multi-site replication (sharing), cloud and data management workflows. Data moves seamlessly between tiers as it is manipulated, analyzed, shared and protected.

Implementing a proper storage management strategy within a two-tier paradigm allows organizations to address today’s most relevant data storage problems, while creating an environment open to future growth, development and change. Modern storage management software (SMS) maximizes efficiency by ‘smartly’ migrating data to the appropriate level of storage. To achieve this, the SMS automatically scans the Primary Tier, identifies inactive assets and moves them to the Perpetual Tier. This process frees up capacity on the expensive Primary Tier, which reduces the amount of data to back up, increases performance and lowers costs. Storage budgets can also be forecasted and managed more accurately, and new storage media can be deployed without an overhaul of the existing storage infrastructure.
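The scan-and-move step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular storage management product works: it assumes both tiers are reachable as ordinary directories, it treats "inactive" as "not modified in a given number of days", and the function names `find_inactive` and `migrate_inactive` are hypothetical.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_inactive(primary_root, max_idle_days, now=None):
    """Return files under primary_root not modified within max_idle_days.

    Assumption: modification time stands in for "activity"; real products
    may use access logs or other signals instead.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    root = Path(primary_root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def migrate_inactive(primary_root, perpetual_root, max_idle_days, now=None):
    """Move inactive files to perpetual_root, preserving relative paths."""
    moved = []
    for src in find_inactive(primary_root, max_idle_days, now):
        dest = Path(perpetual_root) / src.relative_to(primary_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Calling `migrate_inactive("/mnt/primary", "/mnt/perpetual", 90)` (both paths are placeholders) would move every file untouched for more than 90 days and return the list of new locations.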

Project-Based Archives: Flexibility, Rapid Access, and Enhanced Protection

The Perpetual Tier serves as an archive tier for large data sets and projects that can be moved immediately after data collection. Take, for example, a research facility amassing large amounts of machine-generated data from sensors to detect physical phenomena for space exploration. The output may not be analyzed immediately, but the data needs to be kept for future reference. With storage management software, individual researchers can designate the storage layer for their data, moving it to lower-cost storage and bringing it back as needed. Data that needs to be safeguarded can be moved to a tape storage tier – which is not externally accessible – as well as directed to the cloud for distribution or sharing. This type of project identification allows multiple forms of data from multiple types of data generators to be collected, archived and accessed in the future.

Another example is a large university supporting multiple research projects, where different departments and groups use the same data storage infrastructure under standardized service level agreements (SLAs) based on performance. Storage management software can accelerate research by moving inactive data and whole data sets off high-speed storage. Providing access across all data that has been generated by human, application or machine, storage management software allows administrators and researchers to identify data that is actively used and data that needs to be archived.

With modern storage management software, researchers can scan and identify inactive data sets to be archived, or move files or directories associated with a project to the Perpetual Tier after a large project is completed. Archived data sets can be tagged with additional metadata to enable easy search and access even if the main researcher leaves the university.
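To make the idea of tagging archived data sets concrete, here is a minimal sketch of a searchable catalog. It is an assumption-laden toy: the `ArchivedDataset` and `ArchiveCatalog` names, the in-memory dictionary, and the example tag keys are all hypothetical and do not reflect any specific product's metadata model.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedDataset:
    name: str
    location: str  # hypothetical path or object key on the Perpetual Tier
    tags: dict = field(default_factory=dict)


class ArchiveCatalog:
    """In-memory index of archived data sets, searchable by tag."""

    def __init__(self):
        self._datasets = {}

    def add(self, dataset):
        self._datasets[dataset.name] = dataset

    def tag(self, name, **tags):
        """Attach or overwrite metadata tags on an archived data set."""
        self._datasets[name].tags.update(tags)

    def search(self, **criteria):
        """Return data sets whose tags match every given key/value pair."""
        return [
            d for d in self._datasets.values()
            if all(d.tags.get(k) == v for k, v in criteria.items())
        ]


catalog = ArchiveCatalog()
catalog.add(ArchivedDataset("sensor-run-01", "perpetual/physics/run-01"))
catalog.add(ArchivedDataset("survey-2019", "perpetual/sociology/survey-2019"))
catalog.tag("sensor-run-01", department="physics", project="telescope")
catalog.tag("survey-2019", department="sociology")

physics_sets = catalog.search(department="physics")
```

Because the search relies on the tags rather than on a person's memory of where files were stored, a colleague can still locate `sensor-run-01` by department or project after the original researcher has left.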

Data Usage, Not Technology

The new two-tier paradigm focuses on the actual usage of data, rather than the technology on which it resides. To effectively employ a two-tiered data storage strategy, it is necessary to determine which storage tier best suits a given class of data. In addition, because storage requirements evolve as data ages, data must be reclassified on a regular basis. For instance, unstructured data moved from the Primary Tier to the Perpetual Tier can be stored on disk, tape or cloud by leveraging object-based storage, which stores files as ‘objects’ and tags them with metadata that describes their contents. This eliminates complexity and allows for much greater scalability.

Data management is all about ensuring data resides on the correct storage tier at the right time. Because a manual approach to data management is virtually unsustainable, many businesses optimize their infrastructures with software to automate this process. In addition to monitoring data throughout its lifecycle and moving it to a lower-cost Perpetual Tier based on age or other criteria, modern data management software provides the policy engines that let IT define the right data movement process for its many workflows and varied operational needs.
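A policy engine of the kind described above can be thought of as an ordered list of rules, each pairing a condition with a target tier. The sketch below is illustrative only: the `FileInfo` fields, the rule names, the 90-day threshold and the target labels such as `"perpetual/tape"` are assumptions made for the example, not settings from any real product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileInfo:
    path: str
    age_days: int
    project_closed: bool = False


@dataclass(frozen=True)
class Policy:
    name: str
    matches: Callable[[FileInfo], bool]
    target: str  # hypothetical tier label


def choose_target(info, policies, default="primary"):
    """Return the target of the first matching policy, else the default."""
    for policy in policies:
        if policy.matches(info):
            return policy.target
    return default


# Rules are evaluated in order, so more specific rules come first.
POLICIES = [
    Policy("closed-project", lambda f: f.project_closed, "perpetual/tape"),
    Policy("older-than-90-days", lambda f: f.age_days > 90, "perpetual/object"),
]

active = choose_target(FileInfo("run/a.dat", age_days=3), POLICIES)
aged = choose_target(FileInfo("run/b.dat", age_days=200), POLICIES)
closed = choose_target(FileInfo("run/c.dat", age_days=5, project_closed=True), POLICIES)
```

Keeping the rules as data rather than hard-coded branches is what lets IT adjust data movement per workflow: adding or reordering a `Policy` entry changes behavior without touching the evaluation logic.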

By deploying storage management software to remove “low transaction” data from the Primary Tier, organizations free up capacity on this more expensive tier, while making sure that the growing volume of less frequently accessed data remains available and securely preserved in the Perpetual Tier for long-term use.

With the two-tier storage model, organizations can safely support their data growth into the future as their businesses scale, with smart, cost-effective storage management software that aligns their IT infrastructures with the company’s goals to satisfy customers and sustain profitability.
