Big Data Solution Using IBM Spectrum Scale


Every day, the world creates 2.5 quintillion bytes of data. In fact,
90 percent of the data in the world today has been created in the last two years alone. While there is much talk about big data, it is not mere hype. Businesses are realizing tangible results from investments in big data analytics, and IBM’s big data platform, built on IBM Spectrum Scale, is helping enterprises across all industries. IBM is unique in having developed an enterprise-class big data platform that allows you to address the full spectrum of big data business challenges.

IBM® Spectrum Scale™, formerly IBM General Parallel File System (IBM GPFS™), offers an enterprise-class alternative to the Hadoop Distributed File System (HDFS) for building big data platforms.
Part of the IBM Spectrum Storage™ family, Spectrum Scale is a POSIX-compliant, high-performance, proven technology found in thousands of mission-critical commercial installations worldwide, and it provides a broad range of enterprise-class data management features.
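POSIX compliance is what lets existing applications use Spectrum Scale without modification: files on the cluster are read and written with ordinary operating system calls, whereas HDFS requires its own client API. The sketch below illustrates this with standard Python file I/O; the Spectrum Scale mount point mentioned in the comments is a hypothetical example, and a temporary directory stands in so the snippet runs anywhere.

```python
# Because Spectrum Scale presents a POSIX file system, ordinary file I/O
# works unchanged; HDFS, by contrast, requires its own client library.
import os
import tempfile

def write_then_read(directory: str) -> str:
    """Write and read back a file using only standard POSIX-style calls."""
    path = os.path.join(directory, "sample.txt")
    with open(path, "w") as f:   # plain open(), no special client needed
        f.write("analytics input\n")
    with open(path) as f:
        return f.read()

# On a Spectrum Scale cluster the directory could be a mount point such as
# "/gpfs/fs1" (hypothetical name); a temporary directory is used here so
# the sketch runs on any machine.
with tempfile.TemporaryDirectory() as d:
    print(write_then_read(d))
```

The same property means tools outside the Hadoop ecosystem — backup software, shell scripts, legacy applications — can operate directly on the analytics data.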

IBM Spectrum Scale can be deployed independently or with IBM’s big data platform, which consists of IBM BigInsights™ for Apache Hadoop and IBM Platform™ Symphony. This document describes best practices for deploying Spectrum Scale in such environments to help ensure optimal performance and reliability.

Businesses are discovering the huge potential of big data analytics across all dimensions of the business, from defining corporate strategy to managing customer relationships, and from improving operations to gaining a competitive edge. The open source Apache Hadoop project, a software framework that enables high-performance analytics on unstructured data sets, is the centerpiece of big data solutions. Hadoop is designed to process data-intensive computational tasks in parallel, at a scale that was previously possible only in high-performance computing (HPC) environments.

The Hadoop ecosystem consists of many open source projects. One of its central components is HDFS, a distributed file system designed to run on commodity hardware. Other related projects facilitate workflow and the coordination of jobs, support data movement between Hadoop and other systems, and implement scalable machine learning and data mining algorithms. However, HDFS lacks the enterprise-class functions necessary for reliability, data management and data governance.
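At its core, the computational model Hadoop parallelizes is map/reduce: a map step emits key-value pairs from raw input, and a reduce step aggregates them per key. The single-process Python sketch below is only an illustration of that pattern (the function names are ours, not Hadoop APIs); on a real cluster the same two phases run concurrently across many nodes, with the framework shuffling pairs between them.

```python
# A minimal, single-process sketch of the map/shuffle/reduce pattern that
# Hadoop runs in parallel across a cluster. Function names are illustrative.
from collections import defaultdict

def map_phase(line: str):
    """Emit (word, 1) pairs from one input line, as a mapper would."""
    for word in line.split():
        yield word.lower(), 1

def reduce_phase(pairs):
    """Sum the counts for each key, as a reducer would."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["Big data needs big storage", "big data analytics"]
pairs = (p for line in lines for p in map_phase(line))
print(reduce_phase(pairs))
# → {'big': 3, 'data': 2, 'needs': 1, 'storage': 1, 'analytics': 1}
```

Because each mapper works on its own slice of the input and each reducer on its own subset of keys, the pattern scales out by simply adding nodes — which is why the underlying file system's behavior matters so much.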
