This article is the fourth in an editorial series that aims to provide a road map for scientific researchers who want to capitalize on the rapid growth of big data technology for collecting, transforming, analyzing, and visualizing large scientific data sets.
IBM (NYSE:IBM) announced a major commitment to Apache Spark, which it describes as the most important new open source project in a decade that is being defined by data. At the core of this commitment, IBM plans to embed Spark into its Analytics and Commerce platforms and to offer Spark as a service on Bluemix.
A central theme of the new technology requirements surrounding big data is the ability to host data effectively, and the foundation of a big data technology stack is its storage layer. A new white paper is now available that details the issues surrounding big data storage, specifically how they can be addressed by deploying IBM Spectrum Scale.