Data Lake: The Definitive Guide

This paper provides a definitive guide to the critical areas needed to bring data lake organization, governance, and security to the forefront of the conversation.

Research of Data Lakes

This study was designed to document key perceptions, challenges, and successes by focusing on data organization, integration, security, and definitional clarification to address key areas of concern and interest in ongoing data lake adoption. The intent of the survey and this corresponding report is to understand and share the current and planned adoption of technologies in the Hadoop ecosystem, intended specifically for a data lake strategy, and to learn how adopting companies are addressing critical data lake success factors, including rethinking data for the long term, establishing governance first, and tackling security needs upfront. The survey and report also identify emergent areas of concern and new areas of clarification needed for data lake maturity.

Big Data Analytics: IBM

Businesses are discovering the huge potential of big data analytics across all dimensions of the business, from defining corporate strategy to managing customer relationships, and from improving operations to gaining competitive edge. The open source Apache Hadoop project, a software framework that enables high-performance analytics on unstructured data sets, is the centerpiece of big data solutions. Hadoop is designed to process data-intensive computational tasks, in parallel and at a scale that previously were possible only in high-performance computing (HPC) environments.

Presto: Open Source for the Enterprise

Presto addresses a real need for a portable SQL-on-Hadoop tool. It is architected from the ground up for high-performance interactive query processing. Open source is a fount of continual innovation, especially with regard to big data. In addition, there are strong tools that come with specific Hadoop distributions. The fact is that organizations will deploy multiple tools. For organizations moving toward a Unified Data Architecture, the rationale for adopting Presto is even stronger.

Data Management Platform Whitepaper

Discover how Data Management Platforms are allowing marketers to merge data from advertising partners and their customer databases to power more individualized marketing.

Streaming Data Analytics Architecture

Streaming analytics is fast becoming a must-have technology for enterprises seeking to transform their analytics to take advantage of “fast data” sources and build real-time or near real-time applications.

Software Defined Infrastructure

Software defined infrastructure (SDI) enables organizations to deliver HPC services in the most efficient way possible, optimizing resource utilization to accelerate time to results and reduce costs. Software defined infrastructure is the foundation for a fully integrated environment, optimizing compute, storage, and networking infrastructure to quickly adapt to changing business requirements, and dynamically managing workloads and data, transforming a static infrastructure into a workload-, resource-, and data-aware environment.

Enterprise Data Warehouse

The EDW market continues to evolve as enterprise architecture pros recognize that improved scalability, better performance, and deeper integration with Hadoop and NoSQL platforms will address their top challenges.

Big Data Solution Using IBM Spectrum Scale

Businesses are discovering the huge potential of big data analytics across all dimensions of the business, from defining corporate strategy to managing customer relationships, and from improving operations to gaining competitive edge. The open source Apache Hadoop project, a software framework that enables high-performance analytics on unstructured data sets, is the centerpiece of big data solutions. Hadoop is designed to process data-intensive computational tasks, in parallel and at a scale that previously were possible only in high-performance computing (HPC) environments.

OpenStack Storage with IBM

This paper reviews the increasingly popular OpenStack cloud platform and the abilities that IBM storage solutions provide to enable and enhance OpenStack deployments. But before addressing those specifics, it is useful to remind ourselves of the “whys and wherefores” of cloud computing.