The Different Data Science Roles

Our friends over at Springboard just released a compelling new infographic that highlights the different roles within data science along with the different skill sets required for them.

Informatica Launches Intelligent Data Lake to Help Customers Address Critical Technology Gaps for Turning Big Data into Big Value

Informatica, a leading independent software provider focused on delivering transformative innovation for the future of all things data, announced a new end-to-end solution to turn big data into trusted data assets for faster and more sustainable business value.

New SAS® Analytics for IoT Makes it Easier to Tap Streaming Data Torrent

Long before the Internet of Things (IoT) became trendy, analytics leader SAS was probing data from sensors and other devices. SAS® Analytics for IoT is a new package of proven software products that applies SAS’ core expertise of analyzing massive amounts of data to IoT connected sensors and devices.

Open, Cloud-ready SAS® Viya™ is Next-generation High-performance Analytics and Visualization Architecture

New SAS® Viya™, the next-generation high-performance analytics and visualization architecture from a leader in analytics, is designed to meet the business need for analytics that are accessible to anyone and scalable to problems of any size.

Building a Successful Predictive Analytics Program

In this special guest feature, Jane Hendricks, WW Portfolio Marketing Lead at IBM Predictive Analytics, describes a methodology for realizing business value from predictive analytics: start by understanding the business, then the data that’s available and obtainable, and then develop and apply models while considering how they can be put into practice. The article includes three short case studies that illustrate the successful application of these principles.

Data Science 101: Clustering Approaches & Techniques

The presentation below by Derek Kane provides an overview of clustering techniques, including K-Means, Hierarchical Clustering, and Gaussian Mixture Models.

GridGain Announces Support Offering for Apache® Ignite™

GridGain Systems, provider of enterprise-grade In-Memory Data Fabric solutions based on Apache® Ignite™, announced the availability of its Standard Professional Support subscription, which includes a license for the new GridGain In-Memory Data Fabric – Professional Edition 1.5, a fully supported version of Apache Ignite.

Want to Get More Out of Hadoop? Here Are 5 Ways

In this special guest feature, Ashley Stirrup, CMO at Talend, provides a useful list of five ways to get more out of Hadoop as organizations increasingly look to speed time to market, anticipate and respond to customers’ needs, and introduce new products and services.

Book Review: Why – A Guide to Finding and Using Causes

A new book, “Why: A Guide to Finding and Using Causes,” by Stevens Institute of Technology assistant professor of computer science Samantha Kleinberg is a necessary addition to any data scientist’s bookshelf as it helps bring focus to the dreaded “correlation does not imply causation” conundrum that affects our understanding of data-centric problems.

Unleash the Power of Data with Dataguise DgSecure for Amazon Web Services

Dataguise, a technology leader in secure business execution, announced Dataguise DgSecure for the detection, protection and monitoring of sensitive data across Amazon Simple Storage Service (Amazon S3) via the Amazon Elastic MapReduce (Amazon EMR) platform.