Ask a Data Scientist: Data Leakage

Welcome back to our series of articles sponsored by Intel – “Ask a Data Scientist.” Once a week you’ll see reader-submitted questions of varying levels of technical detail answered by a practicing data scientist – sometimes by me and other times by an Intel data scientist. This week’s question is from a reader who asks for an explanation of data leakage.

Intel Rallies Thought Leaders to Rethink Data Privacy to Spur Innovation

Intel Corporation recently convened thought leaders from the technology, healthcare, education and smart cities industries to encourage action on data privacy issues. Malcolm Harkins, vice president and chief security and privacy officer for Intel, said that the potential to unlock revolutionary discoveries is at stake, and called on the industry to be more transparent and accountable when collecting and using consumer data.

Ask a Data Scientist: Unsupervised Learning

Dr. Andrew W. Wicker, Data Scientist, Intel Corporation

Welcome back to the “Ask a Data Scientist” article series. This week’s question is from a reader who asks for an overview of unsupervised machine learning.

Ask a Data Scientist: The Data Science Process

Welcome back to our series of articles sponsored by Intel – “Ask a Data Scientist.” This week’s question is from a reader who wonders if there is a general process for conducting data science projects.

Lustre Scalability, Affordability, Manageability

This is the fifth article in an editorial series that explores Lustre solutions in the cloud for an exploding commercial data universe. This week’s article looks at Lustre scalability, affordability and manageability.

Ask a Data Scientist: The Bias vs. Variance Tradeoff

Welcome back to our series of articles sponsored by Intel – “Ask a Data Scientist.” This week’s question is from a reader who wants an explanation of the “bias vs. variance tradeoff in statistical learning.”

Enterprise Grade Lustre in the Clouds

With the release of Intel® Cloud Edition for Lustre software in collaboration with key cloud infrastructure providers like Amazon Web Services (AWS), commercial customers have an ideal opportunity to employ a production-ready version of Lustre—optimized for business HPDA—in a pay-as-you-go cloud environment.

Big Data for Finance – Security and Regulatory Compliance Considerations

This article is the fifth and final installment in an editorial series that aims to give enterprise thought leaders direction on leveraging big data technologies to support analytics proficiencies designed to work more independently and effectively in today’s climate of increasing the value of corporate data assets.

Ask a Data Scientist: Curse of Dimensionality

Welcome back to our series of articles sponsored by Intel – “Ask a Data Scientist.” Once a week you’ll see reader-submitted questions of varying levels of technical detail answered by a practicing data scientist – sometimes by me and other times by an Intel data scientist. This week’s question is from a reader who wants to know more about the “curse of dimensionality.”

The Analytics Frontier of the Hadoop Eco-System

Ted Wilkie

“The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications.”