Productionizing Hadoop: 7 Architectural Best Practices

Big Data will change the way your organization responds to business opportunities. But to reap its full benefits, you have to move from proof of concept into full production. Here is an informative, 52-minute presentation that provides the guidelines for successfully integrating Hadoop into your standard data center processes.

Data Science 101: Building Your Data Science Toolbox

Jeremy Howard made a presentation to the Melbourne R meetup group, where he gave a brief overview of his “data scientist’s toolbox” (using a few Kaggle competitions as practical examples), and also provided an introduction to ensembles of decision trees (including the well-known Random Forest™ algorithm).

Data Science 101: 250 Years of Bayes Theory

It’s been more than 250 years since the appearance of Bayes’ theorem (named after English statistician, philosopher and Presbyterian minister Thomas Bayes, 1701-1761), one of the two fundamental inferential principles of mathematical statistics.

Interview: Data Analytics and the Ubiquitous Internet of Things

We sat down with Cristian Borcea, PhD, of the New Jersey Institute of Technology to discuss IoT and Big Data applications. “New machine learning techniques could help us extract knowledge from these data – this happens especially for knowledge that we don’t expect and we don’t even know exists – we cannot search for something that we don’t know exists.”

Learning Data Science in Total Immersion

San Francisco-based Zipfian Academy approaches data science education the way some approach learning a new language: total immersion. The company offers a 12-week intensive advanced data science training program in a modern lab environment.

Becoming a Data Scientist – What Does it Take?

I’ve been monitoring a curious and lively discussion over on LinkedIn – “Is it necessary to have a Masters Degree to become a data scientist?” The comments I’ve seen have exhibited a number of points of view on the matter that I think are reflective of the questions on many people’s minds – both those wanting to become a data scientist and those wanting to hire a data scientist.

DataCamp – A Unique Learning Resource

I recently got an e-mail with the salutation “Hi Data Scientist.” It was a pretty smart e-mail marketing campaign because, lo and behold, I am a data scientist and I actually was interested in the e-mail! It was from a company called DataCamp, which I later learned used to be DataMind. I knew them from their R-Fiddle tool for learning R online.

How Companies are Using Spark

The video below comes to us from the Strata Conference 2014: How Companies are Using Spark, and Where the Edge in Big Data Will Be. While the first big data systems made a new class of applications possible, organizations must now compete on the speed and sophistication with which they can draw value from data. […]

Strata Conference 2014 Slides and Videos

Last week saw evidence of the big data industry steamroller effect as the Strata Conference 2014 in Santa Clara came and went. With thousands of attendees, an abundance of informative presentations, and a very healthy exhibitor ecosystem, the show defined the current state of the art for all that is big data. If you missed the big event, O’Reilly Media has graciously made available the slides and videos for some of the presentations.

Interview: Inktank Joins Forces with Open Source Mainstays Red Hat and OpenStack

Inktank Ceph, the open source software-defined storage system, has expanded its offerings and its customer base by supporting new Red Hat and OpenStack products. To get the specifics, we caught up with Ross Turk, Vice President of Community at Inktank.