Feeling left out because your boss wouldn’t let you attend the Hadoop Summit 2014 happening this week? Not to worry! Here’s a free alternative, especially attractive if you happen to live or work in Los Angeles. Big Data Camp LA 2014 is a free, all-day conference on Saturday, June 14, hosted at the DIRECTV campus near LAX.
With the Hadoop Summit conference coming next week (June 3-5), it might be useful for newbies to get up to speed with this exciting distributed computing technology. Below is a video presentation that introduces Hadoop, the technology that’s taking the enterprise by storm.
Here is a great learning resource for anyone wishing to dive into the field of machine learning – a complete class “Machine Learning” from Spring 2011 at Carnegie Mellon University. The course is taught by Tom Mitchell, Chair of the Machine Learning Department.
Here is a well-crafted SlideShare presentation, “Hadoop, Pig and Twitter,” by Kevin Weil, Analytics Lead at Twitter.
For newbie data scientists and enterprise decision makers who need a quick way to get up to speed with MapReduce, the technology underlying Hadoop, here is a slide presentation, “Introduction to MapReduce: an Abstraction for Large-Scale Computation,” by Ilan Horn of Google.
The video presentation below comes from our friends at the San Francisco Python Meetup group. The talk discusses how AdRoll uses Python to squeeze every last bit of performance out of a single high-end server for the purpose of interactive analysis of terabyte-scale data sets.
Ever wonder what will happen when exabyte data stores are the norm, and even the parallelism of Hadoop can no longer provide the necessary processing power to address the data deluge? Quantum computing may hold the answer.
Bits are bits. Whether you are searching for whales in audio clips or trying to predict hospitalization rates based on insurance claims, the process is the same: clean the data, generate features, build a model, and iterate.
In this edition of insideBIGDATA’s Data Science 101 series, I’m going to offer up a short instructional video describing the use of the popular unsupervised learning algorithm, k-means clustering.
“Data Analytics Handbook” is a new resource meant to inform young professionals about the field of data science. It was written by a group of students at UC Berkeley: Brian Liou, Tristan Tao, and Elizabeth Lin. Edition One of the book includes in-depth interviews with data scientists and data analysts.