
Data Science 101: Expressing Yourself in R

Brought to you by our friends at the Stanford Center for Professional Development, here is a compelling data science education resource: “Expressing Yourself in R” by Hadley Wickham, Rice University.

Data Science 101: Cassandra Tutorial for Beginners

Provided by our friends over at Edureka, Module 1 of their Apache Cassandra course below discusses the fundamental concepts of using a highly-scalable, column-oriented database to implement appropriate use cases.

Data Science 101: Support Vector Machines

Support Vector Machines (SVMs) are an important and widely used class of machine learning algorithms. To fully understand SVMs, you need a fundamental understanding of how the statistical learning method functions. Here is a useful lecture on SVMs from MIT OpenCourseWare.

Deep Learning, Self-Taught Learning and Unsupervised Feature Learning

The video presentation below is a highly compelling talk by Stanford University professor and Coursera co-founder, Dr. Andrew Ng. Andrew addresses a graduate summer school audience at UCLA’s IPAM (Institute for Pure & Applied Mathematics) on the topic – Deep Learning, Feature Learning.

Data Science 101: Mining Big Data with Apache Spark

Mining Big Data can be an incredibly frustrating experience due to its inherent complexity and a lack of tools.

Data Science 101: Data Agnosticism – Feature Engineering Without Domain Expertise

From the SciPy2013 conference, here is a compelling talk “Data Agnosticism: Feature Engineering Without Domain Expertise” by Nicholas Kridler of Accretive Health in Chicago.

Data Science 101: Apache YARN Usage Tips and Guidelines

Hadoop 2.0 YARN architecture

Hadoop YARN (Yet Another Resource Negotiator) is a resource-management platform responsible for managing compute resources in clusters and using them for scheduling of user applications. YARN was added as part of Hadoop 2.0. Over the past several months of going to conferences like Hadoop Summit, attending big data Meetup groups like LA Big Data Users […]

Data Science 101: Building Brains to Understand the World’s Data

For this segment of insideBIGDATA Data Science 101, we have a very compelling Google Tech Talk, “Building Brains to Understand the World’s Data,” presented by Jeff Hawkins, co-founder of Numenta and founder of Palm and Handspring.

Data Science 101: Parallel Iterative Deep Learning on Hadoop’s Next-Gen YARN


Presented at the recent O’Reilly OSCON (Open Source Convention) 2014 by Josh Patterson (Patterson Consulting) and Adam Gibson, here is “Introduction to Parallel Iterative Deep Learning on Hadoop’s Next-Generation YARN Framework.”

Data Science 101: An Interview with Hadley Wickham


RStudio’s Chief Scientist Hadley Wickham was interviewed by DataScience.LA’s Eduardo Arino de la Rubia during the useR!2014 conference at UCLA this past July.