Amazon Elastic MapReduce (Amazon EMR) makes it easy to provision and manage Hadoop in the AWS Cloud. Hadoop is available in multiple distributions, and Amazon EMR gives you the option of using either the Amazon Distribution or the MapR Distribution for Hadoop.
Time series forecasting is an integral tool in data science. Here is a useful instructional video on the subject from one of the authors of “Forecasting: Principles and Practice,” a free eBook available on OTexts. The presentation, “Forecasting Time Series Using R,” is given by Professor of Statistics Rob J Hyndman.
Richard Feynman, winner of the 1965 Nobel Prize in Physics and world-renowned “curious character,” gives us an insightful lecture about computer heuristics: how computers work, how they file information, how they handle data, how they use their information in allocated processing in a finite amount of time to solve problems, and how they actually compute values of interest to human beings.
Stephen Wolfram, founder of Wolfram Research and creator of Mathematica, just announced the new Wolfram Programming Language. This new knowledge-based language could be a game changer in data science.
Microsoft Research, the research arm of the software giant, is a hotbed of data science and machine learning research. Microsoft has the resources to hire the best and brightest researchers from around the globe. A recent publication is available for download (PDF): “Deep Learning: Methods and Applications” by Li Deng and Dong Yu, two prominent researchers in the field.
Big Data will change the way your organization responds to business opportunities. But to reap its full benefits, you have to move from proof of concept into full production. Here is an informative, 52-minute presentation that provides the guidelines for successfully integrating Hadoop into your standard data center processes.
Jeremy Howard gave a presentation to the Melbourne R meetup group in which he offered a brief overview of his “data scientist’s toolbox” (using a few Kaggle competitions as practical examples) and an introduction to ensembles of decision trees (including the well-known Random Forest™ algorithm).