Data Science 101: Examining the Requests Made by the Top 100 Sites

For our latest installment of the insideBIGDATA Data Science 101 series, I thought I’d do something a bit different. Here is a sample analysis by data scientist and blogger Dan Goldin, who published some nice results using R to assess the web requests originating from the top 100 Internet sites.

The Wolfram Programming Language for Data Science

Stephen Wolfram, founder of Wolfram Research and creator of Mathematica, just announced the new Wolfram Programming Language. This new knowledge-based language could be a game changer in data science.

Certona Predicts Consumer Behavior with Patented Technology


Certona, a leading provider of real-time omnichannel personalization for many of the largest brands and retailers, today announced that the United States (US) Patent Office has issued the company a patent for representing and predicting human behavior.

Data Science 101: Deep Learning Methods and Applications


Microsoft Research, the research arm of the software giant, is a hotbed of data science and machine learning research. Microsoft has the resources to hire the best and brightest researchers from around the globe. A recent publication is available for download (PDF): “Deep Learning: Methods and Applications” by Li Deng and Dong Yu, two prominent researchers in the field.

Productionizing Hadoop: 7 Architectural Best Practices

Big Data will change the way your organization responds to business opportunities. But to reap its full benefits, you have to move from proof of concept into full production. Here is an informative, 52-minute presentation that provides the guidelines for successfully integrating Hadoop into your standard data center processes.

Data Science 101: Building Your Data Science Toolbox

Jeremy Howard made a presentation to the Melbourne R meetup group, where he gave a brief overview of his “data scientist’s toolbox” (using a few Kaggle competitions as practical examples), and also provided an introduction to ensembles of decision trees (including the well-known Random Forest™ algorithm).

Data Science 101: 250 Years of Bayes Theory


It’s been more than 250 years since the appearance of Bayes’ theorem (named after English statistician, philosopher and Presbyterian minister Thomas Bayes, 1701-1761), one of the two fundamental inferential principles of mathematical statistics.

Interview: Data Analytics and the Ubiquitous Internet of Things


We sat down with Cristian Borcea, PhD, of the New Jersey Institute of Technology, to discuss IoT and Big Data applications. “New machine learning techniques could help us extract knowledge from these data – this happens especially for knowledge that we don’t expect and we don’t even know exists – we cannot search for something that we don’t know exists.”

Learning Data Science in Total Immersion


San Francisco-based Zipfian Academy approaches data science education the way some approach learning a new language – total immersion. The company offers a 12-week intensive advanced data science training program in a modern lab environment.

Becoming a Data Scientist – What Does it Take?

I’ve been monitoring a curious and lively discussion over on LinkedIn – “Is it necessary to have a Masters Degree to become a data scientist?” The comments I’ve seen reflect a range of views on the matter, and I think they mirror the questions on many people’s minds – both those wanting to become data scientists and those wanting to hire one.