A hot topic in data science these days is discerning “sentiment” by analyzing text from social media sources. In the talk below, Ryan Rosario presents some of his work at Facebook: “Sentiment Classification Using scikit-learn.” Ryan is a Quantitative Engineer at Facebook. I got to know Ryan when he lived in Los Angeles, working at Riot Games and presenting at the local R Meetup group. We miss Ryan, and this presentation is an opportunity to get up to speed with his current work in the field. He gave the talk at the recent PyData SV 2014.
Facebook users produce millions of pieces of text content every day. Text content such as status updates and comments tells us a lot about how people feel on a daily basis and how people feel about the web of things. In this talk, Ryan discusses a system, built on scikit-learn and the Python scientific computing ecosystem, that describes and models the positive and negative sentiment of user-generated content on Facebook. Unlike more traditional methods that count polarized words, his system trains machine learning classifiers on naturally labelled data to achieve high accuracy.
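To give a feel for the general idea of training on naturally labelled text, here is a minimal sketch in scikit-learn. It is not Ryan's system: the abstract does not say what the natural labels are or which classifier is used, so this toy version assumes that emoticons typed by the author serve as labels and swaps in a simple bag-of-words Naive Bayes model. The example posts, the marker lists, and the helper `natural_label` are all made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Assumption: emoticons act as "natural" labels supplied by the author.
POSITIVE_MARKERS = (":)", ":-)", ":D")
NEGATIVE_MARKERS = (":(", ":-(")


def natural_label(text):
    """Return (cleaned_text, label), or None when the post has no usable marker.

    label is 1 for positive and 0 for negative. Posts with no marker, or with
    both kinds of marker, are skipped. The markers are removed from the text
    so the classifier cannot simply memorize them.
    """
    has_pos = any(m in text for m in POSITIVE_MARKERS)
    has_neg = any(m in text for m in NEGATIVE_MARKERS)
    if has_pos == has_neg:
        return None
    for marker in POSITIVE_MARKERS + NEGATIVE_MARKERS:
        text = text.replace(marker, "")
    return text.strip(), (1 if has_pos else 0)


# Made-up posts standing in for status updates and comments.
raw_posts = [
    "Had a great day at the beach :)",
    "Love this new song :D",
    "So happy with my wonderful friends :-)",
    "Great food and great company :)",
    "Stuck in terrible traffic again :(",
    "I hate this awful weather :-(",
    "My phone broke, worst day ever :(",
    "Feeling sad and tired today :(",
    "Going to the store later",  # no marker, so it is not used for training
]

labelled = [pair for pair in map(natural_label, raw_posts) if pair is not None]
texts = [text for text, _ in labelled]
labels = [label for _, label in labelled]

# Bag-of-words counts feeding a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

new_posts = ["what a great and wonderful day", "terrible awful traffic"]
predictions = model.predict(new_posts).tolist()
print(predictions)
```

The point of the sketch is the labelling step: nobody annotates the posts by hand, and the model learns which words go with each sentiment from the text itself rather than from a fixed list of positive and negative words. A real system would need far more data, a held-out evaluation set, and care about sarcasm and mixed signals.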