Interview: Szilard Pafka, Data Scientist


The Los Angeles data science Meetup scene is booming in large part due to the efforts of a local data scientist, Szilard Pafka. In the interview below, Szilard discusses his background in the field, the genesis of his many Meetup groups, the LA tech industry, and his plans to make his Meetups even more successful.

insideBIGDATA: Please introduce yourself to our readers. What is your background and how are you currently using data science?

Szilard Pafka: I received my formal education in Physics and later on in Computer Science and Finance (in the 1990s), while I have also self-learned Statistics and “Data Science” (it used to be called “Data Mining” at that time). I did a PhD at the intersection of these fields, and I also worked in risk management in the financial industry for a couple of years. In 2006 I moved to California to become the Chief Scientist for a credit card transaction processing company. I have been using data science (data analysis, data visualization, modeling/machine learning, etc.) mostly in this domain (for example for detecting credit card fraud, measuring/monitoring performance etc.) for quite a while (well before “data science” became a term used to describe this/became “cool”). You can find more details on my bio on my LinkedIn profile.

insideBIGDATA: You’ve become the king of LA meetups. At last tally, you are the founder of 5 Meetup groups in the LA area: LA R users group, Data Science LA, DataVis LA, Python Data Science LA, and most recently LA Data Warehouse, BI and Analytics. Can you say a few words about your vision for all these technology groups?

Szilard Pafka: I got involved in meetups purely by coincidence. My main expertise is technical/analytical: working with data, understanding/learning things from data etc. I’m also not the typical extrovert community person, so in 2009 when I started the LA R meetup (which in retrospect was the very first Data Science meetup in Los Angeles) I did so mainly so that I could deepen my knowledge of R by interacting regularly with those few who were using R at the time. Little did I know that I had just started building a Data Science community in LA.

For a good while the data community was small, but later on other brave people started other data meetup groups (e.g. Machine Learning or Hadoop), and further down the road with the advent/hype of “big data” and “data science” the community kept growing. Not so long ago I realized that I had been spending a lot of time doing data visualization or thinking about how all these data tools and knowledge are put together in different companies, and so the DataVis LA and the Data Science LA meetups were born (the Data Science meetup is using the Machine Learning group’s meetup infrastructure). A survey of tools used by data scientists (it turns out it’s mainly R and Python) led to the creation of the Python Data Science meetup, while the overlap with traditional business analytics (data warehouse, BI etc.) led to the launch of the DW/BI meetup (the latter two with fellow co-organizer Eduardo Arino de la Rubia). I’d also like to mention my collaboration with Subash d’Souza of the LA Big Data meetups, with Joe Devon, a real pioneer of the LA Tech meetup scene, and with several other people (Rob Zinkov, Vaclav Petricek, Aaron Crow, Mikhail Lyukmanov and a few others) who’ve led or are leading other data meetups.

insideBIGDATA: Your groups typically attract over 100 attendees for each event. To what do you attribute this success?

Szilard Pafka: I think a lot of this actually has to do just with the “data science” hype. For example, starting 2 years ago, whenever I used “data science” in the title of an event at the LA R meetup, the number of attendees doubled. Time to start a data science meetup group in name, right? I also hope that quality is a factor. In fact with 5 meetup groups now, I would still like to focus primarily on quality and not quantity, so even though there are more meetup groups (and others getting involved in organizing), I don’t think the total number of events we’ll be putting together will increase dramatically. I’m still doing this as a hobby besides my day job, and would like to keep focusing on actually doing data science.

insideBIGDATA: What’s your perspective on the LA technology scene centered on data science and big data?

Szilard Pafka: There are a lot of companies (startups, enterprises etc.) in LA for which data plays a central role. In fact, due to so many tech startups, the Santa Monica/Venice area goes by the name of Silicon Beach. I expect this is only going to grow in many ways in the next few years. It also looks like tech meetups in general have started to play a larger and larger role not only in general networking, but also as a primary avenue for hiring by companies. This is true for the data science and big data meetups as well, and among the attendees you can find the whole spectrum from statistics graduate students to the very experts of the field. As a consequence, I have seen increased interest from tech companies and recruiters in getting involved with the meetups.

On a different note, a couple of us volunteers recently started a website called DataScience.LA with the aim of becoming the main virtual meeting place for the LA data science community. It already features the content (e.g. slides, code, pictures, sometimes video recordings) of the above-mentioned meetups, it provides a venue for local data scientists to publish blog posts on various topics, and we also have other interesting content. Please check it out.

With all this and much more (e.g. a number of excellent graduate students coming out of universities such as UCLA or USC) I think LA is a great place to start or further develop a career in data science. And of course we have a much more balanced way of life here than in Silicon Valley (check out the beach in Santa Monica).

insideBIGDATA: What do you have in store for your meetup groups in the future?

Szilard Pafka: There are several ideas (3 concrete!) that popped up lately either in my mind or in discussions with a few others at DataScience.LA. I don’t think I’m going to share them now, but look for some exciting things in the future. All right, here is one: I’d like to have another Panel on Data Science, since a lot has changed since the last one we had. The other 2 ideas are really novel, so stay tuned. To do so, you can follow me on Twitter, where I strive for a high-quality, very low-volume feed with only Data Science related content. Can you guess my handle? Follow me at @DataScienceLA.

 
