Scalable Deep Learning Platform On Spark In Baidu

Baidu’s deep learning technology has achieved top results on a range of challenging tasks in computer vision, image processing, NLP, and other areas. In the era of big data, a single integrated Spark platform for scalable deep learning training and prediction is of utmost importance, especially at Baidu’s scale. In the presentation below, Weide Zhang, a Senior Architect at Baidu, talks about his team’s work using Spark to drive deep learning training and prediction with Paddle, the deep learning library developed by Baidu IDL. This lets multiple Baidu production offline processing jobs perform data ingestion, pre-processing, feature extraction, and model training in one Spark cluster. He also explains how the platform handles resource heterogeneity to support multi-tenancy on Spark. Finally, he presents some use cases and performance numbers.


For the convenience of our readers, here are the slides for the presentation:

