Predibase Introduces a New Way to Do Low-Code Machine Learning

Predibase has emerged from stealth with a commercial platform that lets both data scientists and non-experts develop flexible, state-of-the-art ML on best-of-breed ML infrastructure. The platform has been in beta with Fortune 500 enterprises for the last nine months. Those customers have seen the time for model development and deployment drop from months to days, and the platform is used by data practitioners and engineers across each organization.

“We’ve seen a burst of innovative ideas to break the logjam of coding needed to build, train and operationalize ML models,” said Kevin Petrie, Vice President of Research at Eckerson Group. “Predibase proposes a distinct approach: enable data scientists to have the code work for them. They tell Predibase what they need with a few lines, and let the platform handle the details.”

Predibase’s founders saw the pain of getting ML models developed and into production, which can take up to a year even at leading tech companies like Uber, so they built internal platforms that drastically lowered time-to-value and increased access. The key was taking a “declarative approach” to ML, which Piero Molino (CEO) introduced with Ludwig, an open source framework for creating deep learning models that has more than 8,200 GitHub stars, more than 100 contributors and thousands of monthly downloads. With Ludwig, tasks that had taken months or years could be handed off to teams in thirty minutes, with just six lines of human-readable configuration defining an entire ML pipeline.

While at Uber, Molino worked with Travis Addair (CTO), who led the team behind Horovod, an open source framework to efficiently scale and distribute deep learning model training to massive amounts of data. Horovod has been run across industry at scale, including on petabyte-scale data across 27,000 GPUs on IBM Summit – the world’s largest supercomputer. Horovod has more than 12,300 GitHub stars, more than 100 contributors and 60,000 monthly downloads.

Molino and Addair founded Predibase together with Devvret Rishi (Chief Product Officer), the first Google AI PM for Kaggle, an open data science platform with more than six million users, and Professor Chris Ré, creator of Overton, a proprietary system at Apple similar to Ludwig. Their goal is to create the easiest platform for organizations to adopt state-of-the-art ML.

“At Predibase, we want to bring the same philosophy we used at Uber and Apple to all AI and Data organizations by building a scalable enterprise-level product that is very easy to use and operationalize through its declarative interface and saves companies time and money,” said Piero Molino, co-founder and CEO of Predibase. “We want to make doing ML as easy as doing analytics.”

Predibase, built on top of Ludwig and Horovod, is bringing the declarative approach to ML to market. Most tools today force users to choose between flexibility and simplicity, as organizations struggle to decide between hiring scarce and expensive experts to build highly complex solutions in-house or using low-code/no-code tools that provide limited user control and have a low ceiling.

Declarative ML systems bridge the gap by allowing users to specify ML models as “configurations”: simple files that tell the system what a user wants and let the system figure out the best way to fill that need. For example, with this approach engineers can create a state-of-the-art text classification model in just six lines of configuration specifying inputs and outputs. If they then want to iterate on and customize that model, they can simply add parameters to the configuration, which gives them granular control right down to the level of the code.
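To make the idea concrete, here is a minimal sketch in Python of what such a configuration might look like. The `input_features` / `output_features` layout follows Ludwig's configuration format; the column names `review` and `sentiment`, the `customize` helper, and the specific `trainer` values are assumptions chosen for illustration and are not taken from Predibase's platform.

```python
import copy

# Minimal declarative configuration: only the inputs and outputs are declared.
# The column names "review" and "sentiment" are hypothetical.
base_config = {
    "input_features": [{"name": "review", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}


def customize(config, overrides):
    """Return a copy of `config` with nested `overrides` merged in.

    Hypothetical helper for illustration; it is not part of Ludwig or Predibase.
    """
    merged = copy.deepcopy(config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = customize(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Iterating on the model means adding parameters, not rewriting a pipeline.
# The trainer values below are illustrative, not recommended settings.
tuned_config = customize(
    base_config,
    {"trainer": {"epochs": 5, "learning_rate": 0.0001}},
)
```

The base configuration stays untouched while `tuned_config` carries the extra training parameters. With Ludwig installed, a dictionary like this could be passed to `ludwig.api.LudwigModel` for training, though that step is left out here to keep the sketch dependency-free.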

Predibase marries this innovative approach with best-in-class ML infrastructure to create an end-to-end platform that can go from data to deployed model in an enterprise. There are three key user journeys in the platform:

  1. Connect data – it integrates with popular sources like Snowflake, Google BigQuery, S3, GCS, Delta Lake and many more, so users can connect structured and unstructured data wherever it may be.
  2. Declaratively train models – users can train state-of-the-art models like BERT, GPT-2, TabNet, ViT and more on huge amounts of data, either through the platform or programmatically in a few lines, for any ML use case.
  3. Operationalize models – models can be deployed in one click and accessed via REST endpoints, through a Python SDK, or through PQL – a proprietary extension to SQL that puts ML in the hands of anyone who can write a “SELECT” statement.

With these three key aspects, engineers and data practitioners can build and deploy cutting-edge ML in their organizations seamlessly. 

Predibase Raises $16.25 Million in Seed and Series A Funding 

Predibase has raised a total of $16.25 million, with Greylock leading its seed and Series A rounds and with participation from the Factory and a group of high-profile angel advisors including Zoubin Ghahramani (Professor of Information Engineering at Cambridge and Sr Director of Research at Google), Anthony Goldbloom (CEO of Kaggle), Ben Hamner (CTO of Kaggle), Remi El-Ouazzane (former COO of Intel AI), Varun Badhwar (former SVP of Cloud Security at Palo Alto Networks) and Yi Wang (Principal Engineer at Meta).

Greylock Partner Saam Motamedi said: “Predibase is building the first declarative ML platform that enables enterprises to develop and operationalize models, from data to deployment, without having to choose between simplicity and the power of fine-grained controls. The rapid success of both the open source foundations and the beta of its commercial platform in the Fortune 500 has been incredibly exciting. We’re thrilled to partner with this AI leadership team, who have built both platforms and communities in cutting-edge ML, as they drive the next wave of ML adoption and democratize access.”

Predibase will use the money to build its engineering and ML talent, invest more heavily in go-to-market and bring the platform to general availability (GA).
