SparkCognition’s Darwin Machine Learning Platform Designed to Accelerate Data Science at Scale

As machine learning technology becomes more widely available at enterprise scale, it can be difficult to compare platforms and determine which one is best for your business.

A new white paper from SparkCognition explores one of the solutions on the market that works to accelerate data science at scale. Its Darwin machine learning platform is designed to automate the building and deployment of models.

According to SparkCognition, the tool works to provide “a productive environment that empowers data scientists with a broad spectrum of experience to quickly prototype and develop, tune and implement machine learning applications in less time.”

With its automated model building capabilities, the system generates models for both supervised and unsupervised learning.

In its new report, SparkCognition tests the efficacy of its Darwin platform against three open-source products: AutoSklearn, H2O, and Random Forest, using default parameters for each platform.

“Darwin greatly expedites the process of building models by cleansing the data, extracting features, and optimizing models, meaning companies can put these models to use, scale easily, and increase the speed to ROI.” — SparkCognition

Results were compared on six different datasets comprising each type of supervised learning problem, including:

  • Electric Devices Classification
  • UCI EEG Eye State Classification
  • UCI Ozone Classification
  • MNIST Digit Classification
  • Boston Housing Regression

As for the test methodology, 80% of the data in the time-series datasets was used for training and 20% for testing. Because not all of the comparison tools include data cleaning methods, all categorical columns were tagged prior to model creation and, in some cases (AutoSklearn only), encoded using one-hot encoding, the company explained. Each model was run for a maximum of 20 minutes.
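To make the data preparation described above more concrete, here is a minimal Python sketch of an 80/20 split and a one-hot encoding. It is an illustration only, not SparkCognition's code: the function names `chronological_split` and `one_hot_encode` are hypothetical, and the assumption that the time-series split keeps rows in time order (with the most recent 20% held out) is ours, since the summary only gives the 80/20 proportions.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split time-ordered rows into train and test sets without shuffling.

    Assumption: the earliest rows are used for training and the latest
    rows for testing. The report summary only states the 80/20 ratio.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


def one_hot_encode(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted category names and one indicator row per value."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # mark the column matching this category
        encoded.append(row)
    return categories, encoded


if __name__ == "__main__":
    readings = list(range(10))  # stand-in for ten time-ordered samples
    train, test = chronological_split(readings)
    print(len(train), len(test))

    categories, encoded = one_hot_encode(["low", "high", "low"])
    print(categories)
    print(encoded)
```

Keeping the rows in order prevents information from later time steps from leaking into training, which is a common precaution with time-series data. One-hot encoding turns each category into its own 0/1 column so that tools without built-in categorical handling can still consume the data.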

[Image: Darwin machine learning. Photo: Shutterstock/By Peshkova]

The Darwin platform performed particularly well on time-series data and on datasets with non-linear relationships or other complex problems. SparkCognition's testing found that Darwin outperformed its competitors in four of the problem sets tested and achieved comparable results in the remaining sets.

Download the full “Darwin Efficacy Report,” courtesy of SparkCognition, to learn more about its Darwin machine learning platform and its strengths.
