RapidMiner Makes Self-Service Advanced Analytics Available for Hadoop

RapidMiner, an easy-to-use Modern Analytics platform, today announced significant updates to the most comprehensive advanced analytics offering on the market. In a world where data lakes are often used solely as a repository for information, underutilized due to the state of the market and limits of technology, RapidMiner’s aggressive advances turn the tide, helping data scientists and business users alike extract business value from Big Data.

Most analytics vendors extract data from Hadoop to build and score analytic models. Moving Big Data out of Hadoop reintroduces bottlenecks and increases complexity. Only a few analytics vendors push down analytics computation to Big Data in Hadoop. RapidMiner pushes the computation of more than 100 machine learning models directly to the data in the cluster, making it easy to deploy powerful predictive analytics into production inside Hadoop.
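The difference between extracting data and pushing computation down can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RapidMiner's implementation: the in-memory lists stand in for data blocks on separate cluster nodes, and the model weights, `score`, `extract_then_score` and `pushdown_score` are hypothetical names chosen for the example.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: each inner list stands in for a data block stored on one cluster node.
Row = Tuple[float, float]
Partition = List[Row]

# Assumption: a tiny, already-trained logistic model (hypothetical weights).
WEIGHTS = (0.8, -0.5)
BIAS = 0.1


def score(row: Row) -> float:
    """Return the logistic score of a single row."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, row))
    return 1.0 / (1.0 + math.exp(-z))


def extract_then_score(partitions: Sequence[Partition]) -> Tuple[int, int]:
    """Extraction: move every row to the client, then score it there.

    Returns (number of positive predictions, number of rows moved).
    """
    moved_rows = [row for part in partitions for row in part]
    positives = sum(1 for row in moved_rows if score(row) >= 0.5)
    return positives, len(moved_rows)


def pushdown_score(partitions: Sequence[Partition]) -> Tuple[int, int]:
    """Pushdown: score each partition where it lives, return only a count.

    Returns (number of positive predictions, number of values moved).
    """
    partial_counts = [
        sum(1 for row in part if score(row) >= 0.5) for part in partitions
    ]
    return sum(partial_counts), len(partial_counts)


partitions = [
    [(1.0, 0.0), (0.0, 2.0)],
    [(2.0, 1.0)],
    [(-1.0, 1.0), (0.5, 0.5)],
]
print(extract_then_score(partitions))
print(pushdown_score(partitions))
```

Both functions produce the same prediction count, but the pushdown variant moves one small value per partition rather than every row, which is the bottleneck the paragraph above describes.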

The new platform will be showcased in Booth #1421 at this week’s Strata + Hadoop World taking place through February 20 in San Jose, California. Attendees of the conference are encouraged to take the RapidMiner Challenge to build and deploy a machine learning model in-Hadoop in 10 minutes or less.

“With our pushdown Hadoop processing in RapidMiner Radoop, combined with our recent announcement of RapidMiner Streams, it’s easy to see that we are quickly turning dormant data lakes into money-making machines where enterprises can maximize the business value from their data,” said RapidMiner CEO and Co-founder Ingo Mierswa. “Predictive analytics is no longer a nice-to-have competitive advantage. It’s an absolute business necessity. Nobody else offers what RapidMiner does, and our latest release establishes us as the de facto modern analytics platform.”

RapidMiner is the only code-free advanced analytics platform available commercially that can execute analytical processes in-memory, in-Hadoop, in-Cloud, in-Stream and in-database.

New in-Hadoop Model Scoring Delivers Up to 20x in Performance Compared to Legacy Hadoop Model Scoring

“Many companies are still deterred by the complexity of building analytics applications on a complicated big data technology stack,” said Nik Rouda, senior analyst at ESG. “RapidMiner is differentiated both by offering a solution that is very deep and yet still user-friendly, attributes which will enable faster development in a wide range of environments.”

RapidMiner Radoop, which automatically creates an optimized analytic execution plan based on the unique Hadoop cluster configuration, now integrates machine learning algorithms from MLlib, Apache Spark’s machine learning library. This RapidMiner Radoop release includes pushdown processing for logistic regression and decision tree algorithms that can be trained natively in Hadoop, making use of the full distributed computation power of Spark in a Hadoop cluster.

RapidMiner Authenticates Data with Kerberos Security

Data security is top of mind for enterprises worldwide. This crucial business requirement typically delays analysis, but not with RapidMiner. RapidMiner Radoop authenticates data upon ingest, making it easy to perform large-scale data exploration, model building and model scoring.

RapidMiner Takes Self-Service Analytics to Next Level with Guided Analytics

Wisdom of Crowds

RapidMiner continues to differentiate itself from other advanced analytics providers by offering a guided approach to building predictive analytics based on the wisdom gleaned from the 250,000-member-strong RapidMiner community. The analytic best practice, or wisdom from the crowd, is mined via RapidMiner machine learning to recommend how best to build a predictive model. As users are always experimenting and learning, the latest innovations happening in the community are offered up as recommendations. This unique feature, which leverages the power of the RapidMiner community to create recommendations, makes it easier to develop more accurate predictive models, no matter how sophisticated the end user may be.

Context-Aware Recommendations

This new release includes context-aware recommendations, resulting in more relevant and focused guidance. This context awareness gives RapidMiner a better understanding of how other users are solving similar problems and, by tapping into that wisdom, lets it offer better recommendations that accelerate users’ time-to-value.

Parameter Recommendations

The new platform also includes recommendations for parameter settings. Tuning parameters is critical when developing analytic workflows and is a notoriously tedious and difficult task, especially for beginners. The RapidMiner parameter recommender uses the knowledge and experience of the community to recommend and fine-tune parameters, which improves model accuracy and results.
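As a rough illustration of the idea of recommending parameters from community experience, the sketch below suggests the value that other users chose most often for a given operator and parameter. It is a deliberately simple stand-in, not the RapidMiner recommender: the log format, the operator and parameter names, and `recommend_parameter` are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional

# Assumption: a hypothetical community usage log, one entry per observed setting.
CommunityLog = List[Dict[str, Hashable]]


def recommend_parameter(
    log: CommunityLog, operator: str, parameter: str
) -> Optional[Hashable]:
    """Return the most frequently used value for operator/parameter, or None."""
    values = Counter(
        entry["value"]
        for entry in log
        if entry["operator"] == operator and entry["parameter"] == parameter
    )
    if not values:
        return None  # no community evidence for this setting
    return values.most_common(1)[0][0]


# Hypothetical log entries, invented purely for illustration.
community_log: CommunityLog = [
    {"operator": "decision_tree", "parameter": "max_depth", "value": 10},
    {"operator": "decision_tree", "parameter": "max_depth", "value": 20},
    {"operator": "decision_tree", "parameter": "max_depth", "value": 10},
    {"operator": "logistic_regression", "parameter": "max_iterations", "value": 100},
]

print(recommend_parameter(community_log, "decision_tree", "max_depth"))
```

A production recommender would presumably also weigh context, such as the data set and the surrounding workflow, rather than raw frequency alone.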

“This release signals a major leap forward for Hadoop users and our total RapidMiner community. Spark is lightning fast compared to other native Hadoop analytics,” said RapidMiner President and COO Michele Chambers. “Our approach to creating optimized analytic execution plans for Hadoop is unique in the market and reflects the reality of the variety of Hadoop clusters used by customers. We believe that every customer should ‘have it their way’ and still easily get value out of Big Data. Now, we’ve made it so. And we made it easy.”

