StreamSets Launches StreamSets Transformer

StreamSets, Inc., provider of the DataOps platform for modern data integration, released StreamSets® Transformer, a simple-to-use, drag-and-drop UI tool to create native Apache Spark applications. Designed for a wide range of users — even those without specialized skills — StreamSets Transformer enables the creation of pipelines for performing ETL, stream processing and machine-learning operations. Now, data engineers, scientists, architects and operators gain deep visibility into the execution of Apache Spark while broadening usage across the business.

Addressing Governmental Challenges when Engaging AI, ML and Data Analytics

Gartner recently stated that all industries and levels of government agree the top three game-changing technologies today are AI/machine learning, data analytics/predictive analytics and cloud technologies. However, there are some primary sticking points when it comes to innovation in these areas. Government organizations continue to encounter challenges when trying to pursue these initiatives due to complex security and compliance requirements, poor scalability of legacy IT infrastructure, and perceived risks associated with cloud and IT modernization efforts. How can these challenges be addressed?

The Future of Open Source Big Data Platforms

Three well-funded startups (Cloudera Inc., Hortonworks Inc., and MapR Technologies Inc.) emerged a decade ago to commercialize products and services in the open-source ecosystem around Hadoop, a popular software framework for processing huge amounts of data. The hype peaked in early 2014 when Cloudera raised a massive $900 million funding round, valuing it […]

Informatica Announces Enterprise Data Catalog Integrations With Microsoft, Tableau, and Databricks

Informatica, the enterprise cloud data management leader, announced the industry’s most comprehensive enterprise-scale intelligent data catalog, enhanced with technology innovations and tight strategic-partner integrations. The Informatica® Enterprise Data Catalog (EDC) creates a “catalog of catalogs” with AI-driven data discovery across multi-cloud and hybrid environments, providing broad metadata connectivity to support organizations in driving their data-driven digital transformations.

Databricks Open Sources Delta Lake for Data Lake Reliability

Databricks, a leader in Unified Analytics and founded by the original creators of Apache Spark™, announced a new open source project called Delta Lake to deliver reliability to data lakes. Delta Lake is the first production-ready open source technology to provide data lake reliability for both batch and streaming data. This new open source project will enable organizations to transform their existing messy data lakes into clean Delta Lakes with high quality data, thereby accelerating their data and machine learning initiatives.

Data and AI Experts Share Predictions with Databricks: What the Future Holds for AI, Big Data and Analytics

Databricks, a leader in unified analytics and founded by the original creators of Apache Spark™, sees 2019 as the year that more companies solve the world’s toughest data problems that have hindered AI initiatives across industries. This perspective is shared by data thought leaders who advise on AI, big data and analytics trends that inspired them in 2018, and those on the horizon for 2019.

New Guide Offers Databricks Unified Analytics Platform Machine Learning Use Cases

The fields of machine learning and deep learning are on the brink of unprecedented breakthroughs across a variety of verticals. And according to a new report from Databricks, “data is the new fuel” for these market advancements. Download the new white paper today, “Four Real-Life Machine Learning Use Cases,” to explore Databricks Unified Analytics Platform use cases in advertising, loan servicing, media and more.

Using Unified Analytics & Big Data as Path to AI Success

How can modern enterprises unlock the potential of AI to change their business? Today’s businesses and enterprises are increasingly focused on big data that can help drive innovation and transformation through the potential of artificial intelligence. According to a survey and research report commissioned with IDG’s CIO, nearly 90 percent of enterprises are investing in data and AI technology. Download the new report, “Unified Analytics for Dummies,” that explores the steps to AI success in today’s market.

Databricks and RStudio Introduce New Version of MLflow with R Integration

Databricks, a leader in unified analytics and founded by the original creators of Apache Spark™, and RStudio today announced a new release of MLflow, an open source multi-cloud framework for the machine learning lifecycle, now with R integration. RStudio has partnered with Databricks to develop an R API for MLflow v0.7.0.

Databricks Partners with RStudio To Increase Productivity of Data Science Teams

Databricks, a leader in unified analytics and founded by the original creators of Apache Spark™, announced a partnership with RStudio, providers of a free and open-source integrated development environment for R, to increase the productivity of data science teams. The partnership will allow the two companies to seamlessly integrate Databricks’ Unified Analytics Platform with the RStudio Server, simplifying R programming on big data.