Cloudera, the provider of a leading platform for machine learning and advanced analytics built on the latest open source technologies, today unveiled Cloudera Data Science Workbench, a new self-service tool for data science on Cloudera Enterprise. The tool is currently in beta.
Cloudera to Accelerate Data Science and Machine Learning for the Enterprise with New Data Science Workbench
EnterpriseDB® (EDB™), the database platform company for digital business, announced the general availability of a new version of the EDB Postgres Data Adapter for Hadoop with compatibility for the Apache Spark cluster computing framework. The new version gives organizations the ability to combine analytic workloads based on the Hadoop Distributed File System (HDFS) with operational data in Postgres, using an Apache Spark interface.
I recently caught up with Paulo Sampaio, Data Scientist at EDITED, to talk about applying machine learning, neural networks, natural language processing, and big data analytics to the retail industry. Paulo and his team are applying neural networks, machine learning, and other models to analyze over 520 million products in real time across 42 countries to make gradual distinctions in clothing styles, sizes, and categories.
insideBIGDATA was on hand for the recent Spark Summit East 2017 conference in Boston, and one of the more compelling presentations was by Kavitha Mariappan, VP Marketing at Databricks. The talk focused on the premise that despite the tremendous growth and opportunities in big data today, women still play a small role in this arena.
EXASOL, a high-performance in-memory analytic database developer, and PATH, an international nonprofit organization and global leader in health and innovation, announced a partnership to support the Zambian government’s ambitious campaign to eliminate malaria by 2020.
Percipient, a Singapore-based startup, is launching a revolutionary solution to address the memory issues incurred by users of the open source platform Apache Spark. By delivering unified data a priori to the Spark platform, Percipient’s SparkPLUS solution is able to multiply the platform’s computing space, thereby greatly enhancing its utility for real-time and analytical applications.
Research Firm Advises Analytics Stakeholders and Security Professionals to Build Plans for Securing Hadoop-based Assets
Dataguise, a technology leader in secure business execution, announced its inclusion in a Gartner report titled “Rethink and Extend Data Security Policies to Include Hadoop.” The report provides best practices for addressing data security concerns related to Apache Hadoop deployments and highlights several leading vendors in the category that support these efforts.
In this talk from Spark Summit East 2016, Prasad Chalasani explores some of the challenges that arise in setting up scalable simulations in a specific application, and shares some solutions and lessons learned along the way, in the realms of mathematics and programming.
I recently caught up with Natalia Hernandez, Data Scientist at Foodpairing, to discuss how her company’s data scientists mine public online data for general trend insights, combining consumer intelligence with molecular analysis of ingredients to forecast the next big flavors in the food industry.
Splice Machine’s New OLAP Engine Adds Columnar Storage and In-Memory Caching to its Hybrid Relational Data Platform
Splice Machine, provider of the open-source SQL RDBMS powered by Apache Hadoop® and Apache Spark™, announced the release of version 2.5 of its industry-leading data platform for intelligent applications. The new version strengthens its ability to concurrently run enterprise-scale transactional and analytical workloads, frequently referred to as HTAP (Hybrid Transactional and Analytical Processing).