
insideBIGDATA Latest News – 6/29/2020

In this regular column, we’ll bring you all the latest industry news centered around our main topics of focus: big data, data science, machine learning, AI, and deep learning. Our industry is constantly accelerating with new products and services being announced every day. Fortunately, we’re in close touch with vendors from this vast ecosystem, so we’re in a unique position to inform you about all that’s new and exciting. Our massive industry database is growing all the time, so stay tuned for the latest news items describing technology that may make you and your organization more competitive.

data.world Gives Enterprises a Deeper Understanding of their Data with Automated Technical Lineage Capabilities

data.world, the cloud-native enterprise data catalog company, released new data lineage capabilities that deliver a fully automated and comprehensive view of data relationships and how data flows through an organization. Powered through a partnership with MANTA, the capabilities let users more easily analyze and understand how data and information connect – including business and technical lineage of tables, columns, views, queries, transforms, additions, edits, and more. This granularity provides transparency and confidence in an organization’s data flow and consumption, ultimately improving impact and root-cause analysis, troubleshooting, and forecasting efforts.

“People need to understand and trust the data powering their reporting and analytics. But assigning context to it, validating the origin and stewardship of data and analysis, and collaborating with other data users is challenging,” said Jon Loyens, co-founder and Chief Product Officer at data.world. “Adding technical lineage capabilities to our catalog allows users to get incredibly granular detail about data relationships, so they can feel more confident about how, when, and where they should be applying the data.”

New Actian Vector for Hadoop Enables Real-Time and Operational Analytics

Actian, a leader in hybrid cloud data warehousing and data integration, announced general availability for Actian Vector for Hadoop, its upgraded SQL database designed to provide high performance analytics and real-time data updates explicitly for Hadoop. Enabling real-time and operational analytics not previously feasible on Hadoop, the new Vector for Hadoop provides support for machine learning, optimized workload management and a seamless onramp to the cloud so enterprises can get the most out of their Hadoop investments.

Constraints on existing data lake infrastructures, combined with current market conditions and budgetary pressures, are preventing IT leaders from gaining the most value from their existing Hadoop data lake investments. With this challenge in mind, Actian developed Vector for Hadoop, making it possible for enterprises to take full advantage of the data sets at their disposal.

“Hadoop is designed for scale, but insights are needed at light-speed, which can mean the difference between business growth or failure,” said Emma McGrattan, SVP Engineering at Actian. “With that in mind, Actian Vector for Hadoop is making it possible for our customers’ existing Hadoop data lakes to take on new operational analytics challenges, which traditional Hadoop SQL applications have historically struggled to address.”

Pattern89 Announces Predict: AI that Simulates Ad Performance Before Campaigns Launch

Pattern89, an artificial intelligence marketing platform, announced the end of A/B testing with its new solution, Pattern89 Predict. Pattern89 Predict simulates creative ad elements to predict a brand’s top-performing ad combinations with over 95% accuracy. Pattern89 Predict is the new standard for marketers, replacing lengthy and expensive A/B tests to determine an ad’s success before launching. Quickly upload hundreds of ad elements to Predict’s interface, and it will analyze millions of combinations to predict which specific ad components will perform best on social media. Predict unlocks creative insights for over 49,000 ad dimensions.

“Pattern89 Predict finds winning digital creative, before ads run, meaning A/B testing is now a thing of the past. Efficiency and accuracy are more important than ever, and we’re giving marketers a clear path forward, while removing opportunities for losing time and money,” said R. J. Talyor, CEO and Founder of Pattern89. “Marketing teams can’t afford to waste anything. Our up-front predictions give marketers confidence in their creative, and ultimately, their social ad returns.” 

MariaDB Gives the Power of Analytics to Millions for Free

MariaDB® Corporation announced the general availability of MariaDB Community Server 10.5, a major new release that brings high-performance analytics to the hands of millions using the popular open source database. In a push to mainstream analytics and to make it as popular as MariaDB’s transactional engine, the company added a new, native columnar storage engine to the community database server and a new, native MariaDB Python Connector and Microsoft Power BI integration. Together, these new capabilities provide traditional MariaDB users and data scientists around the world with powerful analytics. All new analytical capabilities in MariaDB Community Server 10.5 are available for free with unrestricted use to broaden adoption of hybrid transactional and analytical processing, and modern analytical approaches.

“Data analytics are a core component of any modern application,” said Gregory Dorman, vice president of distributed systems and analytics, MariaDB Corporation. “Customers expect applications to give them insights, historical comparisons, predictions and automation in order to make better, smarter decisions and deliver a more intuitive experience. With the new native, fully integrated analytical capabilities, MariaDB community users now have an out-of-the-box, cost-effective solution for storing and accessing massive amounts of data in split seconds. It will transform how everyone around the world views and uses MariaDB to start building modern everyday applications.”

Enterprises Turn to Alation for Data Governance

Alation Inc., a leader in enterprise data intelligence solutions, launched a series of initiatives to embrace data governance as a strategic use-case as part of the company’s broader vision to transform the data catalog into a platform for a broad range of data intelligence solutions including data search & discovery, analyst productivity, data governance, data stewardship, analytics, and digital transformation. The initiatives include product roadmap enhancements, upcoming and recently-announced strategic partnerships, a new Active Data Governance methodology, and a series of marketing programs.

“We’re doubling down on data governance in response to both requests from our customers and the realization that the data catalog is the ideal leverage point for applying data governance,” said Satyen Sangani, co-founder and CEO of Alation. “About one-third of Alation customers already leverage the data catalog as a platform for data governance. By working with these customers, we’ve determined how to drive our product strategy, our alliances strategy, our marketing, and our professional services to best embrace data governance as a strategic use-case, all as part of our overall vision to transform the data catalog into a platform for a broad range of data intelligence solutions.”

Xilinx Selects Mipsology Zebra Software to Accelerate Alveo U50 FPGA

AI software innovator Mipsology announced that its Zebra neural network accelerating software has been integrated into the latest build of Xilinx’s Alveo U50 data center accelerator card, the industry’s first low profile adaptable accelerator with PCIe Gen 4 support. Zebra’s ease-of-use and high throughput enable the Alveo U50 to compute convolutional neural networks with zero effort. This is the latest in a series of Zebra-enhanced Xilinx boards that enable inference acceleration for a wide variety of sophisticated AI applications. Others include the Alveo U200 and Alveo U250 boards.

“Zebra delivers the highest possible performance and ease-of-use for inference acceleration,” said Ludo Larzul, Mipsology’s founder and chief executive officer. “With the Alveo U50, Xilinx and Mipsology are providing AI application developers with a card that excels across multiple apps and in every development environment.”

Skillsoft Opens Data Wrangling with Python Bootcamp to All Learners

Skillsoft has announced its first ever Data Science Bootcamp delivered in partnership with Data Society, a leader in practical data science training, and hosted in Skillsoft’s intelligent learning experience platform, Percipio. Instructors from Data Society, who conducted more than 3,000 hours of training in 2019 with customers including NASA, Discover, and U.S. Department of State, will be leading the bootcamp. Available at no cost to all interested learners, the Data Wrangling with Python Bootcamp enables attendees to interact with the instructor and receive high-quality, personalized instruction while collaborating with others in a virtual classroom setting. 

Registration for Skillsoft’s four-day Data Wrangling with Python Bootcamp, which takes place July 20-23, is open to anyone familiar with Python. All sessions will be recorded and available on demand for those unable to attend live, and all registrants also gain access to more than 7,000 courses through a 90-day Percipio trial. At a time when in-person trainings and events are halted, the Bootcamp delivers high-quality, personalized virtual instruction, which is critical for retaining knowledge on such technical topics, along with access to on-demand microlearning in a single solution.

“By partnering with Skillsoft, we are collectively breaking down barriers in data science education and democratizing acquisition of these increasingly essential skills,” said Merav Yuravlivker, CEO, Data Society. “We’ve crafted Skillsoft’s Data Wrangling with Python Bootcamp to ensure that learners receive a highly interactive experience and ample opportunities to apply practical programming skills that are foundational for data scientists to explore and analyze data.”

Olea Edge Analytics Releases EdgeWorks Platform 2.0, the Fastest, Most Accurate Way to Monitor Water Meter Performance

Olea Edge Analytics, an intelligent edge computing platform for the water utility industry, announced the release of EdgeWorks Platform 2.0, combining blockchain technology, AI and machine learning to provide the most advanced solution for water billing, delivery and conservation. This major product announcement includes a collection of new features that will help cities and water utilities find millions of dollars in revenue on broken, worn or incorrectly sized commercial water meters. With improved power management, a new bypass monitoring solution and a new pressure management solution, EdgeWorks Platform 2.0 notifies customers of an issue more quickly than ever and reduces the costs of data transit and storage. 

“Utility workers spend a lot of time and money identifying and diagnosing issues with their current set of tools,” said Dave Mackie, Olea Edge Analytics’ CEO. “EdgeWorks Platform 2.0 lets them refocus manpower and resources on fixing meters to return them to accuracy and allowing them to recover lost revenue.”

Pepperdata Announces Managed Autoscaling to Reduce Cloud Costs

Pepperdata, the leader in Analytics Stack Performance (ASP), announced managed autoscaling in the cloud with Pepperdata Capacity Optimizer version 6.3. While autoscaling provides the elasticity customers demand for their big data workloads, it can lead to runaway costs. Capacity Optimizer intelligently augments autoscaling to ensure all nodes are fully utilized before additional nodes are created, eliminating waste and reducing costs.

“Even with the best cloud migration strategy and dedicated attempts to curb costs, the cloud makes managing resources more difficult,” says Ash Munshi, CEO Pepperdata. “But, by leveraging machine learning and managing infrastructure in real time, IT operations teams automatically recapture wasted capacity and significantly reduce their costs.”

erwin Releases New Version of Industry-Defining Data Modeler to Support Digital Transformation, Cloud Migration and Infrastructure Modernization

erwin, Inc., the data governance company, announced the availability of the latest version of its data modeling solution, erwin Data Modeler (erwin DM). The update features new metadata-driven automation capabilities and facilitates moving legacy, premise-based data sources to modern cloud platforms to ensure proper data governance.

erwin DM provides metadata and schema visualization, a well-governed and integrated process for defining/designing data assets of all types, and centralization and integration of business and semantic metadata – all to accelerate data governance and increase enterprise data literacy and collaboration. Automated schema design and migration helps organizations adopt modern DBMS platforms and data warehouse architectures.

“As a result of COVID-19, businesses around the world are drastically stepping up their digital transformation efforts, including moving their legacy data to the cloud to ensure it’s more available for decision-making,” says erwin CEO Adam Famularo. “So we continue to invest in the technology we pioneered to ensure customers can understand, design and deploy new data sources, plus support data governance and intelligence efforts, to further reduce data management costs and data-related risks, while improving the quality and agility of an organization’s overall data capability.”

Immuta Launches Native Offering for Databricks With Enhanced Security and Collaboration Features 

Immuta, the automated data governance company, announced an enhanced platform integration with Databricks, the data and AI company. Immuta for Databricks – a new, native offering for Databricks customers – enhances data engineering productivity and data security by automating fine-grained access control and privacy protection natively within Databricks and Delta Lake. Immuta’s latest release unlocks new data science opportunities and outcomes, simplifies regulatory compliance, and further enhances security and privacy controls.

“Databricks provides unmatched scalability, flexibility, cost savings, and performance. However, strict data protection rules and regulations create compliance and technical limitations when it comes to utilizing protected data for analytics and data science,” said Steve Touw, CTO, Immuta. “Data teams are exposed to new levels of risk, making it challenging to manage and prepare sensitive data for data scientists to access in a compliant, self-service way, but also for those same analysts to securely share and publish their work. The latest Immuta for Databricks enhances automation capabilities required to overcome these challenges.”

Yellowbrick Data Achieves Next Level of Scale for Hybrid Cloud Data Warehouses with Portfolio Expansion

Hybrid cloud data warehouse company Yellowbrick Data announced it has achieved the next level of scale for its customers by offering multiple petabyte (PB) capacity on its new hybrid data warehouse 3-chassis configuration. The most recent milestone in the evolution of the fastest, most performant hybrid cloud data warehouse in the industry, the 3-chassis configuration demonstrates Yellowbrick’s commitment to staying ahead of customer demands as businesses expand their infrastructure and workloads.

Yellowbrick’s 3-chassis product is an extension of the company’s high-performance family of integrated software/hardware solutions, which power a unique hybrid cloud data warehouse that can be consumed via any private cloud and/or any major public cloud. This evolution offers unparalleled, single-warehouse capacity with support for 3.6PB of user data in a 14U rack form factor. When fully populated, this instance has a maximum node count of 45 in 14U and also supports 45 concurrent, single-worker queries on one system. Other competitive solutions cost significantly more, particularly when data center cooling and real estate costs are taken into account.

“Yellowbrick is one of the only data warehouse providers to offer true horizontal scalability, eclipsing other competitive offerings in terms of performance, scale, and ease of extensibility,” said Nick Cox, head of product at Yellowbrick. “With the 3-chassis configuration, we’re delivering dramatically more storage and performance in a very small form factor. It’s the most recent example of our relentless commitment to innovation—something that’s been part of the company culture and our hybrid data warehouse since day one.”

Cambridge Semantics Introduces Geospatial Analytics within its AnzoGraph® DB

Cambridge Semantics, a leading provider of graph-driven data integration and analytics software, announced the addition of geospatial analytics within its award-winning AnzoGraph® DB. This new capability combines the power of scalable location analytics with the power of relationships and analytics in a graph database.

Many knowledge graph projects today include the need to know about people, things, and events in the real world and where they happened. Adding geospatial capabilities to AnzoGraph DB enables users to determine a location and its relationship to borders, regions, zones, or other places – and then help users perform calculations and queries about those locations and their relationships.

The use cases for geospatial analytics are varied and extensive. In the public sector, communities can use it for managing emergency services, mapping, city planning and predicting crime.  In financial services, geospatial is key for assessing risks and determining risk zones.  Geospatial is often a key enabler for many other business processes, including IoT projects, tax fee assessment, setting up delivery and sales zones, and optimizing routes. In addition, geospatial analytics can aid a variety of COVID-19 applications, including contact tracing.

“The integration of geospatial analytics within AnzoGraph DB is another example of how Cambridge Semantics continues to push the boundaries of knowledge graph innovation and to show what is possible with graph-based analytics,” said Steve Sarsfield, Vice President of Product, Cambridge Semantics. “Further, the extensive applications for geospatial analytics will help our partners and corporate innovation teams leverage geospatial for their large-scale, location-based intelligence applications.” 

Sigma Computing Unveils Major Updates to Cloud Solution, Powering Community-driven Analytics and Business Intelligence

Sigma Computing, an innovator in cloud-native analytics and business intelligence (A&BI), is powering a community-driven approach to A&BI with the latest version of its solution. The new interactive dashboards transform the A&BI process by enabling seamless collaboration between business users and data experts and allowing the domain experts to align A&BI efforts with business needs. Sigma’s new Application Embedding capability allows dashboards to be embedded into applications with inherited access roles and permissions, extending the value of Sigma throughout the entire data ecosystem safely and securely.

“Every company wants to be data-driven, but that is just a pipe dream until data exploration, analysis, and business intelligence are accessible to everyone,” said Rob Woollen, co-founder and CTO, Sigma Computing. “At Sigma, we believe the organizations that derive the most value from A&BI solutions are the ones that use them to work iteratively across business and technical teams. Unlike other solutions, Sigma gives everyone, not just data teams, the ability to explore, model, visualize, and enrich their data in real-time, at cloud-scale, so that they can find the answers in their data that drive real competitive advantage more quickly.”

The MLflow Project Joins Linux Foundation

The Linux Foundation, the nonprofit organization enabling mass innovation through open source, announced that MLflow, an open source machine learning (ML) platform created by Databricks, will join the Linux Foundation. Since its introduction at Spark + AI Summit two years ago, MLflow has experienced impressive community engagement from over 200 contributors and is downloaded more than 2 million times per month, with a 4x annual growth rate in downloads. The Linux Foundation provides a vendor neutral home with an open governance model to broaden adoption and contributions to the MLflow project even further.

“The steady increase in community engagement shows the commitment data teams have to building the machine learning platform of the future. The rate of adoption demonstrates the need for an open source approach to standardizing the machine learning lifecycle,” said Michael Dolan, VP of Strategic Programs at the Linux Foundation. “Our experience in working with the largest open source projects in the world shows that an open governance model allows for faster innovation and adoption through broad industry contribution and consensus building.”
