insideBIGDATA Latest News – 11/21/2022


In this regular column, we’ll bring you all the latest industry news centered around our main topics of focus: big data, data science, machine learning, AI, and deep learning. Our industry is constantly accelerating, with new products and services being announced every day. Fortunately, we’re in close touch with vendors from this vast ecosystem, so we’re in a unique position to inform you about all that’s new and exciting. Our massive industry database is growing all the time, so stay tuned for the latest news items describing technology that may make you and your organization more competitive.

TigerGraph Delivers Graph to All with Latest Cloud Offering; New Visualization and Machine Learning Features Simplify Graph Technology Adoption for Deeper Business Insights

TigerGraph, provider of a leading advanced analytics and ML platform for connected data, announced the latest version of TigerGraph Cloud, the native parallel graph database-as-a-service, highlighted by two powerful new tools for visual graph analytics and machine learning. TigerGraph Insights, an intuitive visual graph analytics tool for users to search and explore meaningful business insights, and ML Workbench, a powerful Python-based framework to accelerate the development of graph-enhanced machine learning applications, are available today to TigerGraph Cloud users.

“TigerGraph has long been committed to both democratizing graph and pushing the limits of industry innovation. Our latest release of TigerGraph Cloud does both, helping developers and data scientists unlock the full potential of their data,” said Jay Yu, vice president of product and innovation at TigerGraph. “The addition of visual graph analytics and machine learning tools to our fully managed graph database-as-a-service offering, which is available on all major cloud platforms, lowers the barrier to graph entry even further. Now, enterprises of all sizes can supercharge their data analytics and machine learning projects at scale with speed, asking and answering critical business questions that move the needle.”

Cerebras Unveils Andromeda, a 13.5 Million Core AI Supercomputer that Delivers Near-Perfect Linear Scaling for Large Language Models

Cerebras Systems, a pioneer in accelerating artificial intelligence (AI) compute, unveiled Andromeda, a 13.5 million core AI supercomputer, now available and being used for commercial and academic work. Built with a cluster of 16 Cerebras CS-2 systems and leveraging Cerebras MemoryX and SwarmX technologies, Andromeda delivers more than 1 Exaflop of AI compute and 120 Petaflops of dense compute at 16-bit half precision. It is the only AI supercomputer to ever demonstrate near-perfect linear scaling on large language model workloads relying on simple data parallelism alone.

With more than 13.5 million AI-optimized compute cores and fed by 18,176 3rd Gen AMD EPYC™ processors, Andromeda features more cores than 1,953 Nvidia A100 GPUs and 1.6 times as many cores as the largest supercomputer in the world, Frontier, which has 8.7 million cores. Unlike any known GPU-based cluster, Andromeda delivers near-perfect scaling via simple data parallelism across GPT-class large language models, including GPT-3, GPT-J and GPT-NeoX. 

Near-perfect scaling means that as additional CS-2s are used, training time is reduced in near-perfect proportion. This includes large language models with very large sequence lengths, a task that is impossible to achieve on GPUs. In fact, GPU-impossible work was demonstrated by one of Andromeda’s first users, who achieved near-perfect scaling on GPT-J at 2.5 billion and 25 billion parameters with long sequence lengths (an MSL of 10,240). The users attempted to do the same work on Polaris, a 2,000 Nvidia A100 cluster, and the GPUs were unable to do the work because of GPU memory and memory bandwidth limitations.
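To make the idea of near-perfect linear scaling concrete, here is a minimal sketch of the arithmetic. The function names and the hour figures are hypothetical and chosen only for illustration; they are not Cerebras measurements.

```python
def ideal_training_time(single_system_hours: float, num_systems: int) -> float:
    """Training time if scaling were perfectly linear: N systems cut the time N-fold."""
    if num_systems < 1:
        raise ValueError("num_systems must be at least 1")
    return single_system_hours / num_systems


def scaling_efficiency(single_system_hours: float, num_systems: int,
                       measured_hours: float) -> float:
    """Ratio of ideal to measured time; 1.0 means perfectly linear scaling."""
    return ideal_training_time(single_system_hours, num_systems) / measured_hours


# Hypothetical job: 160 hours on one system, run on a 16-system cluster.
ideal = ideal_training_time(160.0, 16)             # ideal time on 16 systems
efficiency = scaling_efficiency(160.0, 16, 12.5)   # efficiency if it actually took 12.5 hours
```

An efficiency close to 1.0 is what "near-perfect" scaling refers to; lower values mean that adding systems yields diminishing returns.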

Neo4j Announces General Availability of its Next-Generation Graph Database Neo4j 5

Neo4j®, a leader in graph technology, announced the general availability of Neo4j 5, the next-generation cloud-ready graph database. Neo4j 5 widens the performance lead of native graphs over traditional databases while providing easier scale-out and scale-up across any deployment, whether on-premises, in the cloud, hybrid, or multi cloud. The result empowers organizations to more quickly create and deploy intelligent applications at large scale and achieve greater value from their data.

“Graph technology adoption is accelerating as organizations seek better ways to leverage connections in data to solve complex problems at scale,” said Emil Eifrem, CEO and Co-founder of Neo4j. “We designed Neo4j 5 to deliver the type of scalability, agility, and performance that enable organizations to push the envelope on what’s possible for their data and their business.”

Fauna Adds Intelligent Routing To Globally Distributed Serverless Database

 Fauna, the distributed document-relational database delivered as a cloud API, announced Intelligent Routing, providing developers a single endpoint to access any database, anywhere across Fauna’s global footprint. All applications built on or migrated to Fauna utilize Intelligent Routing to scale applications globally while staying compliant with data residency requirements. This powerful new offering determines the most efficient way to route requests and queries to databases across geographies and cloud providers. 

Developers face challenges navigating data sovereignty, security, and consistency, especially as their applications scale across regions. Typically, addressing each requires manual intervention both up front and over time, adding costs and decreasing productivity. Fauna eliminates this heavy lifting. The net result is that developers can spend more time creating and innovating on their applications, easily scale locally or across regions, and satisfy data residency requirements.

“Thousands of development teams around the world have picked Fauna based on our unique document-relational model and low operational overhead,” said Eric Berg, CEO of Fauna. “Intelligent Routing enables us to offer developers a single endpoint to access any database across Fauna’s global footprint. This makes it easy for teams to eliminate any data-related friction as they scale applications across regions and the globe.”

Scality ships RING9, software for unbreakable hybrid-cloud data storage

Scality announced RING9 — the ninth generation of its leading RING scale-out file and object storage software — a solution that allows IT teams to build and run a modern hybrid-cloud data storage infrastructure with higher performance and efficiency. RING9 is built on major investments in Scality’s flagship RING solution that enables IT teams to:  

  • Fully leverage flash media through tiering and dynamic data protection policies
  • Modernize the monitoring stack with Prometheus tools and APIs
  • Streamline integration with API extensions to ecosystem partners such as Veeam and VMware Cloud Director (VCD)

With these new capabilities, RING9 further enhances and simplifies scale-out file and object storage for enterprises building private and hybrid cloud storage services with comprehensive AWS S3 and IAM compatibility.

“As IT teams embrace the modern stack architecture, they need solutions that eliminate challenges in enterprise data management and storage in the hybrid cloud,” said Paul Speciale, chief marketing officer at Scality. “Scality RING9 represents a major step change for the entire storage industry. Users gain improvements in storage efficiency through internal flash-to-disk tiering and dynamic data protection policies. For modern cloud-based data centers, RING9 fits naturally into the monitoring and observability ecosystem with support for Prometheus and Elastic Cloud. RING9 expands the addressable market and use-case workloads for RING further into the high-performance arena.”

Launch of SoundHound Dynamic Interaction Marks Groundbreaking New Era For Human-Computer Interaction

SoundHound AI, Inc. (Nasdaq: SOUN), a global leader in voice artificial intelligence, introduced Dynamic Interaction™, a category-level breakthrough in conversational AI that raises the bar for human-computer interaction by not only recognizing and understanding speech, but also responding and acting in real-time. Where existing voice technology requires wake words and relies on turn-taking with awkward pauses to process requests, Dynamic Interaction uses the twin technologies of fragment parsing – which breaks speech down to partial-utterances and processes them in real-time – and full-duplex audio-visual integration to create an instantaneous, next-generation experience. 

“As the Dynamic Interaction demo shows, this technology is incredibly user-friendly and precise. Consumers won’t have to modify how they speak to the voice assistant to get a useful response – they can just speak as naturally as they would to a human. As an added bonus they’ll also have the means to instantly know and edit registered requests,” says Keyvan Mohajer, Co-Founder and CEO of SoundHound. “In our 17-year history of developing cutting-edge voice AI, this is perhaps the most important technical leap forward. We believe, just like how Apple’s multi-touch technology leapfrogged touch interfaces in 2009, this is a significant disruption in human-computer interfaces.”

New Study Reveals Content Governance is a Top Priority for Organizations Managing Critical Data

Rocket Software, a global technology leader that develops enterprise software for some of the world’s largest companies, released its 2022 Survey Report: Content Management – The Movement to Modernization, based on a survey of over 500 corporate IT and line-of-business professionals across multiple industries in the United States, United Kingdom and Asia-Pacific regions. The report revealed that content and system security are paramount when it comes to content management, with 60% of respondents citing security as the most important feature in a content management solution. The findings also highlight the power and effectiveness of integrated automation to manage data that is not easily queried or organized.

“Failure to effectively manage content poses a great risk to organizations not prepared to handle vast amounts of data, which must be handled in a secure and compliant manner,” said Chris Wey, President, Data Modernization, Rocket Software. “Organizations need to at once be able to reap the most value from their data and ensure they are compliant with the ever-changing regulatory market. Robust content management solutions are the answer.”

Rockset Achieves 84% Faster Performance For Real-Time Analytics With Intel Xeon Scalable Processors

Rockset, the Real-time Analytics Database Built for the Cloud, unveiled a new release that leverages 3rd Gen Intel® Xeon® Scalable processors with built-in AI accelerators, and also announced a strategic collaboration with Intel. Now part of the Intel Disruptor Program, Rockset works closely with Intel for bi-directional roadmap alignment to build solutions that deliver best-in-class price-performance to customers, empowering organizations to scale real-time analytics efficiently in the cloud.

Rockset’s real-time analytics database is built for sub-second analytics on streaming data. Hundreds of modern data applications, including personalization engines, logistics tracking, game monetization, anomaly detection and IoT applications are powered by Rockset. Built for the cloud-native era, Rockset achieves maximum compute efficiency by leveraging its Converged Index™ and by separating compute and storage in the cloud. With Rockset, data and engineering teams can iterate faster and more efficiently scale their data applications in the cloud.

“When compared to cloud data warehouses that are designed for business intelligence workloads, Rockset offers faster query performance at lower compute cost, because it is designed for developers building data applications,” said Venkat Venkataramani, co-founder and CEO of Rockset. “The 3rd Gen Intel Xeon Scalable processor has enabled us to push the limits of our real-time analytics database, providing customers up to 84% more throughput for data applications.”

Fractal Announces Launch of 

Fractal, a global provider of artificial intelligence and advanced analytics solutions to Fortune 500® companies, announced the launch of Building on the company’s existing AI capabilities, is the only purpose-built interconnected AI solution for consumer goods, manufacturing, and retail today. 

Businesses are as dynamic and complex as they have ever been in the wake of the Covid-19 pandemic and subsequent digital transformation surge. This means that consumer goods, manufacturing and retail brands are having to rely more heavily on their technology to unlock value than ever before – with the AI retail market alone set to hit $31 billion by 2028. However, because of the immense fragmentation that exists across the AI ecosystem, businesses in the retail and CPG categories are unable to drive the business impact they are looking for from their technology stacks. tackles these silos head-on by providing the only end-to-end AI platform, unifying demand planning, sales and distribution, inventory planning and pricing and promotion – all under one umbrella.

“Business success today is defined by how quickly and seamlessly brands are able to make decisions,” said Mohit Agarwal, CEO, “Unfortunately, brands – especially in consumer-packaged goods – find their efforts constantly undermined by disconnected technology that inhibits their success rather than empowering it. Without interconnectedness, the future that AI technologies have promised since they debuted decades ago would still be distant, instead of right here, right now. looks to solve these challenges by driving interconnectedness through its autonomous decisioning platform.”

Alluxio Reimagines Architecture for Multi-Tenant Environments at Scale

Alluxio, the developer of the open source data orchestration platform for data driven workloads such as large-scale analytics and AI/ML, announced the immediate availability of version 2.9 of its Data Orchestration Platform. This new release strengthens its position as the key layer between compute engines and storage systems by delivering support for a scale-out, multi-tenant architecture with a new cross-environment synchronization feature, enhanced manageability with significant improvement in the tooling and guidelines for deploying Alluxio on Kubernetes, and improved security and performance with a strengthened S3 API and POSIX API.

“Tenant-dedicated satellite clusters have become more common while architecting data platforms,” said Adit Madan, Director of Product Management, Alluxio. “Alluxio’s ability to actively synchronize metadata across multiple environments is significant, making the adoption of such an architecture easier than ever.”

Redpanda Brings the Fastest, Most Resource-Efficient Apache Kafka® API to the Cloud

Streaming data pioneer Redpanda launched the general availability of its Redpanda Cloud managed service. Redpanda Cloud delivers 10x faster tail latencies than other streaming data platforms and easily scales to tens of gigabytes per second. Complete with developer tools, connectors and SOC 2 Type 1 certification necessary to run mission-critical Kafka workloads at scale, Redpanda Cloud also comes with an innovative cloud-first storage capability that significantly reduces data storage costs.

“At Redpanda we’ve been all about solving the limitations of legacy streaming data platforms,” said Alex Gallego, founder and CEO of Redpanda. “We started by reducing the complexity of these distributed systems, then tackled the inconsistent performance that plagues Java-based approaches. With our new cloud-first storage we’ve solved the high costs, which were perhaps the biggest barrier to adopting streaming data. Now, with Redpanda Cloud we bring all our innovations together in an easy-to-use fully managed cloud service that is both faster and less expensive for high-throughput use cases.”

Datadobi’s Latest StorageMAP Release Helps Companies Do Much More With Object Storage

Datadobi, a leader in unstructured data management, announced enhancements to its multi-vendor, multi-cloud unstructured data management platform StorageMAP. The 6.3 release introduces the ability to copy network-attached storage (NAS) data to any S3-compatible object storage system. The new file-to-object copy functionality adds to StorageMAP’s ability to help IT leaders archive, pipeline, and replicate file data to S3.

“We live in an unstructured data-driven world. While a valuable asset to a business, the sheer magnitude and growth of unstructured data across disparate data storage estates brings risk and often wholly unnecessary costs for companies,” said Carl D’Halluin, CTO at Datadobi. “The new additions to StorageMAP allow IT leaders to manage more of their unstructured data across on-premises and the cloud, giving companies the ability to make quick and accurate informed decisions about their data and where it is stored.”

Aunalytics Introduces Enterprise Analytics as a Managed Service for Enterprises in Secondary and Tertiary Markets

Aunalytics, a leading data management and analytics company delivering Insights-as-a-Service for mid-market businesses, announced it has initiated Enterprise Analytics, a managed service comprised of experts in data analytics, data engineering, artificial intelligence (AI) and machine learning, coupled with the tools and technology required to help enterprise businesses accomplish their objectives and achieve success.

“We understand the challenges that enterprises in secondary and tertiary markets face when striving for digital transformation, which is a requirement for all organizations that want to remain competitive and thrive in today’s business environment,” said Katie Horvath, Chief Marketing Officer, Aunalytics. “Organizations partner with Aunalytics for both the technology and expertise they need to realize value from their data. We extend their in-house teams so they have the consistent long-term expertise it takes to achieve ROI and business outcomes from data analytics.”

NNAISENSE Launches ARC, An Artificial General Intelligence (AGI) Platform, Marking the Beginning of Industry 5.0

Swiss AI firm NNAISENSE has announced the launch of Adaptive Rational Core (ARC), an Artificial General Intelligence (AGI) solution for industrial manufacturing, logistics and smart cities. 

Industrial AI is currently limited by the simplicity and narrow scope of automation, which means that factory downtime for reconfiguration is an increasing cost for companies dealing with frequent change and supply chain challenges. Additionally, the optimization process of low-level automated machinery must be overseen by expert personnel which is expensive and time consuming. 

ARC is capable of automating automation, and can therefore model and adapt an entire automated system to meet new operational goals. In addition to rapid on-the-job data-collection, greater operational efficiency and actionable insights which can be shared with stakeholders, ARC enables industrial manufacturers to adapt quickly to change at much lower costs – eliminating factory downtime.

ARC is based on developmental cognitive robotics, meaning that it learns step by step. Starting with little to no knowledge, ARC continually learns from experience, maintaining a white-box model of a plant, city, or any system, which it interprets for predicting and planning. ARC performs work on command: it can be re-tasked at any time without the downtime of offline re-training, verification, and redeployment. ARC’s use cases include industrial automation engineering, logistics (including transportation), and smart cities – but can be adapted to work in other use cases as well.

“When it comes to industrial automation, the impact of changes on the plant is learned on the fly and does not incur costly reprogramming, re-training or factory downtime,” said Bas Steunebrink, Co-founder & Director of Artificial General Intelligence at NNAISENSE. “ARC is not only a huge efficiency driver for a variety of industries, but also marks a move away from narrow AI that has dominated the industrial landscape, towards applied AGI and Industry 5.0 – resulting in highly scalable autonomous systems that are closer to human beings in terms of common sense, with a continually-deepening understanding of the processes under their control.”


Unravel Data, the DataOps observability platform built to meet the needs of modern data teams, announced the general availability of its 2022 Fall Release of the Unravel Platform. With this new release, users of the Unravel Platform are now able to leverage several new capabilities including support for Google Cloud BigQuery and Cost 360 for Amazon EMR. These new capabilities are designed to help users boost the efficiency of their public cloud spend, simplify troubleshooting across their big data ecosystem, and improve the overall performance of their business-critical data applications.

“Data teams have a clear mandate to ensure that the data pipelines that support their data analytics programs are fully optimized, running efficiently and staying within budget. However, given the complexity of their data ecosystem, getting answers about the health and performance of their data pipelines is harder than ever,” said Kunal Agarwal, founder and CEO of Unravel Data. “Whether they’re migrating more workloads to platforms like BigQuery or Amazon EMR or already running them as part of their data ecosystem, enterprise data teams are struggling to control costs and accurately forecast their resource requirements, which is ultimately impacting their ability to execute on their strategic data analytics initiatives. With this latest edition, Unravel customers will be better able to gain the full-stack observability they need to optimize performance and manage their costs according to budget.”

Credo AI Announces New Capabilities to Bring Transparency Through Comprehensive Assessment and Reporting

Credo AI, the governance company operationalizing Responsible AI, today announced the general availability of new assessment and reporting capabilities in its Responsible AI Governance Platform. These enhancements will enable enterprises to easily meet new regulatory requirements and customer demands for governance artifacts, reports and disclosures on their development and use of AI, with a focus on assessing and documenting Responsible AI issues like fairness and bias, explainability, robustness, security, and privacy. This release is the latest addition to Credo AI’s software that helps enterprises manage AI risk and compliance at scale. The new feature set allows organizations to standardize and automate reporting of Responsible AI issues across all of their AI/ML applications. 

“Credo AI is building the governance layer that will empower organizations to ensure that all of their internal and third-party AI is meeting business, regulatory and ethical requirements,” said Navrina Singh, founder and CEO of Credo AI. “This product release is the next step in our journey to bringing context-focused governance and accountability to AI. Not only will this solution help companies bring their AI into compliance, but it also ensures that their AI is working in alignment with human-centered values.”

PlanetScale Boost Solves Cache Invalidation – ‘One of the Two Hardest Problems in Computer Science’ – Letting Users Accelerate Specific Queries by More Than 100X 

PlanetScale, the serverless database innovator powered by MySQL and Vitess, announced PlanetScale Boost, a new product that improves query performance by more than 100X and eliminates the need for external database caching. Developed at PlanetScale in less than six months, Boost adds to the competitive advantage that customers have when they build on top of PlanetScale, eliminating weeks of infrastructure and custom application work.

Unlike other platforms that leverage off-the-shelf open source technology or build complex infrastructure to mimic this effect, Boost is novel technology based on cutting-edge research applied to solving real-world problems. Sitting next to, but apart from, the database, it operates transparently with no transactional overhead and none of the risk to stability of the platform that traditional caches introduce.    

“As a wise person once said, ‘There are only two hard things in Computer Science: cache invalidation and naming things.’ The reason this has never been done before is because query caching is very difficult,” said Nick van Wiggeren, vice president of engineering for PlanetScale. “PlanetScale Boost is the first cache you don’t have to invalidate, because it’s not technically a cache. PlanetScale materializes the data for boosted queries in a data structure that lets us break down the query into the fundamental pieces, allowing us to keep and update the data that you use right in memory to serve parts of those queries very, very quickly.”
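The general idea of keeping a query result in memory and updating it on each write, rather than invalidating a cached copy, can be sketched as follows. This is a toy illustration of incremental materialization for a single grouped count, not PlanetScale's implementation; the class and method names are hypothetical.

```python
from collections import defaultdict


class MaterializedCount:
    """In-memory result of a `SELECT key, COUNT(*) ... GROUP BY key` style query.

    Each write adjusts the stored result directly, so reads never need
    an invalidate-and-recompute step.
    """

    def __init__(self) -> None:
        self._counts: defaultdict[str, int] = defaultdict(int)

    def on_insert(self, key: str) -> None:
        # A new row for `key` raises that group's count by one.
        self._counts[key] += 1

    def on_delete(self, key: str) -> None:
        # Removing a row lowers the count; drop the group when it reaches zero.
        if self._counts.get(key, 0) <= 0:
            raise KeyError(f"no rows recorded for {key!r}")
        self._counts[key] -= 1
        if self._counts[key] == 0:
            del self._counts[key]

    def read(self, key: str) -> int:
        # Served straight from memory without touching the base table.
        return self._counts.get(key, 0)


view = MaterializedCount()
view.on_insert("alice")
view.on_insert("alice")
view.on_insert("bob")
view.on_delete("alice")
```

In this sketch a read is a single dictionary lookup, and the cost of keeping the result current is paid in small steps at write time.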

IBM Launches New Software to Break Down Data Silos and Streamline Planning and Analytics

IBM (NYSE: IBM) has announced new software designed to help enterprises break down data and analytics silos so they can make data-driven decisions quickly and navigate unpredictable disruptions. IBM Business Analytics Enterprise is a suite of business intelligence planning, budgeting, reporting, forecasting, and dashboard capabilities that provides users with a robust view of data sources across their entire business. Along with IBM Planning Analytics with Watson and IBM Cognos Analytics with Watson, this suite also includes a new IBM Analytics Content Hub that helps streamline how users discover and access analytics and planning tools from multiple vendors in a single, personalized dashboard view.

“Businesses today are trying to become more data-driven than ever as they navigate the unexpected in the face of supply chain disruptions, labor and skills shortages and regulatory changes,” said Dinesh Nirmal, General Manager of Data, AI and Automation, IBM. “But to truly be data-driven, organizations need to be able to provide their different teams with more comprehensive access to analytics tools and a more complete picture of their business data, without jeopardizing their compliance, security or privacy programs. IBM Business Analytics Enterprise offers a way to bring together analytics tools in a single view, regardless of which vendor they come from or where the data resides.”

UnifabriX Blasts Out of Stealth Mode As Industry’s First Performance-Focused CXL System

UnifabriX, the company offering a memory pooling system for memory tiering and disaggregation, is announcing that it is exiting stealth mode to reveal the full scope of its unique CXL product. UnifabriX’s CXL silicon box hardware offers HPC data centers a lifeline for overcoming the computing-memory gap, which unlocks significant performance gains. With low latency, high-speed memory pooling, and extra memory bandwidth, UnifabriX is bringing the computing industry an entirely novel way to structure data centers and solve performance challenges.

In recent years, memory technology improvements have scaled at a much slower pace than processors. While server processor core counts grow from 33% to 50% on a yearly cadence, memory channel bandwidth has grown considerably slower. This mismatch in processor and memory improvements is even more acute when it comes to HPC, AI, and big-data workloads, which require special attention and further innovation by data centers to be able to facilitate high-level performance. UnifabriX’s CXL solution aims to solve the issue by enabling data centers to unlock the full speed, density, and scale of their infrastructure while achieving greater elasticity, scalability, and pooling. 

“Setting out to achieve low latency is a difficult task that UnifabriX has taken in its stride. We are setting out to prove that it is possible to break the laws of physics and reduce latency,” says Ronen Hyatt, co-founder and CEO of UnifabriX. “UnifabriX’s CXL solution will exceed HPC’s performance demands, changing the design of data centers to address the global, growing high-pressure demands for more memory. UnifabriX is excited to exit stealth mode with a strong, all-encompassing and tested solution that overcomes a multitude of current HPC and data center problems.”
