“Above the Trend Line” – Your Industry Rumor Central for 11/19/2021

Above the Trend Line: your industry rumor central is a recurring feature of insideBIGDATA. In this column, we present a variety of short time-critical news items grouped by category such as M&A activity, people movements, funding news, financial results, industry alignments, customer wins, rumors and general scuttlebutt floating around the big data, data science and machine learning industries including behind-the-scenes anecdotes and curious buzz.

2022 State of Data Engineering: Emerging Challenges with Data Security & Quality

The 2022 Data Engineering Survey, from our friends over at Immuta, examined the changing landscape of data engineering and operations challenges, tools, and opportunities. The modern data engineering technology market is dynamic, driven by the tectonic shift from on-premise databases and BI tools to modern, cloud-based data platforms built on lakehouse architectures.

Book Review: Synthetic Data for Deep Learning

“Synthetic Data for Deep Learning,” by Sergey I. Nikolenko (published by Springer), represents a very good academic treatment of the subject. But what gives the book more street cred is the fact that the author is also Chief Research Officer for Synthesis AI, a start-up company pioneering this accelerating field. It’s nice to know the book represents both the academic and practical perspectives of the topic.

Improving Your Odds of ML Success with MLOps

In this special guest feature, Harish Doddi, CEO, Datatron, discusses what CEOs need to understand about using MLOps. He also shares insights on how to use MLOps to gain competitive advantage and provides tips on how to implement it.

Why the World’s Biggest Brands Are Betting Big on Graph

In this contributed article, Todd Blaschka, Chief Operating Officer at leading graph analytics platform TigerGraph, discusses how graph has evolved to support digital transformation, AI, and machine learning, and how it has become a major competitive differentiator among the world’s leading companies. Organizations in virtually every industry, from financial services and healthcare to retail and manufacturing, use graph to understand their customers, reduce fraud risk, and optimize their global supply chains.

eBook: 101 Ways to Use Third-Party Data to Make Smarter Decisions

To guide you in becoming a data-driven organization, AWS Data Exchange has created a new eBook, 101 Ways to Use Third-Party Data to Make Smarter Decisions. This innovative resource is designed as a broad compilation of use cases submitted by AWS Marketplace data providers.

Yandex Upgrades Open-source Machine Learning Library CatBoost

Yandex, a technology company that builds intelligent products and services powered by machine learning, announced CatBoost 1.0.0, a major version of its open-source machine learning library. The new version goes far beyond a run-of-the-mill upgrade and is the culmination of four years of work by the Yandex Team.

5 Easy Steps to Make Your Data Business Ready

In this contributed article, Ayush Parashar, Vice President of Engineering at Boomi, discusses five core components of a strong data strategy so businesses can derive insights from and act on their data. As the uses for data continue to grow, businesses must ensure their data is actually usable.

AI, Healthcare, and C-3PO

In this special guest feature, David Sellars, CEng, Principal, Product Innovations at DrFirst, discusses how artificial intelligence has been lauded as the coming revolution in healthcare. But if you ask a clinician in your local hospital or pharmacy if they use AI in their regular workflows, you are likely to find that AI is curiously absent.

2022 Trends in Data Science: Newfound Ease and Accessibility

In this contributed article, editorial consultant Jelani Harper discusses some important data science trends for 2022, including the convergence of supervised and unsupervised learning and the structuring of unstructured data, all of which point toward newfound ease and accessibility. Data science has always been characterized by innovation and opportunity.