
What to Avoid When Solving Multilabel Classification Problems

In this contributed article, April Miller, a senior IT and cybersecurity writer for ReHack Magazine, suggests that if you are working with a model on a multilabel classification problem, you will likely run into something in need of fixing. Here are a few common issues you may encounter and what to avoid when solving them.

Research Highlights: R&R: Metric-guided Adversarial Sentence Generation

Large language models are a hot topic in AI research right now. But there’s a hotter, more significant problem looming: we might run out of data to train them on … as early as 2026. Kalyan Veeramachaneni and the team at MIT Data-to-AI Lab may have found the solution: in their new paper on Rewrite and Rollback (“R&R: Metric-Guided Adversarial Sentence Generation”), an R&R framework can tweak and turn low-quality data (from sources like Twitter and 4Chan) into high-quality data (texts from sources like Wikipedia and industry websites) by rewriting meaningful sentences, thereby adding to the amount of the right type of data to test and train language models on.

The Key Role Missing in Most Data Science Teams

In this contributed article, Wendy Lynch, Founder of, shares her experience of working with small to large global clients on how to break down the communication barriers in an organization to deliver results. These barriers often arise between the analyst teams and the business teams.

Stop Building Models, Start Training Data

In this special guest feature, Sanjay Pichaiah, VP of Product Growth at Akridata, highlights why it is time for data scientists to stop building models and start training data. The path to better models and greater model accuracy doesn’t lie exclusively with the model, even though that has been the greatest focus in recent years. To truly accelerate and increase model performance, we need to be focusing more on the training data sets we are supplying the models and stop hoping the data is good enough.

Heard on the Street – 11/29/2022

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace.

Chung-Ang University Researchers Develop Algorithm for Optimal Decision Making under Heavy-tailed Noisy Rewards

Researchers from South Korea’s Chung-Ang University propose methods that theoretically guarantee minimal loss in worst-case scenarios, with minimal prior information, for heavy-tailed reward distributions.

The Ultimate Guide for Computer Vision Deployment on NVIDIA Jetson

Our friends from the Deci Team offer “The Ultimate Guide for Computer Vision Deployment on NVIDIA Jetson,” which is perfect if you’re running or planning to run computer vision applications on NVIDIA Jetson devices. Written by Deci’s deep learning engineers for deep learning engineers, here’s what you’ll learn: (i) best practices and technical know-how on model selection, training, optimization and deployment on Jetson, (ii) tips for selecting the right NVIDIA Jetson for your use case, and (iii) code, tools, and further resources for every recommendation provided.

insideBIGDATA Latest News – 11/21/2022

In this regular column, we’ll bring you all the latest industry news centered around our main topics of focus: big data, data science, machine learning, AI, and deep learning. Our industry is constantly accelerating with new products and services being announced every day. Fortunately, we’re in close touch with vendors from this vast ecosystem, so we’re in a unique position to inform you about all that’s new and exciting. Our massive industry database is growing all the time, so stay tuned for the latest news items describing technology that may make you and your organization more competitive.

Machine Learning Career Path: Exploring Opportunities in 2022 and Beyond

In this special guest feature, George Tsagas, Owner of eMathZone, discusses how machine learning professionals can work as data scientists, computer engineers, robotics engineers, or managers. But if you want to make a career, the first step in finding opportunities in the field of machine learning is to understand the different types of jobs and skills needed.

“Above the Trend Line” – Your Industry Rumor Central for 11/16/2022

Above the Trend Line: your industry rumor central is a recurring feature of insideBIGDATA. In this column, we present a variety of short time-critical news items grouped by category such as M&A activity, people movements, funding news, financial results, industry alignments, customer wins, rumors and general scuttlebutt floating around the big data, data science and machine learning industries including behind-the-scenes anecdotes and curious buzz.