Research Highlights: SparseGPT: Prune LLMs Accurately in One-Shot

A new research paper shows that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models.
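
SparseGPT itself uses an approximate sparse regression solver that reconstructs the remaining weights layer by layer, so the sketch below is not the paper's method; it is only a minimal PyTorch illustration, using plain magnitude pruning as a stand-in, of what one-shot 50% unstructured sparsity means for a single weight matrix.

```python
# Illustration only: one-shot 50% unstructured sparsity via magnitude pruning.
# SparseGPT's actual algorithm additionally updates the surviving weights;
# this sketch just shows what a 50%-sparse weight matrix looks like.
import torch

def prune_to_sparsity(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so `sparsity` fraction are zero."""
    k = int(weight.numel() * sparsity)                 # number of weights to drop
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold                    # keep larger-magnitude weights
    return weight * mask

layer = torch.nn.Linear(4096, 4096)                    # hypothetical transformer projection
with torch.no_grad():
    layer.weight.copy_(prune_to_sparsity(layer.weight, 0.5))

print(f"sparsity: {(layer.weight == 0).float().mean():.2f}")  # ~0.50
```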

AI from a Psychologist’s Point of View

Researchers at the Max Planck Institute for Biological Cybernetics in Tübingen have examined the general intelligence of the language model GPT-3, a powerful AI tool. Using psychological tests, they studied competencies such as causal reasoning and deliberation, and compared the results with the abilities of humans. Their findings, published in the paper “Using cognitive psychology to understand GPT-3,” paint a heterogeneous picture: while GPT-3 can keep up with humans in some areas, it falls behind in others, probably due to a lack of interaction with the real world.

Heard on the Street – 3/1/2023

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace.

“Above the Trend Line” – Your Industry Rumor Central for 2/28/2023

Above the Trend Line: your industry rumor central is a recurring feature of insideBIGDATA. In this column, we present a variety of short, time-critical news items grouped by category, such as M&A activity, people movements, funding news, financial results, industry alignments, customer wins, rumors and general scuttlebutt floating around the big data, data science and machine learning industries, including behind-the-scenes anecdotes and curious buzz.

Data Science 101: The Data Science Process

Welcome to insideBIGDATA’s Data Science 101 channel, bringing you perspectives on the topics of the day in data science, machine learning, AI and deep learning. Many of the video presentations come from my lectures for the Introduction to Data Science class I teach at UCLA Extension. In today’s slide-based video presentation I discuss The Data Science Process, an overview of the steps data scientists follow when solving problems with data science and machine learning technologies.
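
For readers who prefer code to slides, here is a minimal Python sketch of the stages the process typically covers (acquire, prepare, explore, model, evaluate); the file name and column names are hypothetical placeholders, not part of the lecture material.

```python
# A compressed walk through the data science process:
# acquire -> prepare -> explore -> model -> evaluate.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Acquire: load raw data (hypothetical file and columns).
df = pd.read_csv("customers.csv")

# 2. Prepare: handle missing values and select features/target.
df = df.dropna(subset=["age", "income", "churned"])
X = df[["age", "income"]]
y = df["churned"]

# 3. Explore: quick summary statistics before modeling.
print(df.describe())

# 4. Model: fit a baseline classifier on a training split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# 5. Evaluate: score on held-out data, then iterate on the earlier steps.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```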

Research Highlights: A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A pretrained foundation model, such as BERT, GPT-3, MAE, DALL-E, or ChatGPT, is trained on large-scale data and provides a reasonable parameter initialization for a wide range of downstream applications.
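
As a concrete, if minimal, illustration of the "pretrained initialization for downstream tasks" idea, the Hugging Face transformers library lets you load a BERT-family checkpoint and reuse it for a classification task with a few lines of code; the checkpoint named below is just one publicly available example, not one from the survey.

```python
# Minimal sketch: reuse a pretrained foundation model for a downstream task.
# The checkpoint name is one public example; any BERT-style model works similarly.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Pretrained foundation models transfer surprisingly well."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```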

Book Review: Tree-based Methods for Statistical Learning in R

Here’s a new title that is a “must have” for any data scientist who uses the R language. It’s a wonderful learning resource for tree-based techniques in statistical learning, one that’s become my go-to text when I find the need to do a deep dive into various ML topic areas for my work. The methods […]

Heard on the Street – 2/21/2023

Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace.

Research Highlights: MIT Develops First Generative Model for Anomaly Detection that Combines both Reconstruction-based and Prediction-based Models

Kalyan Veeramachaneni and his team at the MIT Data-to-AI (DAI) Lab have developed the first generative model for time series anomaly detection that combines both reconstruction-based and prediction-based models: the AutoEncoder with Regression (AER). They have been building it for three years; AER learns and extracts intelligence from signals and has matured to the point of significantly outperforming the market’s leading models.
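
The paper details how the two signal types are actually fused in AER; as a rough illustration of the general idea only (not the DAI Lab's implementation), an anomaly score can be formed by blending a reconstruction error with a prediction error, as in this sketch.

```python
# Illustration only: combining reconstruction-based and prediction-based errors
# into a single anomaly score for a time series. Not the actual AER model.
import numpy as np

def anomaly_scores(signal, reconstruction, prediction, alpha=0.5):
    """Blend pointwise reconstruction and prediction errors (alpha is a weight)."""
    rec_err = np.abs(signal - reconstruction)   # how badly an autoencoder rebuilds the signal
    pred_err = np.abs(signal - prediction)      # how badly a forecaster predicts it
    return alpha * rec_err + (1 - alpha) * pred_err

# Toy example with a synthetic spike injected at t = 50.
t = np.arange(100)
signal = np.sin(t / 5.0)
signal[50] += 3.0                               # injected anomaly
reconstruction = np.sin(t / 5.0)                # stand-ins for model outputs
prediction = np.sin(t / 5.0)

scores = anomaly_scores(signal, reconstruction, prediction)
print("most anomalous index:", int(np.argmax(scores)))   # -> 50
```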

AI and Big Data Expo North America Tickets are Now Live 

Plan to attend the AI and Big Data Expo North America, May 17-18, 2023, in the heart of Silicon Valley at the San Jose Convention Center. This in-person event delivers AI & Big Data for a smarter future.