
dbt Labs Report – Opportunities and Challenges for Analytics Engineers

The practice of analytics engineering (popularized by dbt Labs) took the data world by storm last year, changing how data professionals work. Today, dbt Labs officially launched its inaugural State of Analytics Engineering report. The report assesses the analytics engineering practice and gathers insights from those actively involved in the day-to-day work of data.

O’Reilly 2023 Tech Trends Report Reveals Growing Interest in Artificial Intelligence Topics, Driven by Generative AI Advancement

O’Reilly, a premier source for insight-driven learning on technology and business, announced the findings of its annual Technology Trends for 2023 report, which examines the most sought-after technology topics consumed by the 2.8 million users on O’Reilly’s online learning platform.

Research Highlights: Real or Fake Text? We Can Learn to Spot the Difference

A team of researchers at the University of Pennsylvania School of Engineering and Applied Science is seeking to empower tech users to mitigate the risks of AI-generated misinformation. In a peer-reviewed paper presented at the February 2023 meeting of the Association for the Advancement of Artificial Intelligence, the authors demonstrate that people can learn to spot the difference between machine-generated and human-written text.

ClearML Study: Friction a Key Challenge for MLOps Tools

ClearML, the open source, end-to-end MLOps platform, released the final set of data to complete its recently released research report, MLOps in 2023: What Does the Future Hold? Polling 200 U.S.-based machine learning decision makers, the report examines key trends, opportunities, and challenges in machine learning and MLOps.

Data Science Bows Before Prompt Engineering and Few Shot Learning 

In this contributed article, editorial consultant Jelani Harper takes a new look at the GPT phenomenon by exploring how prompt engineering (including prompt stores and databases) coupled with few-shot learning can constitute a significant adjunct to traditional data science.
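As a rough illustration of the few-shot idea the article discusses, a prompt can be assembled from a handful of stored, labeled examples placed ahead of the new query. The names below (`EXAMPLES`, `build_prompt`) are hypothetical and not from the article; this is a minimal sketch of prompt assembly, not any particular product's API.

```python
# Hypothetical sketch: building a few-shot prompt from a small "prompt store".
EXAMPLES = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]

def build_prompt(examples, query):
    """Concatenate labeled demonstrations ahead of the new query so the
    model can infer the task from a handful of examples."""
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(EXAMPLES, "Great service and friendly staff.")
print(prompt)
```

The resulting string would typically be sent to a language model, which completes the final `Sentiment:` line by analogy with the demonstrations.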

Research Highlights: SparseGPT: Prune LLMs Accurately in One-Shot

A new research paper shows that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models.
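To make "50% unstructured sparsity" concrete: it means zeroing half of a layer's weights. SparseGPT itself uses an approximate second-order reconstruction procedure to choose which weights to drop; the sketch below substitutes plain magnitude pruning purely to illustrate the notion of sparsity, and its function name is hypothetical.

```python
import numpy as np

def prune_to_sparsity(weights, sparsity=0.5):
    """Illustrative stand-in (NOT the SparseGPT algorithm): zero out the
    smallest-magnitude fraction of weights in a matrix."""
    flat = np.abs(weights).flatten()
    k = int(len(flat) * sparsity)
    threshold = np.partition(flat, k)[k]  # magnitude at the sparsity cutoff
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W_sparse = prune_to_sparsity(W, 0.5)
print(np.mean(W_sparse == 0))  # roughly half the entries are now zero
```

The contribution of SparseGPT is choosing the pruning mask and updating the remaining weights so that accuracy survives at this sparsity level even for billion-parameter models, with no retraining.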

AI from a Psychologist’s Point of View

Researchers at the Max Planck Institute for Biological Cybernetics in Tübingen have examined the general intelligence of the language model GPT-3, a powerful AI tool. Using psychological tests, they studied competencies such as causal reasoning and deliberation, and compared the results with the abilities of humans. Their findings, published in the paper “Using cognitive psychology to understand GPT-3,” paint a heterogeneous picture: while GPT-3 can keep up with humans in some areas, it falls behind in others, probably due to a lack of interaction with the real world.

Research Highlights: A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A pretrained foundation model, such as BERT, GPT-3, MAE, DALL-E, or ChatGPT, is trained on large-scale data, which provides a reasonable parameter initialization for a wide range of downstream applications.

Google Cloud Unveils Its 2023 Data and AI Trends Report

Google Cloud worked with IDC on multiple studies involving global organizations across industries in order to explore how data leaders are successfully addressing key data and AI challenges. The company compiled the results in its 2023 Data and AI Trends report. In it, you’ll find the metrics-rich research behind the top five data and AI trends, along with tips and customer examples for incorporating them into your plans. 

Research Highlights: MIT Develops First Generative Model for Anomaly Detection that Combines both Reconstruction-based and Prediction-based Models

Kalyan Veeramachaneni and his team at the MIT Data-to-AI (DAI) Lab have developed the AutoEncoder with Regression (AER), the first generative model for time series anomaly detection that combines both reconstruction-based and prediction-based models. They have been building it for three years; AER has been learning and extracting intelligence from signals and has matured to the point of significantly outperforming the market’s leading models.
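The general idea of combining the two model families can be sketched as follows: a reconstruction-based model scores each timestep by how poorly it can be reconstructed, a prediction-based model by how far the next value deviates from its forecast, and the two error signals are blended into a single anomaly score. This is a hedged illustration of that combination, not the AER architecture itself; the function name and the weighted-blend rule are assumptions for illustration.

```python
import numpy as np

def anomaly_scores(recon_err, pred_err, alpha=0.5):
    """Illustrative combination (not AER itself): blend z-normalized
    reconstruction and prediction errors into one score per timestep."""
    def z(x):
        return (x - x.mean()) / (x.std() + 1e-8)
    return alpha * z(recon_err) + (1 - alpha) * z(pred_err)

# Toy per-timestep errors from two hypothetical models; timestep 3 is
# anomalous according to both.
recon = np.array([0.1, 0.1, 0.2, 2.5, 0.1])
pred = np.array([0.2, 0.1, 0.1, 3.0, 0.2])
scores = anomaly_scores(recon, pred)
print(int(np.argmax(scores)))  # timestep 3 has the highest combined score
```

Blending the two signals is attractive because reconstruction errors tend to localize anomalies well while prediction errors catch departures from expected dynamics; a point flagged by both is a stronger candidate than one flagged by either alone.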