Better, Faster Graph Processing

A team from MIT CSAIL has developed a new programming language for graph processing. Dubbed “GraphIt,” the new domain-specific language has been shown to outperform existing frameworks by a factor of nearly 5x while also reducing the lines of code by almost an order of magnitude.

Big Data Made Simple

The animation below from our friends over at WHISHWORKS explains in simple terms what Big Data is and when it’s time for a company to consider moving to a Big Data environment.

Data Science at Microsoft – Interviews with Practitioners

In this technical brief, I wanted to pass along some great resources that show how data scientists approach their profession and illustrate the kind of background a typical data scientist might need to become successful. insideBIGDATA previously featured four compelling podcast interviews with Microsoft data scientists.

State of the Art Natural Language Processing at Scale

The two-part presentation below from the Spark+AI Summit 2018 is a deep dive into key design choices made in the NLP library for Apache Spark. The library natively extends the Spark ML pipeline APIs, which enables zero-copy, distributed, combined NLP, ML & DL pipelines that leverage all of Spark’s built-in optimizations.

Building Neural Network Models That Can Reason

In this lecture, Christopher Manning, Thomas M. Siebel Professor in Machine Learning and Professor of Linguistics and of Computer Science at Stanford University, presents “Building Neural Network Models That Can Reason.” Deep learning has had enormous success on perceptual tasks but still struggles to provide a model for inference. To address this gap, the presenter has been developing Memory-Attention-Composition networks (MACnets).

Reinforcement Learning: Hidden Theory and New Super-Fast Algorithms

Stochastic Approximation algorithms are used to approximate solutions to fixed-point equations that involve expectations of functions with respect to possibly unknown distributions. The most famous examples today are TD- and Q-learning algorithms. This three-hour tutorial lecture series, courtesy of the Simons Institute for the Theory of Computing at UC Berkeley, consists of two video segments.
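
For readers who want a concrete picture of the idea, here is a minimal sketch of a classic Robbins-Monro style iteration. It is not taken from the lecture; the function names, the toy map 0.5 * x + W, and the Gaussian noise are all assumptions chosen for illustration. The goal is to solve x = E[g(x, W)] using only noisy samples of g, without knowing the distribution of W.

```python
import random


def stochastic_approximation(noisy_map, x0, num_steps, rng):
    """Approximate a solution of x = E[noisy_map(x, W)] from samples alone.

    Uses the diminishing step size 1/n, a common textbook choice.
    """
    x = x0
    for n in range(1, num_steps + 1):
        step = 1.0 / n
        # Move x a small way toward the noisy evaluation of the map.
        x += step * (noisy_map(x, rng) - x)
    return x


def noisy_map(x, rng):
    # Hypothetical example: g(x, W) = 0.5 * x + W with W ~ Normal(1, 1).
    # Since E[W] = 1, the fixed point of x = 0.5 * x + 1 is x = 2.
    return 0.5 * x + rng.gauss(1.0, 1.0)


rng = random.Random(0)  # seeded so that runs are repeatable
estimate = stochastic_approximation(noisy_map, x0=0.0, num_steps=200_000, rng=rng)
print(f"estimated fixed point: {estimate:.3f} (exact value is 2)")
```

The algorithm never sees the mean of W directly; it only averages out the noise over many small steps. TD- and Q-learning follow the same pattern, with the noisy map built from sampled rewards and state transitions.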

GTC Interview: AI Ready Solutions from Dell EMC

In this video from the GPU Technology Conference, Kash Shaikh from Dell EMC describes the company’s new AI Ready Solutions. “Dell EMC is at the forefront of AI, providing the technology that makes tomorrow possible, today. Dell EMC uniquely provides an extensive portfolio of technologies — spanning workstations, servers, networking, storage, software and services — to create the high-performance computing and data analytics solutions that underpin successful AI, machine and deep learning implementations.”

The Future of Cognitive Computing

In the video presentation below, Dr. John Kelly III, Senior Vice President, IBM Research and Solutions Portfolio, discusses the future of cognitive computing. Dr. Kelly is focused on the company’s investments in several of the fastest-growing and most strategic parts of the information technology market, including IBM Watson.

Big Data Meets HPC – Exploiting HPC Technologies for Accelerating Big Data Processing

DK Panda from Ohio State University gave this talk at the Stanford HPC Conference. “This talk will provide an overview of challenges in accelerating Hadoop, Spark and Memcached on modern HPC clusters. An overview of RDMA-based designs for Hadoop (HDFS, MapReduce, RPC and HBase), Spark, Memcached, Swift, and Kafka using native RDMA support for InfiniBand and RoCE will be presented.”

Analytics Development Life Cycle: Pangea is Panacea

Sai Prakash from HCL America gave this talk at the Stanford HPC Conference. “In this short talk we shall present an analytics workbench perspective (Pangea) that brings entire ADLC under single umbrella thus enabling collaboration, shrinking overall cycle time, easing model deployment efforts and allowing model monitoring. Actionable insights and visualizations are facilitated through service integration interfaces.”