Models for Thinking: An Example of Why Data Sciences Increasingly Need the Humanities

Parsing large-scale data sets – classifying genomic sequences, mapping forms of advertisement, observing online discussions, etc. – is a matter of organization: how do you make sense of, and classify, these clusters of information? The answer, often, is to configure them into abstract but coherent topics.
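As a minimal sketch of this kind of topic-based organization (scikit-learn and Latent Dirichlet Allocation are illustrative assumptions here; the article names no specific library or algorithm):

```python
# Minimal topic-modeling sketch using Latent Dirichlet Allocation (LDA).
# scikit-learn is an assumed stand-in; the article names no specific tool.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "gene sequence expression genome",
    "ad campaign click impression",
    "forum thread reply moderation",
    "genome variant sequencing read",
]

# Turn raw text into a document-term count matrix.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(documents)

# Fit LDA to group co-occurring terms into abstract "topics".
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Print the top terms per topic -- the "abstract but coherent" labels.
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-3:][::-1]]
    print(f"topic {i}: {top}")
```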

Why You Need a Modern Infrastructure to Accelerate AI and ML Workloads

Recent years have seen a boom in the generation of data from a variety of sources: connected devices, IoT sensors, analytics, healthcare, smartphones, and much more. The resulting data management problem is particularly acute for Artificial Intelligence (AI) and Machine Learning (ML) workloads. This guest article from WekaIO highlights how optimizing infrastructure can accelerate machine learning workloads and drive AI success.

Using Converged HPC Clusters to Combine HPC, AI, and HPDA Workloads

Many organizations still adopt HPC, AI, and HPDA as distinct entities, which leads to underutilization of their clusters. To avoid this, the workloads can be converged onto shared clusters, reducing (or potentially eliminating) capital expenditures and lowering operating costs. This sponsored post from Intel’s Esther Baldwin, AI Strategist, explores how organizations are using converged HPC clusters to combine HPC, AI, and HPDA workloads.

Machine Learning Beyond Predefined Recipes

While machine learning has enabled massive advancements across industries, it requires significant development and maintenance effort from data science teams. The next evolution in machine intelligence is automating the creation of machine learning models so that they do not follow predefined formulas, but rather adapt and evolve according to the problem’s data. Enter Darwin, a machine learning tool that automates the building and deployment of models at scale.
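Darwin’s internals are not described here, but the core idea of automated model building (searching over candidates instead of hand-picking one) can be sketched briefly; scikit-learn and the tiny candidate list below are illustrative assumptions, not Darwin’s API:

```python
# Illustrative AutoML-style model selection; this is NOT Darwin's API,
# just a minimal sketch of letting the data pick the model.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A tiny, assumed search space of candidate models.
candidates = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=100, random_state=0),
    SVC(kernel="rbf"),
]

# Cross-validate each candidate and keep the best performer, so the
# choice adapts to the problem's data rather than a predefined recipe.
best = max(candidates, key=lambda m: cross_val_score(m, X, y, cv=5).mean())
print("selected:", type(best).__name__)
```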

Building Fast Data Compression Code for Cloud and Edge Applications

Finding efficient ways to compress and decompress data is more important than ever. Compressed data takes up less space and requires less time and network bandwidth to transfer. In this article, we’ll discuss the data compression functions and the latest improvements in the Intel® Integrated Performance Primitives (Intel® IPP) library.
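Intel IPP exposes its compression functions as C APIs (including IPP-optimized, drop-in builds of zlib); the sketch below uses Python’s built-in zlib module only to illustrate the basic compress/decompress round trip, not the IPP interface itself:

```python
# Minimal lossless compression round trip using Python's built-in zlib.
# This illustrates the workflow the article discusses; Intel IPP's own
# functions are C APIs and are not shown here.
import zlib

payload = b"sensor reading 42, sensor reading 42, " * 200

# Compress at level 6, zlib's default speed/ratio trade-off.
compressed = zlib.compress(payload, level=6)

# Decompress and verify the round trip is lossless.
restored = zlib.decompress(compressed)
assert restored == payload

ratio = len(compressed) / len(payload)
print(f"{len(payload)} -> {len(compressed)} bytes (ratio {ratio:.3f})")
```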

Solutions for Autonomous Driving – From Car to Cloud

From car to cloud – and the connectivity in between – there is a need for automated driving solutions that include high-performance platforms, software development tools, and robust technologies for the data center. With Intel GO automated driving solutions, Intel brings its deep expertise in computing, connectivity, and the cloud to the automotive industry.

The Importance of Vectorization Resurfaces

Vectorization offers potential speedups in codes with significant array-based computations—speedups that amplify the improved performance obtained through higher-level, parallel computations using threads and distributed execution on clusters. Key features for vectorization include tunable array sizes to reflect various processor cache and instruction capabilities and stride-1 accesses within inner loops.
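The article’s focus is compiler-level SIMD vectorization in compiled code, but the value of stride-1 (unit-stride) inner-loop access can be demonstrated in any array language; the NumPy timing sketch below (array size and stride are arbitrary choices) compares summing contiguous versus strided data:

```python
# Why stride-1 access matters: a contiguous sum streams through cache
# and vectorizes cleanly, while a strided sum wastes most of each
# cache line. NumPy stands in here for compiled C/Fortran loops.
import timeit
import numpy as np

n, stride = 1 << 20, 8
a = np.random.rand(n * stride)

contiguous = a[:n]      # unit-stride view: n consecutive elements
strided = a[::stride]   # n elements again, but touched with stride 8

t_contig = timeit.timeit(lambda: contiguous.sum(), number=100)
t_stride = timeit.timeit(lambda: strided.sum(), number=100)
print(f"contiguous: {t_contig:.4f}s  strided: {t_stride:.4f}s")
```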

Case Study: More Efficient Numerical Simulation in Astrophysics

Novosibirsk State University is one of the major research and educational centers in Russia and one of the largest universities in Siberia. When researchers at the University were looking to develop and optimize a software tool for numerical simulation of magnetohydrodynamics (MHD) problems with hydrogen ionization – part of an astrophysical objects simulation (AstroPhi) project – they needed to optimize the tool’s performance on Intel® Xeon Phi™ processor-based hardware.

Big Data Project Failure Pain Points and their Solution

Big data projects don’t typically fail for a single reason, and certainly not for technology alone; a combination of factors serves to derail big data deployments. Problems and failures arise from business strategy, people, culture, and inattention to analytics details or the nuances of the implemented tools, all intensified by the rapid pace of digital transformation.

Moving Big Data into the Cloud

While there are multiple architectures targeting big data analytics, the traditional focus has been on dedicated hardware and maximizing raw processing power. In the past few years, however, there has been a fast-growing convergence of big data and cloud, particularly where the data sets are unstructured with simple data models – an area of specific focus for the Apache Hadoop technology.