Primary Motivators of Big Data vis-à-vis Scientific Research

This article is the second in an editorial series with a goal to provide a road map for scientific researchers wishing to capitalize on the rapid growth of big data technology for collecting, transforming, analyzing, and visualizing large scientific data sets.

insideBIGDATA Guide to Scientific Research

In this new insideBIGDATA Guide to Scientific Research, the goal is to provide a road map for scientific researchers wishing to capitalize on the rapid growth of big data technology for collecting, transforming, analyzing, and visualizing large scientific data sets.

DataFest Competition Brings Big Data to College Students

Data Science

Students from more than 20 prestigious colleges and universities recently tried their hand at “Big Data” analysis at seven different campuses around the country during DataFest, an annual month-long data-analytics competitive event sponsored by the American Statistical Association.

Statistics is the Fastest Growing Undergraduate STEM Degree

Statistics, the science of learning from data, has been the fastest-growing science, technology, engineering and math (STEM) undergraduate degree in the United States over the last four years, according to an analysis of federal government education data conducted by the American Statistical Association (ASA).

The Major Roadblocks Facing the Smart City

In this special guest feature, Cristian Borcea of NJIT reflects on the evolution of technology and public policy in support of so-called “smart cities.” Cristian Borcea is an Associate Professor and the Associate Chair of the Department of Computer Science at New Jersey Institute of Technology.

Performance Optimization of Hadoop Using InfiniBand RDMA

DK Panda

“The Hadoop framework has become the most popular open-source solution for Big Data processing. Traditionally, Hadoop communication calls are implemented over sockets and do not deliver best performance on modern clusters with high-performance interconnects. This talk will examine opportunities and challenges in optimizing performance of Hadoop with Remote DMA (RDMA) support, as available with InfiniBand, RoCE (RDMA over Converged Enhanced Ethernet) and other modern interconnects.”

The useR!2014 Conference in Review

FIELD REPORT: Last week I attended the long-anticipated useR!2014 international conference at the UCLA campus, my alma mater. The four-day event had something for everyone in attendance – all the brain cycles centered around the use of the R statistical environment. Since R is a primary tool for my work in data science and […]

Big Data Capacity and Performance Supports Genome Analytics

When Stanislav Dusko Ehrlich – a world expert in microbiology and a pioneer of metagenomics – and his team set out to create their next generation biotech research platform, they needed a technology solution to support their stringent capacity and performance requirements for big data analytics.

New Market Dynamics Report: HPC Life Sciences

HPC Life Sciences

Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds in the way the scientist is looking for is the key to the discovery process. For decades, high performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate infeasible targets sooner.

DK Panda Presents: Big Data – Hadoop and Memcached

DK Panda

“As InfiniBand is getting used in scientific computing environments, there is a big demand to harness its benefits for enterprise environments for handling big data and analytics. This talk will focus on high-performance and scalable designs of Hadoop using native RDMA support of InfiniBand and RoCE. Designs for various components in Hadoop (such as HDFS, MapReduce, RPC, and HBASE) and their benefits based on the RDMA package for Apache Hadoop will be presented. RDMA-based design for scalable Memcached (used in Web 2.0) and the associated benefits will be presented.”