When Stanislav Dusko Ehrlich – a world expert in microbiology and a pioneer of metagenomics – and his team set out to create their next-generation biotech research platform, they needed a technology solution that could meet their stringent capacity and performance requirements for big data analytics.
Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds in the way the scientist is looking for is the key to the discovery process. For decades, high-performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate infeasible targets sooner.
“As InfiniBand is getting used in scientific computing environments, there is a big demand to harness its benefits for enterprise environments for handling big data and analytics. This talk will focus on high-performance and scalable designs of Hadoop using native RDMA support of InfiniBand and RoCE. Designs for various components in Hadoop (such as HDFS, MapReduce, RPC, and HBASE) and their benefits based on the RDMA package for Apache Hadoop will be presented. RDMA-based design for scalable Memcached (used in Web 2.0) and the associated benefits will be presented.”
“Splunk Enterprise is a platform for machine data. The technology delivers powerful and fast analytics to quickly unlock the value of machine data to IT and other users throughout an organization. In short, it’s a simple, effective way to collect, analyze and secure the massive streams of machine data generated by all IT systems and technology infrastructure.”
“Intel’s goal is to encourage more innovative and creative uses for data as well as to demonstrate how big data and analytics technologies are impacting many facets of our daily lives, including sports. For example, coaches and their staffs are using real-time statistics to adjust games on-the-fly and throughout the season. From intelligent cameras to wearable sensors, a massive amount of data is being produced that, if analyzed in real-time, can provide a significant competitive advantage. Intel is among those making big data technologies more affordable, available, and easier to use for everything from helping develop new scientific discoveries and business models to even gaining the upper hand on good-natured predictions of sporting events.”
“Active archives are ideal for organizations that face exponential data growth or regularly manage high-volume unstructured data or digital assets. Target markets include life sciences, media and entertainment, education, research, government, financial services, oil and gas, and telecommunications, as well as general IT organizations requiring online data archive options.”
“Deep storage, and tape library-based storage in general, benefit organizations that are looking to incorporate low-cost, high-density, scalable storage into their fast-growth data environments. Industries that recognize the value and regularly rely on tape storage include education, federal and state government, finance, life sciences, media and entertainment, oil and gas exploration, and Web 2.0, among others.”
We sat down with Cristian Borcea, PhD, of the New Jersey Institute of Technology to discuss IoT and Big Data applications. “New machine learning techniques could help us extract knowledge from these data – this happens especially for knowledge that we don’t expect and we don’t even know exists – we cannot search for something that we don’t know exists.”
We caught up with Mike Boros, Hadoop Product Manager at Cray, to learn about the company’s Big Data solutions. “I think you’ll see Cray continue to focus on Big & Fast, vs. just Big Data. Technologies like Hadoop make hosting large data sets easy. The challenge of getting value from that data set, after it’s large, is what we’re interested in.”