Elasticsearch, Inc., the company on a mission to make data useful to businesses by delivering an advanced search and analytics engine, has announced the 2.0 release of its Hadoop connector, Elasticsearch for Apache Hadoop, along with certification on Cloudera Enterprise 5. With Cloudera certification, Elasticsearch is now compatible with all Apache-based Hadoop distributions, including Hortonworks and MapR, helping businesses extract immediate insights regardless of where their hundreds of terabytes or even petabytes of data are stored.
Elasticsearch is the search and analytics engine behind the ELK stack, which also utilizes Logstash, a log management tool, and Kibana’s powerful data visualization capabilities to help businesses pull vital information from their data stores. When used in conjunction with Hadoop, organizations no longer need to run a batch process and wait hours to analyze their data – Elasticsearch for Apache Hadoop can pipe data to Elasticsearch for indexing as it’s being generated, making it available for search and analysis in a matter of seconds. Kibana can also be used to easily explore massive amounts of data in Elasticsearch through easy-to-generate pie charts, bar graphs, scatter plots, histograms, and more.
“Hadoop was created to store and archive data at a massive scale, but businesses need to be able to ask, iterate, and extract actionable insights from this data – which is what we designed our products for,” commented Steven Schuurman, co-founder and CEO, Elasticsearch. “With today’s certification from Cloudera, Elasticsearch now works with all Apache-based Hadoop distributions, and with it, solves the last mile of big data Hadoop deployments by getting big insights, fast.”
How Businesses Leverage Elasticsearch and Hadoop
Elasticsearch is becoming a critical piece in pulling data from any environment and getting it into the hands of developers, engineering leads, CTOs, and CIOs who need insight into the moving parts of their business as events happen. Customer examples include:
- Klout, which connects the petabytes of data stored in a Hadoop Distributed File System on their 400 million+ users to Elasticsearch. Query results, used to build targeted marketing campaigns, are delivered in seconds rather than minutes.
- MutualMind, which enables customers like AT&T, Kraft, Nestle, and Starbucks to monitor their brands on social networks. After its Hadoop batches started taking 15+ minutes, MutualMind moved to Elasticsearch to power their real-time analytics, while utilizing Hadoop for statistical analysis.
- An international financial services company that started using Elasticsearch to analyze its access logs in minutes so it didn’t have to wait hours to run MapReduce jobs; it was even able to expand the window of data it analyzed from an hour to a full week, as Elasticsearch provided insights so quickly on large amounts of data.
Key Features of Elasticsearch for Apache Hadoop
- The ability to read and write data between Hadoop and Elasticsearch: Allows data to be written from Hadoop to Elasticsearch for real-time search and analytics. Jobs that would take minutes or hours can be read back to Hadoop in minutes.
- Native integration and support for popular Hadoop libraries: Lets users run queries natively on Hadoop through MapReduce, Hive, Pig, or Cascading APIs.
- Snapshot/Restore: Makes it easy to take a snapshot of data within Elasticsearch – perhaps a year’s worth – and archive it in Hadoop. At any time, the snapshot can be restored back to Elasticsearch for additional analysis.
“Part of our mission at Cloudera is to support and promote an open architecture and allow customers to leverage their technology investments,” commented Tim Stevens, vice president of Business and Corporate Development at Cloudera. “Together, Cloudera and Elasticsearch provide businesses with a solution that allows them to get insight out of massive amounts of data.”
Because Elasticsearch works across distributed, diverse environments, engineers can search, extract, clean up and analyze data whether it comes from log events, social media activity, support tickets, website analytics or product interactions. Thousands of businesses worldwide continue to adopt Elasticsearch to store, search and analyze any type of data in real time, including Bloomberg, Comcast, eBay, Facebook, GitHub, Mayo Clinic, McGraw-Hill, Netflix, The New York Times, Target, Verizon, WordPress and Yelp.