SnapLogic Democratizes Apache Kafka and Streaming Data Integration with Latest Release

SnapLogic, the unified data and application integration platform as a service (iPaaS), introduced the Spring 2016 release of its SnapLogic Elastic Integration Platform. The Spring release adds new capabilities for integrating streaming data and powering big data analytics in the cloud with support for Apache Kafka, Microsoft HDInsight, and Google Cloud Storage, plus numerous enhancements that automate data shaping and management tasks that are critical to transforming data into insights.

“SnapLogic’s platform is now processing over 100 billion JSON documents per month, delivering enterprise-scale data and application integration as a service to our customers,” said Vaikom Krishnan, vice president of engineering at SnapLogic. “The Spring 2016 release further expands our big data integration capabilities with advanced streaming capabilities that are well suited for Internet of Things and data lake use cases.”

Self-Service Integration for Streaming Data

Much of the data flowing into enterprise data lakes is high-throughput, real-time data from e-commerce transactions, website clickstreams, wearables and other Internet of Things sources. SnapLogic’s new intelligent connectors, called Snaps, for the Apache Kafka message broker:

  • make it simple to create low-latency big data pipelines without coding,
  • help to make Kafka enterprise-ready with pre-built Snaps for common data transformation operations plus connectors for 400+ endpoints,
  • can be used in conjunction with SnapLogic Ultra Pipelines, always-on data flows which receive input from a website or an application and return data to the requested endpoint at speeds up to 10x faster, making them ideal for IoT data flows.

“SnapLogic’s move to add support for Apache Kafka should provide value to its customers given the increasing demand for streaming data,” said Matt Aslett, research director, Data Platforms and Analytics at 451 Research. “The ability to create low-latency pipelines without coding, combined with pre-built Snaps for common transformations, should lower the barriers to ingesting streaming data from Kafka.”

Powering Analytics and Data Management in the Cloud

SnapLogic continues to build on its momentum in data management for both on-premises and cloud-based data lakes with new support for HDInsight, Microsoft’s cloud service for Hadoop and Spark. With SnapLogic, users can ingest data from virtually any source into an HDInsight cluster without scripting, then prepare and deliver timely and relevant data for analysis to business intelligence tools or off-cluster data stores.

SnapLogic’s flexibility to support any data, anywhere is strengthened in the Spring release with new support for Google Cloud Storage, which complements SnapLogic’s Snap for Google BigQuery.

Automating Data Preparation and Shaping

The SnapLogic Elastic Integration Platform Designer enables users to operationalize many of the data quality, preparation and transformation tasks required for analysis through automated tasks within visual data flow pipelines. Enhancements include the following.

  • Data Mapping Improvements: SmartLink helps simplify the mapping of data by suggesting field-to-field mapping. With this release, SmartLink has been updated with the ability to select between multiple algorithm options, including exact, case insensitive, fuzzy and history matching.
  • New Transformation Snaps for Spark: New Spark-compatible Snaps for common data preparation and shaping tasks include Join, JSON Parser, JSON Splitter, JSON Formatter, and Parquet Reader and Writer.
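
To illustrate what choosing between field-matching strategies can look like, here is a minimal Python sketch of exact, case-insensitive and fuzzy matching. It is not SnapLogic's SmartLink algorithm: the function name, the mode names and the similarity threshold are assumptions made for this example, and history matching is left out because it would depend on previously saved mappings.

```python
from difflib import SequenceMatcher


def suggest_mapping(source_fields, target_fields, mode="exact", threshold=0.8):
    """Suggest source -> target field pairs using one matching strategy.

    Hypothetical helper for illustration only; not part of any SnapLogic API.
    """
    suggestions = {}
    for src in source_fields:
        best, best_score = None, 0.0
        for tgt in target_fields:
            if mode == "exact":
                score = 1.0 if src == tgt else 0.0
            elif mode == "case_insensitive":
                score = 1.0 if src.lower() == tgt.lower() else 0.0
            elif mode == "fuzzy":
                # Similarity ratio in [0, 1] from the standard library.
                score = SequenceMatcher(None, src.lower(), tgt.lower()).ratio()
            else:
                raise ValueError(f"unknown mode: {mode}")
            if score > best_score:
                best, best_score = tgt, score
        # Only suggest a pairing when the best candidate is similar enough.
        if best is not None and best_score >= threshold:
            suggestions[src] = best
    return suggestions
```

With this sketch, a source field `Email` is not paired with a target field `email` in exact mode but is paired in case-insensitive mode, while fuzzy mode can also pair names such as `cust_name` and `customer_name` when the threshold allows it.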

Integration Governance and Automation

The Spring 2016 release also brings governance enhancements to the SnapLogic platform for improved management, control, and flexibility for enterprise environments. They include the new features listed below.

  • Snap Versioning: Greater flexibility for administrators to plan upgrades to their pipelines according to their business needs.
  • Platform Metadata Snap: Exposes SnapLogic platform metadata so that common activities, such as creating, deleting and mass updating elements like accounts and tasks, can be automated as part of a pipeline.
  • Pipeline Execute Snap: Spawns child pipelines that execute repetitive tasks in parallel, making pipeline execution many times faster.
  • Project Spaces: Groups of pipelines and tasks that allow administrators to organize complex environments and provide fine-grained access controls to all of these elements.
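
The parallel fan-out idea behind the Pipeline Execute Snap can be sketched in general terms as a parent that hands each input document to a child task and collects the results. The Python below is only an analogy under stated assumptions: `run_child` and `run_parent` are hypothetical names, the child logic is a placeholder calculation, and nothing here reflects how SnapLogic implements or schedules child pipelines.

```python
from concurrent.futures import ThreadPoolExecutor


def run_child(document):
    # Hypothetical stand-in for a child pipeline: processes one input document.
    return {"id": document["id"], "total": document["qty"] * document["price"]}


def run_parent(documents, max_workers=4):
    # Fan the documents out to child runs in parallel.
    # Executor.map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_child, documents))
```

How much faster such a fan-out runs depends on the workload; independent, I/O-bound child tasks tend to benefit most.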

“We are excited about the latest platform update for SnapLogic. The rich integration platform is being extended in this release to support more efficient DevOps and more logical project component organization,” said Jim Teal, senior cloud architect, iRobot. “This will help us to do more, and do it more efficiently with SnapLogic.”

From the Snap Labs: Containerized Integration Preview

SnapLogic has added a new capability currently under development that will “containerize” hybrid cloud and big data integration. Currently available via a customer preview program, this capability will allow customers to deploy a just-in-time Snaplex — the elastically-scalable data processing component of the SnapLogic platform — via a Docker container. These Snaplex containers can be deployed in any cloud environment that can host Docker containers, and can run in data centers running Docker Swarm, Kubernetes or Mesos. Using these containers it will be easy and quick to deploy and take down entire Snaplex clusters that efficiently utilize servers.

Expanded Library of Snaps and Updated Certifications

In addition to the new Kafka Snap, updates to the Spring 2016 release include a new Snap for Microsoft Azure SQL Bulk Load, and significant improvements to Snaps for Anaplan and NetSuite.

With the Spring 2016 release, the SnapLogic platform has also been updated to support Cloudera CDH 5.5 and Hortonworks 2.3.4.

