New Syncsort Big Data Software Removes Barriers to Mainstream Apache Hadoop Adoption


Strata + Hadoop World News

Syncsort, a global leader in Big Data software, announced the biggest, most comprehensive release of its award-winning data integration product suite, including new technology that makes it easier for customers to adopt Apache Hadoop.

This release introduces a design approach that will support multiple compute frameworks, de-coupling the user experience from the underlying Hadoop processing paradigms, including MapReduce, Apache Spark, and Apache Tez. The unique architectural approach will “future-proof” the process of collecting, blending, transforming, and distributing data, providing a consistent user experience while still taking advantage of the powerful native performance of the rapidly evolving compute frameworks that run on Hadoop.

“Our research shows that the most common workloads being shifted to Hadoop are large-scale data transformations,” said Jeff Kelly, Principal Research Contributor, the Wikibon Project. “Syncsort continues to make waves in the Big Data ecosystem by innovating easier, more effective ways to create these transformations in Hadoop and to move expensive enterprise data warehouse and mainframe workloads across emerging Hadoop frameworks.”

The release includes a new “Intelligent Execution Layer” that allows users to visually design data transformations once and then run them anywhere – across Hadoop, Linux, Windows, or Unix, on premises or in the cloud – while maintaining the performance of a native implementation. The architecture includes dozens of special-purpose algorithms and an advanced optimizer that automatically selects the ideal execution path to process jobs based on the underlying compute frameworks available, the characteristics of the data set, and Hadoop cluster conditions.

“Many organizations are looking to liberate their legacy data and budgets by shifting workloads from enterprise data warehouses and mainframes into Hadoop, but they are challenged by the complexities of the rapidly-improving Hadoop stack,” said Tendu Yogurtcu, General Manager of Syncsort’s Big Data business. “This new release makes it extremely easy to shift sophisticated data flows to Hadoop and to create new data transformations, taking advantage of the Hadoop-powered compute paradigms as they evolve.”

Highlights of the new release include the ability for users to:

  • Avoid application obsolescence by deploying and running the same highly efficient data flows on or off Hadoop, on premises or in the cloud
  • Isolate the transformation logic from the underlying complexities of Hadoop using a new Intelligent Execution Layer that will allow Syncsort to deliver native support for multiple compute frameworks such as Apache Spark and Tez
  • Leverage best-in-class, one-step data ingestion capabilities for Hadoop – ingesting data directly into Big Data formats such as Avro and Parquet without the need for staging
  • Load Apache Spark engines with legacy mainframe data sets, including VSAM and binary sequential files with COBOL copybook metadata – all via a new, Cloudera-certified Apache Spark mainframe connector from Syncsort
  • Support governance initiatives with advanced metadata management and data lineage, including HCatalog support
  • Utilize best-in-class data discovery, analysis and visualization with the new Syncsort QlikView QVX Connector and the Tableau Data Extract Connector
  • Achieve high-performance parallel data loads to Hive, Vertica, and Greenplum
  • Support NoSQL data stores such as Apache Cassandra, HBase and MongoDB
  • Monitor and manage Apache Hadoop transformations with customized dashboards based on operational metadata and RESTful APIs shipped in Docker containers

Experian, the largest credit bureau and a company that is focused on bringing data and insights together to help businesses and consumers alike, is one client who has adopted the product.

“Experian continually evolves our big data integration solutions to bring more data into our analytics solutions for better business insights and to help our clients make better decisions,” said Tom Thomas, IT senior director for Consumer Information Services at Experian. “The flexibility of platform offered by the new release of DMX-h has allowed us to move processes from months to hours, opening time up for the lead developers to research additional effective uses of our Hadoop environment, including moving additional products onto Hadoop.”

The new release will be generally available this month.

 
