Cask, the company that makes building and running big data solutions easy, announced a public preview release of CDAP 4, the first unified integration platform for big data. Among a series of important enhancements, CDAP 4 introduces the Cask Market, a new “big data app store” which enables developers, data scientists and citizen integrators to quickly build and deploy applications, data pipelines, plug-ins, and use case recipes on Hadoop and Spark with the click of a button. CDAP 4 Preview, along with updates to its self-service extensions, Cask Hydrator and Cask Tracker, is 100% open source and will be available for download on the Cask website.
“Nearly five years after our founding, and nine years after my initial foray into Hadoop, CDAP 4 represents everything the Cask team has learned working closely with our customers, our partners, and the community,” said Jonathan Gray, founder and CEO at Cask. “We have always been focused on making big data easier, on letting users focus more on the fun and productive stuff and taking away as much of the pain and plumbing as possible. CDAP 4 delivers on the original vision we had of a big data app store, but our customers helped us figure out what that actually meant.”
Previous versions of the Cask Data Application Platform (CDAP) initially focused on big data application management, driving efforts to standardize and pre-integrate Hadoop infrastructure, while providing metrics and logs as well as a complete testing and debugging environment for distributed applications. In subsequent releases, the need for data and process consistency across environments led to the design of a dedicated application and data integration solution, adding data ingestion, data pipelines as well as workflows and metadata to CDAP. CDAP 4 is a truly unified integration platform for big data, which enables enterprise IT to deliver a well-governed, self-service data environment for citizen integrators and line of business users, significantly accelerating time to value from Hadoop.
New features in CDAP 4 include:
- Cask Market, “Big Data App Store”: Cask Market is the first 100% open source app store for big data. By letting users download pre-built Hadoop solutions and reusable templates, it helps them get started and be productive with Hadoop and Spark in minutes rather than days or weeks. Users can access Cask Market from anywhere in the CDAP UI and instantly deploy applications, such as Customer 360 and Network Analytics; pre-built pipelines, such as S3 to HDFS and ADLS to HDFS; and plug-ins, such as a Postgres Driver and a Cassandra Driver.
- New, reimagined CDAP user interface: CDAP 4 introduces a completely new, expanded and redesigned user interface built with React.
- Data wrangling: Users can build and modify schemas dynamically through an interactive user interface operating against real data, such as parsing CSV files.
- Platform enhancements for production: Expanded lineage capabilities to capture schema changes and upgrades over time, improved control and scalability for transactions, application and service versioning and upgrades, and a reliable, dedicated high-performance platform-level messaging service.
- More plug-ins in Cask Hydrator: HBase Export Source, Oracle Dump Source, DB2 Dump Source, Netezza Dump Source, AWS Redshift and Kinesis, and more.
“Getting Hadoop into production and deriving business value has been a slow and painful process in the past, largely due to the proliferation of projects and APIs, the divergence of distributions and technologies, and the integration silos created by the few tools that exist,” said Holger Mueller, Principal Analyst and VP, Constellation Research. “A faster path for building next generation applications on top of Hadoop is when vendors provide pre-built applications and pipelines for many of the common use cases we see, such as Customer 360 and EDW Offloading, promising to drastically reduce the time it takes to get projects off the ground.”
Building a successful enterprise-class big data solution that solves today’s business challenges requires IT teams to integrate and process various data sources and types and to build and deploy complex distributed applications. In addition, there is added sophistication required to enable a secured and governed environment empowering data stewards and ad-hoc integrators to quickly extract insights from their data. CDAP 4 is the first unified integration platform for big data that offers a distributed application framework, modern data integration, self-service access to data and comprehensive security and governance. This is all delivered in one, easy to adopt, open source, enterprise-ready solution that works seamlessly with all major Hadoop distributions, both on-premises and in the cloud.
“CDAP removes barriers to innovation by enabling enterprise IT to quickly build and run modern data applications and data lakes, democratize secure access to the data and accelerate time to value for businesses,” said Nik Rouda, Senior Analyst, ESG. “This means enterprises can start using more of their data in very real ways to boost their bottom line much faster.”