Caringo Announces Direct Massively Parallel, Compliant Storage for Hadoop

Caringo® recently announced the availability of Swarm HadoopFS, a native Hadoop 2+ connector for Caringo Swarm that saves time and resources with highly efficient, direct parallel MapReduce processing, paired with compliance features such as WORM, Integrity Seals and Legal Hold.

Using Hadoop for data processing and analytics typically involves a time-consuming and resource-intensive bulk load of data from an archive or file server into the Hadoop Distributed File System (HDFS). With Caringo’s direct approach, HDFS can read data directly from Swarm and, because of Swarm’s massively parallel architecture, in which all nodes cooperate to perform all processes, each HDFS server can pull data in parallel. This eliminates the extract-and-ingest step, shortening the time to the MapReduce stage while reducing reliance on expensive NAS or filer storage in a Hadoop environment.

Additionally, organizations can use the standard compliance and data protection features in Swarm to ensure their data is safe, accessible and hasn’t been tampered with. Swarm supports the ability to store data so that it can’t be deleted (WORM); the ability to prove in a court of law that content hasn’t been tampered with (Integrity Seals); and the ability to take a snapshot of data and store it immutably (Legal Hold). These features, combined with Swarm’s ability to automatically manage the data lifecycle, moving data between erasure coding and replication all on the same servers, make Swarm the best option for organizations that want to leverage Hadoop but have stringent regulatory requirements.

“The ability to quickly analyze and act upon data is a key competitive advantage,” said Mark Goros, CEO of Caringo. “Organizations of every size understand this and have been deploying Hadoop clusters in a fragmented nature, often relying on HDFS with JBOD for long-term storage which it wasn’t designed for. With SwarmFS we enable resilient, compliant and highly efficient long-term storage for all unstructured data in a highly automated fashion. This includes data that you may not even know you want to analyze yet, all instantly accessible by Hadoop in a direct, massively parallel fashion.”

Caringo Swarm HadoopFS is available now. For more information visit

