Supporting Decentralized IoT Data with Software-defined Storage

In this special guest feature, Rob Whiteley, VP of Marketing at Hedvig, discusses the challenges businesses face with IoT and data storage, such as figuring out where all of the IoT data will go and how it will be stored and protected. Rob joins Hedvig from Riverbed and Forrester Research, where he held a series of marketing and product leadership roles. Rob graduated from Tufts University with a BS in Computer Engineering.

We’re on the precipice of another computing evolution.

I know that sounds trite, but the evidence is overwhelming. As we enter the tail end of this decade, we’ll see another 10x jump in the number of connected devices. These jumps have always been catalyzed by a shift in computing architecture. Consider this data compiled by Morgan Stanley:

  • 1960s – Computing evolution: Mainframes. Total connected devices: 1M.
  • 1980s – Computing evolution: Minicomputers. Total connected devices: 10M.
  • 1990s – Computing evolution: Personal computers. Total connected devices: 100M.
  • 2000s – Computing evolution: Desktop internet. Total connected devices: 1B.
  • 2010s – Computing evolution: Mobile internet. Total connected devices: 10B.
  • 2020s – Computing evolution: Internet of Things. Total connected devices: 100B.

What does the explosion in Internet of Things (IoT) devices mean for my IT architecture?

First, it means the IT pendulum is swinging from centralized back to decentralized. Mainframes were the original, centralized compute model. The march toward desktops pushed us to a decentralized model. Then cloud came along and centralized compute into a relatively small number of megadatacenters run by internet goliaths like Amazon, Google, and Microsoft. But the drive toward IoT will push us to decentralize again. Let’s examine why.

Centralization favors cost, decentralization favors agility

The constant IT pendulum boils down to cost. Centralizing allows businesses to reduce resources, gain efficiencies, and consequently lower cost. However, new business requirements apply opposing pressure: go to market faster, engage new customers, and enter new markets. As a result, the business invests in new technology models that are inherently decentralized, because it needs the agility. The costs add up, the business wonders why it’s spending so much, IT centralizes the resources, and the pendulum swings back.

IoT is on one of these trajectories. It’s a business-led initiative that uses internet-connected sensors to improve business insights, analyze customer sentiment, or streamline business processes. It pushes compute and storage “to the edge,” where they can better support the gathering, storing, and analyzing of this IoT sensor data.

IoT decentralization requires a distributed systems architecture

IoT breaks down to two categories:

  • Passive: Monitoring. Many IoT devices are simply sensor-enabled and passively gather data. The data gathered may be in real time but the analyzing is not. Many RFID and consumer devices fall into this category.
  • Active: Sensing and reacting. The second category is all about real time. Here data is gathered from sensors for immediate processing and reaction. Many industrial, smart infrastructure, and machine-to-machine devices fall into this category.
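The distinction between the two categories can be sketched in a few lines of code. The sketch below is illustrative only (the `Reading` type, threshold, and callback are my own hypothetical names, not from any particular IoT platform): passive pipelines accumulate readings for later batch analysis, while active pipelines process each reading the moment it arrives.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reading:
    """A single sensor reading (hypothetical structure for illustration)."""
    sensor_id: str
    value: float


def passive_pipeline(readings: List[Reading]) -> float:
    """Passive IoT: data may be gathered in real time, but analysis
    happens later, in batch — here, a simple average over the batch."""
    return sum(r.value for r in readings) / len(readings)


def active_pipeline(reading: Reading,
                    react: Callable[[Reading], None],
                    threshold: float = 100.0) -> None:
    """Active IoT: each reading is processed immediately, and the system
    reacts on the spot (e.g., throttle a machine, raise an alarm)."""
    if reading.value > threshold:
        react(reading)
```

For example, `active_pipeline(Reading("temp-1", 120.0), alerts.append)` reacts to an out-of-range reading as soon as it arrives, which is exactly why active deployments need compute and storage near the sensors.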

The second category is far more challenging when it comes to architecting IT systems. I’ll leave networking out of it for a moment, because that’s a constant. Regardless of the type of IoT category, you’ll need network connectivity for all of the IoT endpoints. But compute and storage need to be physically located near the IoT sensors when it comes to active, real-time IoT applications; they can be centralized for passive environments.

For many, compute has not been as big an issue. Ongoing virtualization and containerization provide a software-defined compute layer that leverages ongoing innovations in Intel and ARM processors. Compute is also “stateless,” meaning it’s easier to move and decentralize. This makes it easy for me to locate compute near the sensors for processing real-time data.

Storage is a different case. Even with advancements in flash such as 3D NAND and NVMe, the relative cost of storing data is still very high. As a result, decentralizing it can be very expensive. It also creates “islands” of sensor data, making it difficult to gather global intelligence across all the sensor data. Finally, the business value lies in the data, so not only do I have my primary IoT storage, but I may need to protect it with two or three additional copies. That means I need to architect for IoT backups and archiving.

Software-defined storage provides a distributed IoT storage solution

Like compute, we’ve seen innovations in the software-defined storage arena. Newer systems are built as distributed systems. These modern storage platforms provide:

  • Data locality. I can deploy storage local to the site where my compute is processing sensor data. However, the data is replicated, protected, and stored as one global, virtual pool.
  • Collapsed tiering. Software-defined storage can provide primary storage for IoT data or, through different policies and hardware configurations, be deployed for secondary IoT data backups and archiving.
  • Centralized control. Distributed systems are inherently decentralized when it comes to storage, but with a centralized “control plane” for management.
  • Better acquisition costs. Because the systems can be deployed on commodity servers, they leverage the latest hardware innovations and offer an acquisition cost 60 percent lower than traditional storage solutions.
  • Lower operating costs. Because these solutions can be API-driven and automated, they provide similar savings on the operational side when compared to islands of direct-attached storage at each compute site.
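To make the first three properties concrete, here is a minimal sketch of what per-volume policy and a centralized control plane might look like. This is a hypothetical model I wrote for illustration — the `VolumePolicy` fields, site names, and `ControlPlane` class are assumptions, not Hedvig’s actual API: one replica is pinned to the edge site where compute runs (data locality), the rest are spread across the global pool, and one object manages every site (centralized control).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VolumePolicy:
    """Hypothetical policy knobs for one storage volume."""
    replicas: int = 3           # total copies kept across the global pool
    tier: str = "primary"       # "primary" for hot IoT data; "backup"/"archive"
    local_site: str = "edge-1"  # keep one replica near the compute site


class ControlPlane:
    """Centralized control plane managing decentralized storage sites."""

    def __init__(self, sites: List[str]):
        self.sites = sites
        self.volumes: Dict[str, VolumePolicy] = {}

    def provision(self, name: str, policy: VolumePolicy) -> List[str]:
        """Place replicas: first one local for data locality, the rest
        spread across the remaining sites for protection."""
        others = [s for s in self.sites if s != policy.local_site]
        placement = [policy.local_site] + others[: policy.replicas - 1]
        self.volumes[name] = policy
        return placement
```

In this model, provisioning a volume with `local_site="edge-2"` puts the first copy at that edge site, so the real-time pipeline reads locally, while the additional replicas live elsewhere in the virtual pool — one global system rather than islands of direct-attached storage.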

As you look to invest in IoT, make sure your underlying compute and storage are designed to be distributed systems. This means the intelligence will live in software, decoupled from the underlying hardware. This provides the benefits of a centralized solution but with the locality, tiering, control, and economics needed for decentralized IoT data.

 

Sign up for the free insideBIGDATA newsletter.
