Rethinking Messaging for the Era of Microservices and Cloud

In this special guest feature, Matteo Merli, Streamlio co-founder, discusses what’s needed from a modern solution for connecting data in the era of cloud and microservices. Matteo Merli co-founded Streamlio and is an expert on distributed messaging solutions designed for demanding scale. Previously, he spent several years building database replication systems and multitenant messaging platforms at Yahoo. Matteo was the architect and lead developer for Apache Pulsar and is a PMC member of Apache BookKeeper. He holds a Bachelor of Science degree in computer science from the Università degli Studi di Parma.

Messaging technology has a long history, going back decades to early message queuing middleware developed to facilitate communication among mainframe and server applications. Since those origins, the use cases and technology for messaging have gone through many evolutions, as technologies as diverse as IBM MQ, RabbitMQ, ActiveMQ, NATS, Apache Kafka, and others have come onto the scene. Throughout, these evolutions have been driven by the same fundamental need: how to connect applications and data in a performance-oriented, scalable and reliable way.

More recently, however, the growing role of cloud services, microservices architectures, and streaming data sources has led to a significant increase in the use cases for messaging technology, and has added new requirements and expectations for those solutions. In particular, cloud services and microservices have driven an explosion in the number of components that both generate data and need to be connected by data. The ever-increasing number of data sources that are available to applications, especially the number that are now available as streams of data accessible through APIs, has also increased demands on messaging software. The race to react to and act on data ever faster has only amplified the magnitude of these changes.

This requires a fundamental modernization of messaging architectures, because legacy technology is failing to keep up with the demands created by these changes. For one, those technologies were simply not designed for the scale and speed of modern data. Not only is that true for solutions whose fundamental architectures date back decades, it’s even true for solutions developed for the Hadoop era. Legacy technologies simply were not designed for the flexibility and elasticity needed to support modern applications. They were instead designed for environments in which configuration and resource changes would be infrequent, and as a result they are typically cumbersome or disruptive to scale. That makes them challenging to deploy in cloud environments, where resources can be added (and removed) in just a few minutes or seconds, as well as in microservices environments, where the ability to quickly adapt to fluctuations in usage is critical to delivering consistent end-to-end performance for application workflows. In a very real sense, the legacy messaging technology becomes the bottleneck preventing organizations from taking true advantage of nimble cloud and microservices architectures.

In an attempt to work around these limitations, organizations find themselves forced to deploy multiple technologies to address their range of needs. But integrating disparate technologies for publish-subscribe messaging, message queuing, stream processing and related areas consumes significant effort from operations teams, and creates a fragile environment in which changes to individual components can easily cause failures and errors. Organizations also find themselves creating silos in order to address limits to the scale and performance of a single instance of these legacy technologies. This leads to significant management overhead, requiring additional effort in an attempt to attain consistent performance, security, and resiliency across these different instances. This approach also leads to duplicate data and additional costs that become increasingly painful as the scale of data and workloads grows.

These pains are fundamentally the result of technology limitations, pointing to the need for wholly new approaches. Incremental improvements in existing technologies simply aren’t sufficient, as they do not address the fact that many of the shortcomings are inherent results of architectures chosen and design decisions made long ago that are extremely difficult – if not impossible – to change. Moreover, a new approach and design offer the possibility of not just incremental improvement, but a significant leap forward in messaging.

What’s needed from a modern solution for connecting data in the era of cloud and microservices? The same thing that’s always been demanded of messaging: ensuring performance, scalability, resiliency, elasticity and flexibility to support how data is processed and connected. What’s changed is the nature and scale of those demands in a world that’s grown significantly in both complexity and pace:

  • Performance and scalability. To support the demands created by ever-growing amounts of data and number of applications and users accessing data, messaging solutions need to be able to deliver high throughput and low latency even at scales of many millions of messages per second and beyond.
  • Tunable resiliency. Although not all applications have the same requirements for availability and data durability, requiring cumbersome external bolt-on tools for availability and data protection is not an approach that can succeed in complex environments. A modern messaging solution needs inherent resiliency, supporting the highest demands for availability and data durability out of the box, while also allowing users to apply less rigorous guarantees where appropriate.
  • Elasticity. To meet constantly varying workloads and take advantage of the flexibility of cloud and containerized resources, a modern messaging solution needs to be able to scale up and down on the fly, without disruption, to deliver the performance needed at an appropriate cost.
  • Flexibility. Modern applications are simply more demanding of messaging solutions than legacy applications. Messaging solutions need to offer a broad range of capabilities and support a number of scenarios in a single solution. That includes not only supporting closely related scenarios such as publish-subscribe messaging and message queuing, but also supporting broader needs such as processing and analytics on data as it flows through the messaging solution.

New messaging platforms have emerged to meet these requirements. One such example is Apache Pulsar, which was initially developed at Yahoo! as a next-generation solution for moving and connecting data across a broad array of applications, including Yahoo Mail, Yahoo Finance and Flickr, and was subsequently contributed to open source.

Apache Pulsar started with a unique architecture, one designed to support performance and scale for a large number of users and applications. A key design decision was adopting a principle that has become increasingly common in cloud-native applications in other domains: decoupling compute and storage. By using a distributed, scalable broker layer connected to a distributed, scalable storage solution, Pulsar makes it possible to quickly and easily scale to meet the precise needs of current workloads. Unlike legacy messaging solutions, Pulsar can be scaled up and down on the fly, without requiring complex rebalancing or disruptions.
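To make the idea of decoupling compute and storage concrete, here is a deliberately simplified Python sketch. It is a toy model, not Pulsar's implementation: the class names (StorageLayer, Cluster) and the round-robin assignment are assumptions chosen for illustration. The point it tries to show is that when brokers hold no message data, adding or removing a broker only changes which broker serves a topic, and no stored messages have to be copied.

```python
from dataclasses import dataclass, field


@dataclass
class StorageLayer:
    """Hypothetical shared storage tier: the only place message data lives."""
    logs: dict = field(default_factory=dict)  # topic name -> list of messages

    def append(self, topic, message):
        self.logs.setdefault(topic, []).append(message)

    def read(self, topic):
        return list(self.logs.get(topic, []))


class Cluster:
    """Hypothetical serving tier: brokers own topics but store nothing."""

    def __init__(self, brokers, storage):
        self.brokers = list(brokers)
        self.storage = storage
        self.owner = {}  # topic name -> broker currently serving it

    def _rebalance(self):
        # Only the ownership map changes; data in the storage layer is untouched.
        for i, topic in enumerate(sorted(self.owner)):
            self.owner[topic] = self.brokers[i % len(self.brokers)]

    def publish(self, topic, message):
        if topic not in self.owner:
            self.owner[topic] = self.brokers[len(self.owner) % len(self.brokers)]
        self.storage.append(topic, message)

    def add_broker(self, broker):
        self.brokers.append(broker)
        self._rebalance()

    def remove_broker(self, broker):
        self.brokers.remove(broker)
        self._rebalance()


storage = StorageLayer()
cluster = Cluster(["broker-1", "broker-2"], storage)
for n in range(4):
    cluster.publish(f"topic-{n}", f"message for topic-{n}")

cluster.add_broker("broker-3")      # scale up: ownership moves, data does not
cluster.remove_broker("broker-1")   # scale down: same story
```

In a design where each broker also stores the data for the topics it serves, the same scale-up step would typically involve copying stored messages between machines, which is the rebalancing cost the paragraph above refers to.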

Pulsar also provides capabilities making it possible to consolidate multiple environments into a single system. For starters, not only does Pulsar support publish-subscribe messaging, it also supports message queuing scenarios. Pulsar was also designed to provide the performance, scale and isolation required to support large numbers of users and applications in a single system, rather than necessitating multiple deployments to separate workloads. Further, Pulsar provides processing capabilities that allow data transformations, cleansing, and analytics to happen on data in motion, making it possible to support streaming data pipelines and real-time analytics.
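One common way to serve both publish-subscribe and queuing from a single topic is the notion of a named subscription: every subscription receives every message, while consumers attached to the same subscription share that subscription's messages. The Python sketch below is a toy model under that assumption. The Topic and Subscription classes and the round-robin dispatch are hypothetical and are not Pulsar's client API.

```python
class Subscription:
    """Hypothetical named subscription; its consumers split its messages."""

    def __init__(self, name):
        self.name = name
        self.inboxes = []   # one inbox (list) per attached consumer
        self._next = 0      # round-robin position

    def add_consumer(self):
        inbox = []
        self.inboxes.append(inbox)
        return inbox

    def deliver(self, message):
        if not self.inboxes:
            return  # toy model: a real system would retain the message
        self.inboxes[self._next % len(self.inboxes)].append(message)
        self._next += 1


class Topic:
    """Hypothetical topic that fans out each message to every subscription."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, subscription_name):
        if subscription_name not in self.subscriptions:
            self.subscriptions[subscription_name] = Subscription(subscription_name)
        return self.subscriptions[subscription_name].add_consumer()

    def publish(self, message):
        for subscription in self.subscriptions.values():
            subscription.deliver(message)


orders = Topic()
audit = orders.subscribe("audit")        # own subscription: sees everything
billing = orders.subscribe("billing")    # own subscription: sees everything
worker_a = orders.subscribe("workers")   # shared subscription: work queue
worker_b = orders.subscribe("workers")

for n in range(4):
    orders.publish(f"order-{n}")

# audit and billing each hold all four orders (publish-subscribe),
# while worker_a and worker_b hold two orders each (message queuing).
```

The design choice illustrated here is that the messaging pattern is decided by how consumers subscribe, not by the kind of topic that was created, so the same stream of data can feed both fan-out and work-queue consumers.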

In short, data-driven enterprises now have a choice: they can try to bend legacy messaging solutions to meet the requirements of cloud and microservice architectures, or they can adopt new messaging platforms designed with those modern requirements in mind. An increasing number of organizations are recognizing it’s in fact time to rethink their approach for this new era.

 
