This article was originally published by AI Frontier.
Apache Pulsar’s multi-tenant messaging system


Author | Matteo Merli and Sijie Guo


Translator | As wise as a fool


Editor | Emily

“This article introduces Apache Pulsar as a messaging system tailored for large, multi-level enterprises.”


In previous blog posts, we covered several reasons for choosing Apache Pulsar as a messaging solution for real-time enterprise applications. Future articles will delve into some of its enterprise-level features, such as persistent storage to prevent data loss, multi-tenancy, geo-replication, and encryption and security.

This article focuses on multi-tenant messaging in Apache Pulsar. Multi-tenancy is the ability of a single software instance to serve multiple tenants, where a tenant is a group of users who share the same view of the system. As a messaging hub for the enterprise, Apache Pulsar has supported multi-tenancy since its inception: the project was designed to meet Yahoo’s stringent requirements at a time when no available open source system offered multi-tenancy, including commonly used log abstraction systems such as Apache Kafka. Running a separate Pulsar instance for each group of users or each function is often unacceptable, because it makes it hard to share data across departments in real time and so creates silos.

As an enterprise-class messaging system, Pulsar’s multi-tenant capabilities are designed to meet the following requirements:

  • Ensure that stringent SLAs are met smoothly
  • Ensure isolation between different tenants
  • Enforce quotas for resource utilization
  • Provide per-tenant and system-level security
  • Keep operating costs low and administration as simple as possible

Apache Pulsar meets these requirements in the following ways:

  • Provide the required security through authentication, authorization, and ACLs (access control lists) for each tenant.
  • Enforce storage quotas for each tenant.
  • Define all isolation mechanisms as policies that can be changed at run time, which reduces operational costs and simplifies administration.

Introduction to Pulsar

To help you better understand Pulsar’s multi-tenant capabilities, let’s first take a quick look at Pulsar’s messaging model.

As in many other publish-subscribe systems, applications that send data to Pulsar are called producers, and applications that consume data from Pulsar are called consumers. A consumer application is sometimes also called a subscriber. As in publish-subscribe systems generally, the topic is Pulsar’s core messaging construct. Roughly speaking, a topic is a channel to which producers add data and from which consumers pull data. A group of consumers can form a subscription on a topic. Different groups of consumers can each choose their preferred subscription mode for the same topic: exclusive, shared, or failover. The different subscription modes are shown in Figure 1.

Figure 1: Pulsar subscription modes: exclusive, shared, and failover
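The difference between the three modes can be illustrated with a small simulation. This is only a sketch of the delivery semantics, not Pulsar client code: the `dispatch` function and its arguments are hypothetical, and the round-robin order used for shared mode is a simplifying assumption.

```python
from itertools import cycle


def dispatch(messages, consumers, mode):
    """Return {consumer: [messages received]} for a single subscription."""
    if not consumers:
        raise ValueError("a subscription needs at least one consumer")
    delivered = {consumer: [] for consumer in consumers}
    if mode == "exclusive":
        # Exactly one consumer may be attached to the subscription.
        if len(consumers) != 1:
            raise ValueError("exclusive mode allows a single consumer")
        delivered[consumers[0]] = list(messages)
    elif mode == "failover":
        # The first consumer is active; the others only stand by.
        delivered[consumers[0]] = list(messages)
    elif mode == "shared":
        # Messages are spread across all consumers (round-robin here).
        for message, consumer in zip(messages, cycle(consumers)):
            delivered[consumer].append(message)
    else:
        raise ValueError(f"unknown subscription mode: {mode}")
    return delivered
```

With two consumers, shared mode splits the messages between them, while failover mode delivers everything to the first consumer and keeps the second as a standby.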

Pulsar has been designed from the beginning to support multi-tenancy. Topics are therefore organized according to two resources tied to multi-tenancy: the property and the namespace. A property represents a tenant in the system. Tenants can configure multiple namespaces within their property, and each namespace can contain any number of topics. The namespace is the basic administrative unit for each tenant in Pulsar. Users can set ACLs on a namespace, adjust its replication settings, manage geo-replication of message data across clusters, control message expiration, and perform other important operational tasks.

Figure 2: A Pulsar deployment consists of three independent tenants
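The tenant (property), namespace, and topic hierarchy can be made concrete with a small helper that splits a hierarchical topic name into its parts. This is a minimal sketch that assumes names of the form `persistent://tenant/namespace/topic`; the exact format depends on the Pulsar release (older releases also include a cluster component), and the `shopping/checkout/orders` name in the usage note is hypothetical.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str
    tenant: str
    namespace: str
    topic: str


def parse_topic(name: str) -> TopicName:
    """Split 'domain://tenant/namespace/topic' into its components."""
    domain, separator, rest = name.partition("://")
    if not separator or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic name: {name!r}")
    # Split on the first two slashes only; the remainder is the topic.
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)
```

For example, `parse_topic("persistent://shopping/checkout/orders")` yields the tenant `shopping`, the namespace `checkout`, and the topic `orders`, so per-namespace policies can be looked up from the name alone.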

If you want to learn more about Pulsar, we recommend reading the introductory article on Pulsar. The mechanisms Pulsar uses to achieve multi-tenancy are discussed in more detail below.

Security

To implement multi-tenancy successfully, you first need to ensure that each tenant (a) can access the topics it is entitled to and (b) cannot see or access any other topics. Pulsar achieves this through a pluggable authentication and authorization mechanism.

In Pulsar, when a client connects to a message broker, the broker uses an authentication plug-in to establish the client’s identity and then (possibly) assigns a role token to the client. A role token is a string, such as admin or application-1, that can represent one or more clients. Role tokens are used to control a client’s permission to produce to or consume from specific topics and to manage the configuration of a tenant’s property.

Pulsar supports two authentication providers by default: TLS Client Auth and Athenz, an authentication system developed by Yahoo. Users can also implement their own authentication providers, as described in Pulsar’s documentation.

After the authentication provider has identified a client’s role token, the Pulsar broker uses an authorization provider to determine what the client is authorized to do. Authorization is managed at the property level, which means that multiple authorization schemes can be active at the same time in a Pulsar cluster. For example, a user can create a shopping property with one set of roles for the shopping application used in the enterprise, and an inventory property with another set of roles used only by the inventory application. Permissions are managed at the namespace level, that is, within a property. A specific role can be granted permission for a set of operations, such as produce and consume, on a namespace. Refer to Pulsar’s documentation for details on how to configure authorization at the property level and assign permissions to namespaces.
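The namespace-level permission model described above can be sketched as a simple lookup table from namespace and role to the allowed operations. The `NamespacePermissions` class is a hypothetical illustration of the idea, not Pulsar’s authorization provider interface, and the namespace name in the usage note is made up.

```python
class NamespacePermissions:
    """Grants of produce/consume actions to roles, per namespace."""

    ACTIONS = {"produce", "consume"}

    def __init__(self):
        # namespace -> role -> set of granted actions
        self._grants = {}

    def grant(self, namespace, role, actions):
        unknown = set(actions) - self.ACTIONS
        if unknown:
            raise ValueError(f"unknown actions: {sorted(unknown)}")
        roles = self._grants.setdefault(namespace, {})
        roles.setdefault(role, set()).update(actions)

    def is_allowed(self, namespace, role, action):
        # Anything that was not granted explicitly is denied.
        return action in self._grants.get(namespace, {}).get(role, set())
```

After `grant("shopping/checkout", "application-1", {"produce"})`, the role `application-1` may produce to that namespace, but it may neither consume from it nor touch any other namespace.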

Together, authentication and authorization provide security isolation between tenants, preventing a tenant from accessing topics or performing operations it is not permitted to. Next, let’s look at how Pulsar isolates resources so that tenants’ SLA requirements are met.

Isolation

Isolation for security is not enough on its own: multi-tenant applications also need to meet SLA requirements, so Pulsar provides isolation for robustness and performance as well. Soft isolation covers mechanisms such as disk quotas, flow control, and rate limiting. Hard isolation covers mechanisms such as confining certain tenants to a subset of the brokers that serve traffic, and isolating storage at the BookKeeper bookie level.

Before describing the isolation mechanisms in detail, let’s look at what an Apache Pulsar cluster looks like. Figure 3 shows a typical installation. A Pulsar cluster consists of a set of brokers (which serve publish-subscribe traffic), a set of bookies (which store messages), and Apache ZooKeeper, which handles overall coordination and configuration management. The Pulsar broker is the component that receives and delivers messages, and the bookie is the Apache BookKeeper server that provides persistent storage for messages until they are consumed.

Figure 3: A typical Apache Pulsar environment.

Soft isolation

Brokers and bookies are typically physical resources shared by multiple producers and consumers. To protect tenants and meet SLA requirements, Pulsar provides a number of mechanisms at both the broker and the bookie level.

Storage

Apache Pulsar uses Apache BookKeeper as the persistent storage system for messages. Each bookie in Apache BookKeeper can typically serve hundreds or thousands of ledgers efficiently (a ledger is a segment created for a topic). BookKeeper achieves this efficiency mainly because it was designed with I/O isolation in mind. Each bookie has its own journal, on a dedicated disk drive, which batches all incoming writes. Messages are then periodically flushed in the background to separate storage disks. This I/O architecture isolates reads from writes, which means that tenants can read as fast as the storage devices allow without affecting write throughput or latency.

In addition to I/O isolation, tenants can configure a different storage quota for each namespace. A tenant can also specify what should happen once the quota is exhausted, such as blocking producers from publishing further messages, throwing an exception, or discarding the oldest messages.
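The three quota actions can be compared with a small in-memory model of a namespace backlog. This is a sketch only: the `Backlog` class and the policy names `hold`, `exception`, and `evict_oldest` are hypothetical labels for the behaviors described above, not Pulsar configuration values.

```python
from collections import deque


class QuotaExceeded(Exception):
    """Raised to the producer under the 'exception' policy."""


class Backlog:
    def __init__(self, limit_bytes, policy):
        if policy not in ("hold", "exception", "evict_oldest"):
            raise ValueError(f"unknown policy: {policy}")
        self.limit = limit_bytes
        self.policy = policy
        self.messages = deque()
        self.size = 0

    def publish(self, payload: bytes) -> bool:
        """Store payload; return False if the producer must wait."""
        if len(payload) > self.limit:
            raise ValueError("payload is larger than the whole quota")
        if self.size + len(payload) > self.limit:
            if self.policy == "hold":
                return False  # block the producer until space frees up
            if self.policy == "exception":
                raise QuotaExceeded("storage quota exhausted")
            # evict_oldest: drop old messages until the new one fits
            while self.size + len(payload) > self.limit:
                self.size -= len(self.messages.popleft())
        self.messages.append(payload)
        self.size += len(payload)
        return True
```

The choice of policy is a trade-off: holding or rejecting protects data that has not yet been consumed at the cost of stalling producers, while eviction keeps producers running at the cost of losing the oldest unconsumed messages.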

The Pulsar Broker

In addition to the mechanisms at the bookie level, Pulsar provides mechanisms at the broker level to meet SLA requirements. First, all operations in a Pulsar broker are performed asynchronously, and the amount of memory each broker can use is capped. If a broker’s CPU or memory usage exceeds its limit, traffic can be migrated (manually or automatically) to a less loaded broker in a very short time. The load manager component in each Pulsar broker is dedicated to this task.

Note that Pulsar can migrate traffic between brokers quickly because the serving layer and the storage layer of the system are separate, which makes the broker truly stateless. Unlike other messaging systems, in which a message partition can only be stored on a subset of brokers, Pulsar’s brokers do not store any data locally. The cost of moving a topic from one broker to another is therefore minimal, so traffic can be rebalanced very quickly and tenants are protected sooner.
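Because a stateless broker only owns a topic rather than storing its data, rebalancing amounts to reassigning ownership. The sketch below illustrates that idea with a deliberately simple heuristic (move the heaviest topic to the least loaded broker). The `shed_load` function, the load numbers, and the heuristic are assumptions made for illustration and are not how Pulsar’s load manager is implemented.

```python
def shed_load(assignments, limit):
    """Move topics off brokers whose total load exceeds `limit`.

    `assignments` maps broker -> {topic: load} and is modified in place.
    Returns the list of (topic, source, destination) moves performed.
    """
    def total(broker):
        return sum(assignments[broker].values())

    moves = []
    for source in sorted(assignments):
        while total(source) > limit and len(assignments[source]) > 1:
            others = [b for b in assignments if b != source]
            if not others:
                break
            target = min(others, key=total)  # least loaded broker
            topic = max(assignments[source], key=assignments[source].get)
            load = assignments[source][topic]
            if total(target) + load > limit:
                break  # moving would only shift the overload elsewhere
            # No data is copied: only ownership of the topic changes.
            del assignments[source][topic]
            assignments[target][topic] = load
            moves.append((topic, source, target))
    return moves
```

A system that stores partitions on the brokers themselves would have to copy the partition data before such a move could take effect, which is why the same rebalancing is much slower there.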

Second, flow control protocols are applied on both the producing and the consuming side. On the producer side, tenants can configure limits on the number of in-flight messages at the broker and the bookie, which prevents users from publishing messages faster than the system can absorb them. On the consumer side, tenants can limit the number of outstanding messages that the broker delivers to a consumer.
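Both limits follow the same pattern: a bounded window of messages that have been sent but not yet acknowledged. The sketch below models that window. The `InFlightWindow` class is hypothetical and only illustrates the bookkeeping, not Pulsar’s wire protocol.

```python
class InFlightWindow:
    """Caps the number of sent-but-unacknowledged messages."""

    def __init__(self, max_pending):
        if max_pending < 1:
            raise ValueError("max_pending must be at least 1")
        self.max_pending = max_pending
        self._pending = set()

    def try_send(self, message_id) -> bool:
        # Refuse to send once the window is full; the caller must wait.
        if len(self._pending) >= self.max_pending:
            return False
        self._pending.add(message_id)
        return True

    def ack(self, message_id):
        # An acknowledgement frees one slot in the window.
        self._pending.discard(message_id)

    @property
    def pending(self):
        return len(self._pending)
```

With a window of two, a third `try_send` is refused until one of the first two messages has been acknowledged, so a slow receiver automatically slows the sender down.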

Finally, on the consumer side, Pulsar can also throttle message delivery to a consumer to a specified rate. This prevents consumers from consuming messages faster than the system can process them.
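One common way to enforce a delivery rate is a token bucket, sketched below. The article does not say which algorithm Pulsar uses, so the technique, the `TokenBucket` class, and its parameters are assumptions chosen for illustration. The clock is passed in explicitly so that the behavior is easy to follow and to test.

```python
class TokenBucket:
    """Allows `rate` messages per second with bursts of up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now) -> bool:
        """Return True if one message may be delivered at time `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a rate of 2 messages per second and a burst of 2, two messages are delivered immediately, a third at the same instant is held back, and one more becomes deliverable after half a second.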

Together, these soft isolation mechanisms ensure that both producer and consumer SLAs are met.

Hard isolation

The main purpose of the mechanisms above is to let Pulsar share resources (brokers and bookies) efficiently while still meeting tenant SLA requirements. In some cases, however, applications also need physical resources to be isolated. Pulsar meets this requirement with an option to isolate certain tenants or namespaces to a subset of brokers. This ensures that those tenants or namespaces have full access to all the resources of that subset of brokers.

This option can also be used to experiment with different configurations, to debug, or to respond quickly to unexpected situations in production. For example, a user may trigger bad behavior in a broker that in turn degrades performance for other tenants. In that case, the tenant can be physically isolated to a subset of brokers that does not serve other tenants’ traffic until the problem is resolved by deploying a fix.
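Isolating a tenant to a subset of brokers can be thought of as a policy that maps namespaces to the brokers allowed to serve them. The sketch below expresses such policies as pairs of regular expressions. The `eligible_brokers` function, the policy format, and the broker and namespace names are hypothetical, and the rule that unmatched namespaces use only unreserved brokers is an assumption of this sketch.

```python
import re


def eligible_brokers(namespace, brokers, policies):
    """Return the brokers allowed to serve `namespace`.

    `policies` is a list of (namespace_regex, broker_regex) pairs;
    the first policy whose namespace pattern matches wins.
    """
    for namespace_pattern, broker_pattern in policies:
        if re.fullmatch(namespace_pattern, namespace):
            return [b for b in brokers if re.fullmatch(broker_pattern, b)]
    # Namespaces without a policy share the brokers nobody has reserved.
    reserved = {
        b
        for _, broker_pattern in policies
        for b in brokers
        if re.fullmatch(broker_pattern, b)
    }
    return [b for b in brokers if b not in reserved]
```

Given brokers `broker-1` to `broker-3` and the policy `("shopping/.*", "broker-[12]")`, every namespace of the shopping tenant is confined to the first two brokers, and all other namespaces are served by `broker-3`.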

In addition to physically isolating traffic on the brokers, you can also isolate traffic on the bookies used to store messages. To do this, configure the appropriate placement policies for the namespaces.

These mechanisms can be seen as a lightweight alternative to running a separate cluster for each tenant, without actually having to set everything up separately. They provide physical isolation similar to that of separate clusters while simplifying operation and maintenance (O&M).

Conclusion

Apache Pulsar is a true multi-tenant messaging system that provides varying degrees of isolation for different resources. This article described the mechanisms Pulsar uses to implement multi-tenancy: security isolation through authentication and authorization; isolation of shared physical resources through flow control, rate limiting, and storage quotas; and isolation of physical resources through placement policies. We hope this article has helped you better understand Apache Pulsar and its multi-tenant enterprise capabilities. A future article will cover another enterprise-class feature of Apache Pulsar: geo-replication.

If you are interested in Pulsar, you can participate in the Pulsar community in the following ways:

  • Pulsar Slack channel; you can register at apache-pulsar.herokuapp.com/
  • Pulsar mailing list.

For more general information about the Apache Pulsar project, visit the website at pulsar.incubator.apache.org/ and follow the project’s Twitter account, @apache_pulsar.

Read the original text in English:

streaml.io/blog/multi-…