Author: InfoQ mp.weixin.qq.com/s/EAyXDbPuY…

Push notifications are an effective way to tell users about events as soon as they happen. At Gojek, we handle more than 3 million orders a day across more than 20 products.

That adds up to a lot of notifications: about a million per hour. This article describes the challenges we face in handling push notifications at this scale, and how we solved them.

Scale is only part of the picture; at Gojek we also faced some problems specific to our setup.

Multiple applications

Gojek is not just one App. In addition to the user App, we also have the GoLife App, the driver App, the merchant App and the service provider App.

When our system pushes a notification, it goes either to one specific App (for example, a GoLife notification is not pushed to Gojek) or to all Apps (for example, a promotional notification).

Our system needs to be flexible enough to freely choose between broadcast and individual push.

Multiple notification service providers

Because our user App needs to support both iOS and Android platforms, it also needs to support multiple notification systems.

On Android, we use FCM (Firebase Cloud Messaging) and GCM (Google Cloud Messaging). On iOS, we use Apple Push Notification Service (APNS).

Each notification service provider provides a different API key and token for a different App. For example, GoLife and Gojek use different FCM API keys.

Multiple devices for one user

We allow a user to be logged in on multiple devices at the same time, so a notification must be pushed to every device the user is logged in on. This compounds the previous two problems:

  • Users can log in to multiple apps (Gojek and GoLife) on a single device;
  • Users may log in on multiple devices, each requiring a different notification service provider. For example, users can log in to Gojek on both Android and iOS devices.

Multiple services that require push notifications

Gojek uses a microservice architecture, and we want to make push notifications available to every service, without having to worry about multiple devices and multiple service providers.

To address the above issues and keep the API as simple as possible, our notification system is divided into three components:

  • Notification server – provides the notification push API and puts notifications on the job queue;
  • Token store – stores device and device-token data for logged-in users;
  • Notification handler – processes messages from the job queues and sends them to the notification service providers.

Each of these components addresses some of the above problems, and we’ll take a closer look at them.

Token store

After the user logs in to the App, the App calls the token store API with the device token and App ID.

These records are deleted when the user logs out.

The token store is used to determine which devices a user's notifications should be pushed to.
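
The register, delete and lookup operations described above can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the `Device` shape, the `Provider` field and all method names are hypothetical, and a real token store would sit behind an API and a database rather than a map.

```go
package main

import (
	"fmt"
	"sync"
)

// Device is one logged-in installation of an App (hypothetical shape).
type Device struct {
	AppID    string // e.g. "com.gojek.app"
	Token    string // device token issued by the provider
	Provider string // assumed field: "fcm" or "apns"
}

// TokenStore keeps the devices of logged-in users in memory.
type TokenStore struct {
	mu      sync.RWMutex
	devices map[string][]Device // keyed by user ID
}

func NewTokenStore() *TokenStore {
	return &TokenStore{devices: make(map[string][]Device)}
}

// Register is called on login; registering the same App and token twice is a no-op.
func (s *TokenStore) Register(userID string, d Device) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, existing := range s.devices[userID] {
		if existing.AppID == d.AppID && existing.Token == d.Token {
			return
		}
	}
	s.devices[userID] = append(s.devices[userID], d)
}

// Unregister is called on logout and deletes the matching record.
func (s *TokenStore) Unregister(userID, appID, token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	kept := s.devices[userID][:0]
	for _, d := range s.devices[userID] {
		if d.AppID != appID || d.Token != token {
			kept = append(kept, d)
		}
	}
	s.devices[userID] = kept
}

// DevicesFor returns the devices on which the user is logged in to one App.
func (s *TokenStore) DevicesFor(userID, appID string) []Device {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []Device
	for _, d := range s.devices[userID] {
		if d.AppID == appID {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	store := NewTokenStore()
	store.Register("user-1", Device{AppID: "com.gojek.app", Token: "tok-android", Provider: "fcm"})
	store.Register("user-1", Device{AppID: "com.gojek.app", Token: "tok-ios", Provider: "apns"})
	store.Register("user-1", Device{AppID: "com.gojek.life", Token: "tok-life", Provider: "fcm"})
	fmt.Println(len(store.DevicesFor("user-1", "com.gojek.app")))

	store.Unregister("user-1", "com.gojek.app", "tok-ios")
	fmt.Println(len(store.DevicesFor("user-1", "com.gojek.app")))
}
```

Keying records by user ID and filtering by App ID is one way to cover both cases at once: several Apps on one device, and one App on several devices.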

Notification server

This is an HTTP server that provides an API for push notifications.

For simplicity, the API requires putting the user ID and App ID in the HTTP header and the notification information in the request body:

POST http://<base_url>/notification
user_id: <user_id>
application_id: <application_id>

{
  "payload": {},
  "title": "Your driver is here",
  "message": "Please meet your driver at the pickup point"
}

The server retrieves all of the user's devices from the token store and then enqueues one job per device.

The notification server is the system's external interface, so a service that needs to push a notification just calls this API with the user ID, and the notification server takes care of the rest.

Job queue

We used RabbitMQ for the job queue and created separate queues for each App ID and notification type.

Assigning separate queues is important because we need to isolate failures for each App and notification type. For example, if the FCM token of com.gojek.app expires, it will not affect the job of com.gojek.life or com.gojek.driver.bike.

Notification handler

The handler process pulls messages from the job queue and sends them to the corresponding notification service provider.

To keep the code simple and able to support different service providers, we defined a unified interface:


type PushService interface {
  Push(ctx context.Context, m PushRequest) (PushResponse, error)
}

The Push method receives a request object and returns a response object.

The request object contains information about the recipient and notification, such as expiration time, title, and text.

type PushRequest struct {
  DeviceID string
  Title    string
  Message  string
  Payload  map[string]interface{}
}

The response message contains information about whether the notification was successfully sent:


type PushResponse struct {
  Success         bool
  ErrorMsg        string
}

The interface is then implemented for each service provider. For example, the FCM and APNS implementations look like this:

type FCMProvider struct {
  // Such as API token and a URL endpoint
}

func (p *FCMProvider) Push(ctx context.Context, m queue.Message) (notification.PushResponse, error) {
  // Send notification to FCM server
}

type APNSProvider struct {
  // Such as API token and a URL endpoint
}

func (p *APNSProvider) Push(ctx context.Context, m queue.Message) (notification.PushResponse, error) {
  // Send notification to APNS server
}

The notification handler is responsible for selecting the matching notification service provider and sending the message to it.

Conclusion

Faced with these challenges, we identified the common patterns and separated them into different services, turning a relatively complex problem into a set of simple, manageable pieces.

Whenever a piece of core logic needed a different implementation, we split it out into its own service:

  • The multiple-device problem is solved by the token store;
  • The multiple-App problem is solved by the unified notification server interface;
  • The multiple-provider problem is solved by separate job queues and notification handlers.

As a result, we built a system capable of handling over 1 million push notifications per hour.

