Background

With the rapid growth of the Internet, social applications and messaging features have become extremely popular and account for a large share of network traffic. From DingTalk, WeChat, Weibo, and Zhihu to push notifications in all kinds of apps, messaging has become a standard feature of almost every application. Based on their characteristics, messaging scenarios fall into three categories: IM (DingTalk, WeChat), Feed streams (Weibo, Zhihu), and conventional message queues. How to build a simple and efficient IM or Feed stream feature has therefore become a problem that many architects and developers must face.

Timeline 1.0 model

For messaging scenarios, the Tablestore team built the TableStore-Timeline 1.0 data model (the Timeline model for short) for Java. Drawing on its experience with these scenarios, the team encapsulated the messaging scenario into a data model that provides the table structure design, the read and write patterns, and other ready-made solutions. Users only need to depend on the model API: they can ignore how Timeline maps onto the underlying storage system and implement their business logic directly against the interface. The model meets the particular requirements of messaging data: ordered messages, massive message storage, and real-time synchronization. Timeline 1.0 is an abstract data model defined on top of Tablestore. For details, see TableStore Timeline: Easily Building Ten-Thousand-level IM and Feed Streaming Systems.

Full-text retrieval and fuzzy query requirements

As the Timeline model on Tablestore came into wide use, we found strong demand for full-text retrieval and fuzzy queries over message data, which the online query capability of the original model could not satisfy. Because Tablestore now supports SearchIndex, the Timeline model can offer online full-text search and fuzzy queries. We therefore rebuilt the model as Timeline 2.0 on the basis of the original architecture, adding powerful query capabilities and a new data management scheme.

The project code is now open source on GitHub: Timeline@GitHub.

The arrival of the 2.0 era

Timeline 2.0 is not a direct modification of the 1.x version. While remaining compatible with the original model architecture, it defines and encapsulates a new set of interfaces. The rebuilt model adds the following features:

  • Management of Timeline Meta;
  • Full-text retrieval and multi-dimensional combined queries over both Meta and Timeline data, based on SearchIndex;
  • SequenceId can be set manually or generated automatically;
  • Multi-column Timeline Identifiers, which make it possible to manage Timelines in groups;
  • Compatibility with the Timeline 1.x model: the provided TimelineMessageForV1 example reads and writes 1.x messages directly, and users can follow it to write their own implementation.

Architecture overview



Timeline is a data model supported directly by Tablestore. Its design goal is simplicity, and its core modules are clearly separated. The model tries to leave users as much freedom as possible so that they can choose the implementation that best fits their scenario. The module architecture of the model is shown in the figure above and mainly consists of the following parts:

  • Store: the storage container for Timeline data, similar to a table in a database. The model contains two types of Store: Meta Store and Timeline Store.
  • Identifier: the unique identifier of a Timeline. It identifies a single row of Meta and the corresponding Timeline Queue.
  • Meta: the metadata that describes a Timeline. It uses a free-schema structure and can contain any columns.
  • Queue: all messages that belong to a single Identifier. Within a Timeline Store, messages are stored per Queue.
  • SequenceId: the sequence number of a message within a Queue. It must be increasing and unique.
  • Message: the message body carried in a Timeline. It uses a free-schema structure and can contain any columns.
  • Index: an index built on SearchIndex. There are Meta Indexes and Message Indexes, one kind per type of Store. You can index any column of Meta or Message to enable flexible multi-condition queries and search.
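To make the relationship between these parts concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: `TimelineModelSketch` and its methods are hypothetical names, plain maps stand in for the Meta Store and the Timeline Store, a string stands in for the Identifier, and SequenceIds are simply generated as "last value plus one". It is not how the SDK is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the Timeline model; not the SDK's classes.
class TimelineModelSketch {
    // Meta Store: one free-schema metadata row per Identifier.
    private final Map<String, Map<String, Object>> metaStore = new HashMap<>();
    // Timeline Store: one Queue per Identifier, kept sorted by SequenceId.
    private final Map<String, TreeMap<Long, Map<String, Object>>> timelineStore = new HashMap<>();

    void putMeta(String identifier, Map<String, Object> meta) {
        metaStore.put(identifier, new HashMap<>(meta));
    }

    Map<String, Object> getMeta(String identifier) {
        return metaStore.get(identifier);
    }

    // Appends a message to the Identifier's Queue and returns the generated SequenceId.
    long store(String identifier, Map<String, Object> message) {
        TreeMap<Long, Map<String, Object>> queue =
                timelineStore.computeIfAbsent(identifier, k -> new TreeMap<>());
        long sequenceId = queue.isEmpty() ? 1L : queue.lastKey() + 1L;
        queue.put(sequenceId, new HashMap<>(message));
        return sequenceId;
    }

    // Returns the SequenceIds of one Queue in ascending order.
    List<Long> sequenceIds(String identifier) {
        List<Long> ids = new ArrayList<>();
        TreeMap<Long, Map<String, Object>> queue = timelineStore.get(identifier);
        if (queue != null) {
            ids.addAll(queue.keySet());
        }
        return ids;
    }
}
```

Each Identifier owns exactly one Meta row and one Queue, and SequenceIds are unique and increasing only within that Queue, which is why two different Identifiers can both start at 1 in this sketch.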

Performance advantage

The Timeline model is a scenario data model abstracted and encapsulated on top of Tablestore, so it inherits all of Tablestore's strengths. Its scenario-oriented interface also lets users implement business logic more intuitively. In summary:

  • Massive data storage: distributed architecture and high scalability, supporting messages at the 10 PB level.
  • Low storage cost: Tablestore offers low-cost storage with several billing options: pay-as-you-go, resource packs, and reserved CUs.
  • Data life cycle management: a separate life cycle can be configured for each type of message data (at the table level).
  • Very high write throughput: can handle message writes at 00M TPS.
  • Low-latency reads: message queries return within milliseconds.
  • Interface design: readable, comprehensive, and clearly organized interfaces.

Maven coordinates

Timeline Lib

<dependency>
    <groupId>com.aliyun.openservices.tablestore</groupId>
    <artifactId>Timeline</artifactId>
    <version>2.0.0</version>
</dependency>

TableStore Java SDK

The Timeline model ships as a basic data model in the TableStore Java SDK, version 4.12.1 and later. Existing Tablestore users can use it directly after upgrading the SDK.

<dependency>
    <groupId>com.aliyun.openservices</groupId>
    <artifactId>tablestore</artifactId>
    <version>4.12.1</version>
</dependency>

Getting started

Initialization

Initialize the Factory

Pass a SyncClient to initialize the StoreFactory, then use the factory to create the Stores that manage Meta data and Timeline data. Error retries rely on the retry policy of the SyncClient, which users configure on the client. For special requirements, you can supply a custom policy by implementing the RetryStrategy interface.
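A custom retry policy mostly comes down to two decisions: how long to wait before the next attempt, and when to give up. The sketch below shows one common way to make them, capped exponential backoff. `BackoffSketch` and its methods are hypothetical and standalone; the class does not implement the SDK's RetryStrategy interface, whose exact methods should be taken from the SDK documentation.

```java
// Hypothetical capped exponential backoff; standalone, not the SDK's RetryStrategy.
class BackoffSketch {
    private final int maxRetries;
    private final long basePauseMillis;
    private final long maxPauseMillis;
    private int retries = 0;

    BackoffSketch(int maxRetries, long basePauseMillis, long maxPauseMillis) {
        this.maxRetries = maxRetries;
        this.basePauseMillis = basePauseMillis;
        this.maxPauseMillis = maxPauseMillis;
    }

    // Returns the pause in milliseconds before the next retry,
    // or -1 once the retry budget is used up.
    long nextPause() {
        if (retries >= maxRetries) {
            return -1L;
        }
        // Double the pause on every retry; cap the exponent to avoid overflow.
        long factor = 1L << Math.min(retries, 20);
        long pause = Math.min(maxPauseMillis, basePauseMillis * factor);
        retries++;
        return pause;
    }

    int getRetries() {
        return retries;
    }
}
```

A caller would sleep for the returned pause and retry the request, and treat -1 as the signal to surface the original error.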

/**
 * Set the retry strategy.
 * Code: configuration.setRetryStrategy(new DefaultRetryStrategy());
 */
ClientConfiguration configuration = new ClientConfiguration();

SyncClient client = new SyncClient(
        "http://instanceName.cn-shanghai.ots.aliyuncs.com",
        "accessKeyId", "accessKeySecret", "instanceName", configuration);

TimelineStoreFactory factory = new TimelineStoreFactoryImpl(client);

Initialize the MetaStore

Build the schema of the Meta table (including the Identifier, the MetaIndex, and other parameters), then create and obtain the Meta management Store through the Store factory. The configuration parameters are the Meta table name, the primary key fields, the index name, and the index field types.

TimelineIdentifierSchema idSchema = new TimelineIdentifierSchema.Builder()
        .addStringField("timeline_id").build();

IndexSchema metaIndex = new IndexSchema();
metaIndex.setFieldSchemas(Arrays.asList( // configure the index fields and their types
        new FieldSchema("group_name", FieldType.TEXT).setIndex(true).setAnalyzer(FieldSchema.Analyzer.MaxWord),
        new FieldSchema("create_time", FieldType.LONG).setIndex(true)
));

TimelineMetaSchema metaSchema = new TimelineMetaSchema("groupMeta", idSchema)
        .withIndex("metaIndex", metaIndex); // set the index

// factory is the TimelineStoreFactory created in "Initialize the Factory"
TimelineMetaStore timelineMetaStore = factory.createMetaStore(metaSchema);

Initialize TimelineStore

Build the schema of the Timeline table (including the Identifier, the TimelineIndex, and other parameters), then create and obtain the Timeline management Store through the Store factory. The configuration parameters are the Timeline table name, the primary key fields, the index name, and the index field types. Batch writes are built on Tablestore's DefaultTableStoreWriter to improve concurrency, and users can set the number of threads in its pool as needed.
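The batch-write idea can be pictured as a buffer that is sent in batches, with completion callbacks running on a small thread pool. The following is a rough sketch of that pattern only: `BufferedWriterSketch`, its batch-size trigger, and its counters are hypothetical, and nothing here reflects the actual internals of DefaultTableStoreWriter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical buffered writer; illustrates the pattern, not the SDK's Writer.
class BufferedWriterSketch {
    private final int batchSize;
    private final ExecutorService callbackPool;
    private final List<String> buffer = new ArrayList<>();
    private final AtomicInteger batchesSent = new AtomicInteger();
    private final AtomicInteger callbacksRun = new AtomicInteger();

    BufferedWriterSketch(int batchSize, int callbackThreads) {
        this.batchSize = batchSize;
        this.callbackPool = Executors.newFixedThreadPool(callbackThreads);
    }

    // Buffers one message and sends a batch once the buffer is full.
    synchronized void batchStore(String message) {
        buffer.add(message);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is buffered as one batch and schedules one callback per message.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(buffer);
        buffer.clear();
        batchesSent.incrementAndGet(); // stands in for one batch write request
        for (String ignored : batch) {
            callbackPool.execute(() -> callbacksRun.incrementAndGet());
        }
    }

    // Flushes the remaining messages, then waits for pending callbacks to finish.
    void close() {
        flush();
        callbackPool.shutdown();
        try {
            callbackPool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int getBatchesSent() {
        return batchesSent.get();
    }

    int getCallbacksRun() {
        return callbacksRun.get();
    }
}
```

With a batch size of 2, storing three messages sends one batch immediately and leaves one message buffered until `flush()` or `close()` is called, which is why an explicit flush matters before shutting down.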

TimelineIdentifierSchema idSchema = new TimelineIdentifierSchema.Builder()
        .addStringField("timeline_id").build();

IndexSchema timelineIndex = new IndexSchema();
timelineIndex.setFieldSchemas(Arrays.asList( // configure the index fields and their types
        new FieldSchema("text", FieldType.TEXT).setIndex(true).setAnalyzer(FieldSchema.Analyzer.MaxWord),
        new FieldSchema("receivers", FieldType.KEYWORD).setIndex(true).setIsArray(true)
));

TimelineSchema timelineSchema = new TimelineSchema("timeline", idSchema)
        .autoGenerateSeqId()                    // generate SequenceId as an auto-increment column
        .setCallbackExecuteThreads(5)           // set the initial number of Writer threads to 5
        .withIndex("metaIndex", timelineIndex); // set the index

// factory is the TimelineStoreFactory created in "Initialize the Factory"
TimelineStore timelineStore = factory.createTimelineStore(timelineSchema);

Meta management

Meta management provides interfaces to insert, delete, update, read a single row, and run multi-condition combined queries. Combined queries are built on SearchIndex, so only a MetaStore configured with an IndexSchema supports them. Supported index field types include LONG, DOUBLE, BOOLEAN, KEYWORD, and GEO_POINT. The field attributes are Index, Store, and Array, with the same meaning as in SearchIndex.

A TimelineIdentifier is the unique identifier that distinguishes one Timeline from another. Writing Meta with an identifier that already exists overwrites the existing row.
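The two behaviours described here, overwrite on a duplicate identifier and multi-condition combined queries, can be sketched in memory as follows. This is an illustrative toy: `MetaStoreSketch`, `fieldEquals`, and the list-of-strings identifier are hypothetical, and the linear scan in `search` stands in for what SearchIndex does with real indexes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory Meta store; the names are illustrative, not the SDK's API.
class MetaStoreSketch {
    // Key: the identifier's field values in order, so multi-column identifiers work too.
    private final Map<List<String>, Map<String, Object>> rows = new LinkedHashMap<>();

    // Inserting with an identifier that already exists replaces the stored row.
    void insert(List<String> identifier, Map<String, Object> meta) {
        rows.put(new ArrayList<>(identifier), new HashMap<>(meta));
    }

    Map<String, Object> read(List<String> identifier) {
        return rows.get(identifier);
    }

    int size() {
        return rows.size();
    }

    // Returns the identifiers whose Meta satisfies every condition (AND semantics).
    List<List<String>> search(List<Predicate<Map<String, Object>>> conditions) {
        List<List<String>> matches = new ArrayList<>();
        for (Map.Entry<List<String>, Map<String, Object>> row : rows.entrySet()) {
            boolean all = true;
            for (Predicate<Map<String, Object>> condition : conditions) {
                if (!condition.test(row.getValue())) {
                    all = false;
                    break;
                }
            }
            if (all) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    // Builds a condition that matches rows whose column equals the given value.
    static Predicate<Map<String, Object>> fieldEquals(String name, Object value) {
        return meta -> value.equals(meta.get(name));
    }
}
```

Because the identifier is the row key, a second insert under the same identifier leaves a single row holding the newer values.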

/**
 * Parameters used by the interfaces below.
 */
TimelineIdentifier identifier = new TimelineIdentifier.Builder()
        .addField("timeline_id", "group")
        .build();
TimelineMeta meta = new TimelineMeta(identifier)
        .setField("fieldName", "fieldValue");

/**
 * Create the Meta table (and its index, if one is configured).
 */
timelineMetaStore.prepareTables();

/**
 * Insert Meta data.
 */
timelineMetaStore.insert(meta);

/**
 * Read Meta data.
 */
timelineMetaStore.read(identifier);

/**
 * Update Meta data.
 */
meta.setField("fieldName", "newValue");
timelineMetaStore.update(meta);

/**
 * Delete Meta data.
 */
timelineMetaStore.delete(identifier);

/**
 * Search by SearchParameter.
 */
SearchParameter parameter = new SearchParameter(
        field("fieldName").equals("fieldValue")
);
timelineMetaStore.search(parameter);

/**
 * Search by SearchQuery (a native SDK type that supports all SearchIndex query conditions).
 */
TermQuery query = new TermQuery();
query.setFieldName("fieldName");
query.setTerm(ColumnValue.fromString("fieldValue"));
SearchQuery searchQuery = new SearchQuery().setQuery(query);
timelineMetaStore.search(searchQuery);

/**
 * Drop the Meta table (and its index, if one exists).
 */
timelineMetaStore.dropAllTables();

Timeline management

Timeline management provides interfaces for fuzzy queries and multi-condition queries over messages. Full-text retrieval of messages relies on SearchIndex: set the index type of the relevant field to TEXT, and messages can then be searched in full text through the search interface. Timeline management also lets you create, search, and drop the message table.
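Conceptually, indexing a field as TEXT means its content is split into terms and each term points back to the messages that contain it. The toy inverted index below illustrates that idea only. `InvertedIndexSketch` is hypothetical, it tokenizes on non-word characters, and it makes no attempt to mimic Tablestore's analyzers (such as MaxWord) or SearchIndex itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index over message text; a conceptual illustration, not SearchIndex.
class InvertedIndexSketch {
    // term -> SequenceIds of the messages that contain the term
    private final Map<String, TreeSet<Long>> postings = new HashMap<>();

    // Splits the text into lower-case terms and records the message under each term.
    void index(long sequenceId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(sequenceId);
            }
        }
    }

    // Returns the SequenceIds of messages that contain every term (AND semantics).
    List<Long> search(String... terms) {
        TreeSet<Long> result = null;
        for (String term : terms) {
            TreeSet<Long> hits = postings.get(term.toLowerCase(Locale.ROOT));
            if (hits == null) {
                return new ArrayList<>();
            }
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        List<Long> out = new ArrayList<>();
        if (result != null) {
            out.addAll(result);
        }
        return out;
    }
}
```

Looking up a term is a single map access regardless of how many messages are stored, which is what makes online full-text search practical compared with scanning every message.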

/**
 * Parameters used by the interfaces below.
 */
SearchParameter searchParameter = new SearchParameter(
        field("text").equals("fieldValue")
);

TermQuery query = new TermQuery();
query.setFieldName("text");
query.setTerm(ColumnValue.fromString("fieldValue"));
SearchQuery searchQuery = new SearchQuery().setQuery(query).setLimit(10);

/**
 * Create the Timeline table (and its index, if one is configured).
 */
timelineStore.prepareTables();

/**
 * Search by SearchParameter.
 */
timelineStore.search(searchParameter);

/**
 * Search by SearchQuery (a native SDK type that supports all SearchIndex query conditions).
 */
timelineStore.search(searchQuery);

/**
 * Flush the messages buffered in the Writer.
 */
timelineStore.flush();

/**
 * Close the Writer and the thread pool inside it.
 */
timelineStore.close();

/**
 * Drop the Timeline table (and its index, if one exists).
 */
timelineStore.dropAllTables();

Queue management

A Queue is the abstraction of a single message queue: it corresponds to all messages of a single Identifier in a Store. A Queue instance manages the message queue of its Identifier and supports the basic interfaces: insert, delete, update, single-row read, and range query.

/**
 * Parameters used by the interfaces below.
 */
TimelineIdentifier identifier = new TimelineIdentifier.Builder()
        .addField("timeline_id", "group")
        .build();
long sequenceId = 1557133858994L;
TimelineMessage message = new TimelineMessage().setField("text", "Timeline is fine.");
ScanParameter scanParameter = new ScanParameter().scanBackward(Long.MAX_VALUE, 0);
TimelineCallback callback = new TimelineCallback() {
    @Override
    public void onCompleted(TimelineIdentifier i, TimelineMessage m, TimelineEntry t) {
        // do something when succeed.
    }

    @Override
    public void onFailed(TimelineIdentifier i, TimelineMessage m, Exception e) {
        // do something when failed.
    }
};

/**
 * Get the message queue of a single Identifier.
 */
TimelineQueue timelineQueue = timelineStore.createTimelineQueue(identifier);

/**
 * Store messages.
 */
// synchronous
timelineQueue.store(message);
timelineQueue.store(sequenceId, message);
// asynchronous, with a callback
timelineQueue.storeAsync(message, callback);
timelineQueue.storeAsync(sequenceId, message, callback);
// batch, without a callback
timelineQueue.batchStore(message);
timelineQueue.batchStore(sequenceId, message);
// batch, with a callback
timelineQueue.batchStore(message, callback);
timelineQueue.batchStore(sequenceId, message, callback);

/**
 * Read messages.
 */
timelineQueue.get(sequenceId);
timelineQueue.getLatestTimelineEntry();
timelineQueue.getLatestSequenceId();

/**
 * Update a message by SequenceId.
 */
message.setField("text", "newValue");
timelineQueue.update(sequenceId, message);
timelineQueue.updateAsync(sequenceId, message, callback);

/**
 * Delete a message by SequenceId.
 */
timelineQueue.delete(sequenceId);

/**
 * Scan a range of messages as described by the ScanParameter.
 */
timelineQueue.scan(scanParameter);
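The `scanBackward(Long.MAX_VALUE, 0)` parameter above asks for a newest-first range read over a Queue. A sorted map gives a compact picture of that kind of query. The sketch below is hypothetical throughout: `QueueScanSketch` is not an SDK class, and the boundary handling (start inclusive, end exclusive) and the `maxCount` argument are assumptions made for the illustration, not documented SDK behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory Queue ordered by SequenceId; illustrates range scans only.
class QueueScanSketch {
    private final TreeMap<Long, String> queue = new TreeMap<>();

    void store(long sequenceId, String text) {
        queue.put(sequenceId, text);
    }

    // Newest-first scan from `from` (inclusive, assumed) down to `to` (exclusive, assumed),
    // returning at most maxCount SequenceIds.
    List<Long> scanBackward(long from, long to, int maxCount) {
        List<Long> out = new ArrayList<>();
        if (from <= to) {
            return out;
        }
        for (Long sequenceId : queue.subMap(to, false, from, true).descendingKeySet()) {
            if (out.size() >= maxCount) {
                break;
            }
            out.add(sequenceId);
        }
        return out;
    }

    // Oldest-first scan from `from` (inclusive, assumed) up to `to` (exclusive, assumed).
    List<Long> scanForward(long from, long to, int maxCount) {
        List<Long> out = new ArrayList<>();
        if (from >= to) {
            return out;
        }
        for (Long sequenceId : queue.subMap(from, true, to, false).keySet()) {
            if (out.size() >= maxCount) {
                break;
            }
            out.add(sequenceId);
        }
        return out;
    }
}
```

Scanning backward from `Long.MAX_VALUE` with a small limit is the usual way to fetch the latest few messages of a conversation, while a forward scan from the last SequenceId a client has seen fetches whatever it has missed.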

Expert service

The Tablestore team has a group of technical experts who know the Timeline field well and have their own insights into building IM and Feed stream scenarios. They are glad to hear from you if you are:

  • Eager to find Timeline experts to exchange ideas with;
  • Researching solutions for Timeline scenarios;
  • Preparing to build a Timeline scenario;
  • Interested in Tablestore products.


This article is original content from the Yunqi Community and may not be reproduced without permission.