Implementing a paging interface is a very common task in server-side development for most businesses: the various tables of the PC era, and the feed streams and timelines of the mobile era, all rely on it.

For the sake of traffic control and user experience, large volumes of data are not returned to the client in one go; instead they are delivered over multiple requests through a paging interface.

The most common definition of a paging interface looks something like this:

router.get('/list', async ctx => {
  const { page, size } = ctx.query

  // ...

  ctx.body = {
    data: []
  }
})

// > curl /list?page=1&size=10

The interface takes the page being requested and the number of items per page. My personal guess is that this has to do with the databases everyone touches when first learning: the people I know mostly started with MySQL, SQL Server and similar SQL databases, where a paginated query basically looks like this:

SELECT <column> FROM <table> LIMIT <offset>, <rows>
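
The page and size from such an interface map onto this LIMIT clause in a very direct way; a minimal sketch of the translation (the helper name is made up):

// Turn the interface's page/size into LIMIT <offset>, <rows>
function toLimit (page, size) {
  const offset = (page - 1) * size
  return `LIMIT ${offset}, ${size}`
}

toLimit(1, 10) // 'LIMIT 0, 10'
toLimit(3, 10) // 'LIMIT 20, 10'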

The equivalent operation on a Redis zset looks much the same:

> ZRANGE <key> <start> <stop>

So it feels natural to build the paging interface the same way, with the client passing both page and size parameters. There is nothing wrong with this approach: tables on PCs and lists on mobile can display the data neatly.

However, this is a fairly generic way of paging, best suited to data without dynamic filtering conditions. If the data has strong real-time requirements, is heavily filtered, or needs to be cross-checked against other data sources, this approach starts to look awkward.

Problems with the page + size paging interface

Take a simple example: my company has a live-streaming business, so there has to be an interface like a list of live rooms. Live data is also extremely time-sensitive. Data sources like hot lists and rankings are well suited to offline computation, but those offline jobs only store the identity of the user or of the live room; data like a room's viewer count, stream duration and popularity is highly time-sensitive, cannot be produced by an offline script, and has to be fetched at the moment the interface is requested.

Some validation is also needed when the client makes a request, for example a few simple conditions:

  • Make sure the anchor is live
  • Ensure compliance of live broadcast content
  • Check the block relationship between the user and the anchors

None of this can be done in an offline script, because it changes all the time and the data may not even be stored in the same place: the list itself may come from MySQL, the data used for filtering may have to be fetched from Redis, and the user-related data may live in yet another database. These operations cannot be solved with a single table query; they have to happen at the interface layer, pulling from multiple sources and merging the results.

With the paging mode described above, this leads to a very awkward problem. Perhaps the user hitting the interface is in a foul mood and has blocked every anchor on the first page, in which case the interface actually returns 0 items, which is rather alarming.

let data = [] // length: 10
data = data.filter(filterBlackList)
return data   // length: 0

In this case, should the client show an empty state, or immediately request the second page?

So there are some cases where this paging design doesn’t meet our needs, and that’s when we found a command in Redis: scan.

Implementing a cursor + size paging interface

The scan command iterates over the keys in a Redis database. Because the total number of keys is not known in advance (run the keys command directly in production and you will probably get yelled at), and because keys may be added or deleted while you iterate, scan works with a cursor: you pass 0 on the first call, and each call returns two items. The first is the cursor to pass into the next iteration; the second is the set of keys returned by this iteration. scan also accepts a pattern so that only keys matching a rule are iterated, for example all keys starting with temp_: SCAN 0 MATCH temp_*. Note that scan does not match keys against your rule and then hand you a guaranteed batch; it does not promise that an iteration returns N keys, and there is a good chance an iteration returns nothing at all.
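
To make the shape of an iteration concrete, this is roughly what one round looks like in redis-cli (the cursor value and key names are made up): the first element of the reply is the cursor to pass into the next call (a returned cursor of 0 means the iteration has completed), and the second element is whatever keys matched this round, which may well be empty.

redis> SCAN 0 MATCH temp_* COUNT 100
     > 1) "1764"
     > 2) 1) "temp_a"
     >    2) "temp_b"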

If we explicitly need a certain number of items, calling it repeatedly with the returned cursor works fine.

// Use a simple recursive implementation to get ten matching keys
// (assuming an ioredis-style client: scan(cursor, 'MATCH', pattern) returns [nextCursor, keys])
async function getKeys (pattern, oldCursor = '0', res = []) {
  const [ cursor, data ] = await redis.scan(oldCursor, 'MATCH', pattern)

  res = res.concat(data)
  if (res.length >= 10) return res.slice(0, 10)
  if (cursor === '0') return res // a full iteration has finished, return whatever we found
  return getKeys(pattern, cursor, res)
}

await getKeys('temp_*') // length: 10

This usage gave me the idea of implementing the paging interface in a similar fashion. However, putting this logic on the client side would make it painful to adjust later: every change would need a new release, and staying compatible with old versions would tie our hands. So the logic lives on the server side, and all the client has to do is carry the cursor returned by the interface into the next request.
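
In other words, the request flow looks roughly like this (the cursor values are made up; each one is simply whatever the previous response handed back):

// > curl /list?size=10
// <= { list: [ ...10 items ], cursor: '233' }
// > curl /list?size=10&cursor=233
// <= { list: [ ...10 items ], cursor: '466' }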

General structure

For the client, it is simply a matter of storing the cursor and sending it back. The logic on the server side is a little more involved:

  1. First, a function that fetches the data
  2. Second, a function that filters the data
  3. Finally, a function that checks the amount of collected data and trims it to length
function getData () {
  // Get data
}

function filterData () {
  // Filter data
}

function generatedData () {
  // Merge, generate, return data
}

Implementation

Node.js 10.x has become LTS, so the sample code uses some of the features that arrived in Node 10, such as async generators and for await ... of.

The list itself will most likely be stored as a collection of ids, for example a set of user ids, kept in Redis as a set or zset.

If the data source is Redis, my recommendation is to cache the complete list globally, refresh it periodically, and then use slice inside the interface to take the portion needed for the current request.

P.S. The sample code below assumes that list is a set whose members are unique ids, and that the corresponding detailed data is fetched from other databases using those ids.

redis> SMEMBERS list
     > 1
     > 2
     > 3

mysql> SELECT * FROM user_info
+-----+---------+------+--------+
| uid | name    | age  | gender |
+-----+---------+------+--------+
|   1 | Niko    |   18 |      1 |
|   2 | Bellic  |   20 |      2 |
|   3 | Jarvis  |   22 |      2 |
+-----+---------+------+--------+

List data is cached globally

// The complete list is cached globally
let globalList = null

async function updateGlobalData () {
  globalList = await redis.smembers('list')
}

updateGlobalData()
setInterval(updateGlobalData, 2000) // refresh every 2 seconds
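
One detail worth guarding against: the very first request may arrive before updateGlobalData has finished, while globalList is still null. A minimal sketch of one way around it, assuming a Koa app object named app, is to wait for the initial load before the server starts listening:

async function bootstrap () {
  // Fill the cache once before accepting traffic, then keep refreshing it
  await updateGlobalData()
  setInterval(updateGlobalData, 2000)
  app.listen(3000)
}

bootstrap()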

Implementing the data-fetching and data-filtering functions

The scan example above used recursion, which is not the most readable; here we can use an async generator to achieve what we need:

// The function that gets the data
async function * getData (list, size) {
  const count = Math.ceil(list.length / size)

  let index = 0

  // Guard against an empty list (e.g. when the cursor pointed at the last element)
  if (count === 0) return

  do {
    const start = index * size
    const end   = start + size
    const piece = list.slice(start, end)
    
    // Query MySQL to obtain user details
    const results = await mysql.query(`
      SELECT * FROM user_info
      WHERE uid in (${piece})
    `)

    // The functions needed for filtering are listed below
    yield filterData(results)
  } while (++index < count) // pre-increment so the loop stops after the last piece
}

The filtering functions may pull from other data sources to verify that the items in the list are valid. For example, if user A has users B and C on a blacklist, then B and C need to be filtered out when A accesses the interface. Or we may need to check the current state of an item, such as whether the anchor has closed the live room or whether the push stream is healthy, which may require calling other interfaces.

// A function to filter data
async function filterData (list) {
  const validList = await Promise.all(list.map(async item => {
    const [
      isLive,
      inBlackList
    ] = await Promise.all([
      http.request(`https://XXX.com/live?target=${item.uid}`),
      redis.sismember(`XXX:black:list`, item.uid)
    ])

    // The correct state
    if (isLive && !inBlackList) {
      return item
    }
  }))

  // Filter invalid data
  return validList.filter(i => i)
}

A function that finally concatenates data

Once the two key functions above are in place, we need one more function to check and join the data. It decides when to return data to the client and when to launch another fetch:

async function generatedData ({ cursor, size, }) {
  let list = globalList

  // If a cursor is passed in, slice the list starting just after the cursor
  if (cursor) {
    // The purpose of the + 1 is explained below
    list = list.slice(list.indexOf(cursor) + 1)
  }

  let results = []

  // Note that this is a for loop, not a map, forEach, etc
  for await (const res of getData(list, size)) {
    results = results.concat(res)

    if (results.length >= size) {
      const list = results.slice(0, size)
      return {
        list,
        // There may still be more data, so this response needs a cursor:
        // we return the uid of the last item in the list, which is why the indexOf at the top of the function has the + 1
        cursor: list[size - 1].uid,
      }
    }
  }

  return {
    list: results,
  }
}

It is a very simple for loop. Using a for loop makes the fetches serial: only after the results of the first fetch come back, and we determine that there is not enough data and more is needed to fill the page, is the second fetch issued, which avoids wasting resources. Once enough data has been collected we return immediately, the loop terminates, and the generator, no longer consumed, is destroyed.

Then plug this function into our route handler to complete the flow:

router.get('/list', async ctx => {
  const { cursor, size } = ctx.query

  const data = await generatedData({
    cursor,
    size: Number(size) || 10, // query values arrive as strings
  })

  ctx.body = {
    code: 200,
    data,
  }
})

The structure of the return value is then, roughly, a list plus a cursor, similar to what scan returns: a cursor and data. The client can also pass an optional size to specify how many items it expects per request. However, such a request is bound to be somewhat slower than ordinary page + size paging, because ordinary paging simply returns whatever the page contains even if it falls short, whereas this approach may perform several fetches internally.

Still, for interfaces with strong real-time requirements, I personally think this implementation is friendlier.

Comparing the two approaches

Both are good forms of paging; the first is more common, and the second is not a panacea, although it may be better in some cases.

The first approach is probably more suited to the B side: work orders, reports, archived data and so on. The second is likely a better fit for the C side, since those are the products handed directly to users. On a PC page it might be a paginated table, and showing 10 items on the first page, 8 on the second and 10 on the third is a disaster for the user experience. Mobile pages fare a bit better with something like an infinite-scroll waterfall flow, but you can still end up loading 2 items one time and 8 the next; off the home page that is just about acceptable, but 2 items on the home page... tsk tsk.

With the second, cursor-based approach, every request is guaranteed to return size items, and when it returns fewer, that means there is no more data behind it. This is a better experience for the user. (Of course, if the list has no filtering conditions and is just a plain presentation, the first approach is recommended; there is no need for the extra logic.)
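
On the client side this boils down to very little logic; a rough sketch, where fetchList, render and showNoMoreData are hypothetical helpers:

// Ask for the next page, carrying the cursor from the previous response
const { list, cursor } = await fetchList({ size: 10, cursor: lastCursor })
render(list)

if (!cursor || list.length < 10) {
  // No cursor (or a short page) means the server has run out of data
  showNoMoreData()
}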

Summary

Of course, these are only the paging-related measures that can be taken on the server side, and they still do not solve every problem. For a list that updates very quickly, such as a ranking whose data may change every second, the first request might find user A in the top ten while the second request finds user A in 11th place, so both responses end up containing a record for user A.

In that case the client also needs to do its own de-duplication, although de-duplicating shrinks the amount of data shown. That is a big topic and I will not get into it here. A simple way to paper over this for users is to request 16 items at a time, display 10 of them, and keep the remaining 6 locally to merge into the next page.
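
As for the de-duplication itself, a tiny client-side sketch is enough to convey the idea (renderedList and newPage are hypothetical):

// Drop any item whose uid has already been rendered on a previous page
const seen = new Set(renderedList.map(item => item.uid))
const fresh = newPage.filter(item => !seen.has(item.uid))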

If anything above is wrong, or you have a better or favorite way of implementing paging, feel free to share.

References

  • Redis | SCAN