This article was shared, with revisions, by qiuwenchen, an engineer on the WeChat development team.
1. Introduction
Full-text search is a search method built on an inverted index. An inverted index (also called a reverse index) creates an index entry for every token in the input content and records where that token appears in the content. Full-text search technology is mainly used in scenarios that involve searching large amounts of text.
The WeChat client involves many text-search business scenarios, including IM contact search, IM chat-record search, and Favorites search.
These features have been available since 2014, and the underlying technology has not been updated for years:
1) SQLite FTS3 is still used as the full-text search engine for chat records, even though SQLite FTS5 is now available;
2) The Favorites home-page search still uses a simple LIKE statement to match text;
3) Contact search is even performed in memory (traversing all attributes of all contacts in memory for matching).
As users accumulate more and more IM chat data in WeChat, upgrading the underlying search technology has become increasingly urgent. Therefore, in 2021 we carried out a comprehensive upgrade of the full-text search technology in WeChat for iOS, and this article records the technical practice of that upgrade.
2. Series of articles
This is the fourth in a series of articles:
- “IM Full-text Search Technology (I): The Way to Optimize Full-text Search on the WeChat Mobile Client”
- “IM Full-text Search Technology (II): Solutions to Polyphonic-Character Full-text Search on the WeChat Mobile Client”
- “IM Full-text Search Technology (III): Practice of Chat Message Full-text Search for NetEase Yunxin's Web IM”
- “IM Full-text Search Technology (IV): The Latest Full-text Search Optimization Practice of WeChat for iOS” (* this article)
3. Selection of full-text search engine
There are not many full-text search engines available to an iOS client. The main options are:
1) the three FTS components of SQLite (FTS3/4 and FTS5);
2) CLucene, the C++ implementation of Lucene;
3) Lucy, the C-language bridge version of Lucene.
Here is a comparison of these engines in terms of transaction capability, technical risk, search capability, read and write performance, and so on (see figure below).
1) Transaction capability:
Lucene does not provide complete transaction capability, because it uses a multi-file storage structure that cannot guarantee the atomicity of transactions.
SQLite's FTS components perfectly inherit SQLite's transaction capability, because they are built on ordinary tables underneath.
2) Technical risks:
Lucene is mainly used on the server side, and there is no large-scale application case on the client side; moreover, both CLucene and Lucy have been officially unmaintained since 2013, so the technical risk is relatively high.
SQLite's FTS3 and FTS4 components are older versions of the engine and receive less official maintenance, and both store the entire index of a single token in one record, which in extreme cases risks exceeding SQLite's maximum length limit for a single record.
SQLite's FTS5 component, the latest version of the engine, has been released for more than six years and is fully used in WeChat for Android, so its technical risk is minimal.
3) Search capability:
Lucene has a much longer development history than SQLite's FTS components and the richest search capabilities. In particular, Lucene has an extensive ranking mechanism for search results, but the WeChat client has no use for this, because our search results are sorted either by time or by simple custom rules.
Among the SQLite engine versions, FTS5 has the most complete and rigorous search syntax and provides many interfaces for customizing search behavior, so its search capability is relatively strong.
4) Read and write performance:
The following three charts compare the engines using optimized indexes built from 1 million randomly generated Chinese sentences of length 10, where each character's frequency follows the actual usage frequency of Chinese characters.
As the three charts show, Lucene's performance when reading only the number of hits is far better than SQLite's, which shows the advantage of Lucene's index file format, but WeChat has no scenario that reads only the hit count; Lucene's other performance figures are not significantly different from SQLite's. SQLite FTS3 and FTS5 perform similarly in most respects, although FTS5 takes noticeably longer than FTS3 to generate an optimized index.
Weighing these factors, we chose SQLite FTS5 as the engine for full-text search in WeChat for iOS.
4. Engine layer optimization 1: implementing an FTS5 segment auto-merge mechanism
SQLite FTS5 stores the content written by each transaction in a separate b-tree, called a segment. The segment records, for each token, the rowid, the column number, and the position offsets of the token within the field, so a segment is the inverted index of that content.
Multiple writes will form multiple segments. When querying, you need to query these segments separately and summarize the results. Therefore, the more segments there are, the slower the query speed will be.
To reduce the number of segments, SQLite FTS5 introduces a merge mechanism: existing segments at level i are merged into a new segment at level i+1.
An example of a merge is as follows:
FTS5 has two default merge operations:
1) automerge: when the number of segments at a level reaches 4, part of the merge work is performed automatically while content is written. The amount of merged data written by each automerge step is proportional to the amount of data written in that update, so it takes several automerges to fully merge the segments into a new one; before the new segment is fully generated, automerge repeatedly trims the already-merged content out of the old segments, which introduces redundant writes.
2) crisismerge: when the number of segments at a level reaches 16, all segments at that level are merged at once.
Both of FTS5's default merge operations run synchronously during writes, which affects the performance of business logic; crisismerge in particular can occasionally make a single write take very long, making business performance uncontrollable. This is mainly because FTS5's merge operation is more time-consuming than that of the other two engines.
In WCDB we implemented an automatic segment merge mechanism for FTS5 that centralizes these merge operations in a single sub-thread and tunes the execution parameters.
Specific practices are as follows:
1) Listen for the FTS5 index tables changed by each transaction of a database containing FTS5 indexes, and notify a sub-thread to trigger WCDB's automatic merge;
2) Merge every level whose segment count exceeds 1, so that each level keeps at most one segment;
3) After every 16 pages written by the merge, check whether write operations of other threads are blocked by the merge; if so, commit immediately to minimize the impact of the merge on business performance (see the sketch below).
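A minimal sketch of this idea in application code, assuming the plain SQLite C API rather than WCDB's internals; the table name chatMsgFts, the step size expressed through FTS5's 'merge' command, and the blocked-writer check are all placeholders:

```swift
import Foundation
import SQLite3

/// Sketch: run FTS5 merges on a background queue in small steps, giving way whenever
/// another writer needs the database (WCDB implements this inside the framework).
final class FTSMergeWorker {
    private let queue = DispatchQueue(label: "fts.merge", qos: .utility)
    private let db: OpaquePointer              // connection opened elsewhere
    private let ftsTable = "chatMsgFts"        // placeholder FTS5 table name

    init(db: OpaquePointer) { self.db = db }

    /// Call after each transaction that modified the FTS5 table.
    func scheduleMerge() {
        queue.async { [weak self] in self?.mergeStep() }
    }

    private func mergeStep() {
        // Perform bounded chunks of merge work; stop early if another writer is waiting.
        for _ in 0..<64 {                      // upper bound per scheduling round (placeholder)
            let sql = "INSERT INTO \(ftsTable)(\(ftsTable), rank) VALUES('merge', 16);"
            guard sqlite3_exec(db, sql, nil, nil, nil) == SQLITE_OK else { return }
            if writerIsWaiting() {
                scheduleMerge()                // yield to business writes, continue later
                return
            }
        }
    }

    private func writerIsWaiting() -> Bool {
        // Placeholder: WCDB checks whether other threads are blocked on the database lock.
        return false
    }
}
```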
The flow chart for automatic merge logic execution is as follows:
Limiting the number of segments per level to 1 brings FTS5 query performance closest to its post-optimize level at an acceptable write cost. Suppose data is written N times with a volume of M each time: after merging completes, the total amount of data actually written is M·N·(log2(N)+1). If the business writes in batches, increasing M (and thus reducing N for the same total data) also reduces the total write volume.
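A rough way to see where this formula comes from: with at most one segment per level, merging behaves like a binary counter, so each record is rewritten roughly once for every level it climbs, and there are about log2(N) levels on top of the initial write:

```latex
\underbrace{MN}_{\text{initial writes}}
\;+\;
\underbrace{MN\log_2 N}_{\text{rewrites while climbing levels}}
\;=\; MN\bigl(\log_2 N + 1\bigr)
```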
In terms of performance: for an FTS5 table containing 1 million rows of Chinese content, 100 characters each, a three-character query took 2.9 ms in the optimized state, and 4.7 ms, 8.9 ms and 15 ms when the number of segments per level was limited to 2, 3 and 4 respectively. If 1 million rows are written in a single batch, the merge takes about 10 s under the WCDB scheme.
The automatic merge mechanism keeps the FTS5 index close to its optimized state without hurting index update performance, thereby improving search speed.
5. Engine layer optimization 2: Word segmentation optimization
5.1 Optimization of word segmentation performance
The tokenizer (word segmenter) is the key module of full-text search: it splits the input content into tokens and provides their positions, which the search engine then indexes. SQLite's FTS components support custom tokenizers that can be implemented according to business requirements.
Tokenization can be done character by character or word by word. The former simply builds a character-by-character index of the input, while the latter must understand the semantics of the input and index phrases with specific meanings. Compared with character-based segmentation, word-based segmentation reduces both the number of tokens written to the index and the number of tokens to match during a search; its disadvantage is that it requires semantic understanding, and incomplete words typed by the user may cause content not to be found.
To simplify the client logic and avoid users failing to find content when they type incomplete words, the tokenizer previously used with FTS3 in WeChat for iOS, OneOrBinaryTokenizer, used a clever segmentation scheme: besides indexing the content character by character, it also indexed every two consecutive characters, and search text was segmented into pairs of characters.
Here’s an example of searching for the same content with “Beijing welcomes you” :
Compared with simple character-by-character segmentation, this scheme nearly halves the number of tokens to match, which speeds up searching and also improves precision to some extent (for example, searching for "welcome to Beijing" will not match "Beijing welcomes you"). Its disadvantage is a much larger index: essentially every character of the input is stored three times, trading space for time.
Because trading a nearly threefold increase in index size for less than a twofold performance improvement is not a good deal, when moving to FTS5 we developed a new tokenizer, VerbatimTokenizer, which goes back to plain character-by-character segmentation and stores no redundant index content. Instead, at search time every two characters of the search text are quoted together to form a phrase; under FTS5's search syntax the tokens of a phrase only match when they appear adjacent and in order, which achieves the same search precision as OneOrBinaryTokenizer.
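For illustration, a sketch of how such a MATCH expression might be assembled at search time; the exact pairing and escaping rules of WeChat's implementation may differ:

```swift
import Foundation

/// Sketch: turn user search text into an FTS5 MATCH expression where every two
/// consecutive characters form a quoted phrase, so they must appear adjacently.
/// With character-level tokenization, "北京欢迎你" -> "\"北京\" \"欢迎\" \"你\"".
func fts5PhraseQuery(for searchText: String) -> String {
    let chars = Array(searchText)
    var phrases: [String] = []
    var i = 0
    while i < chars.count {
        let end = min(i + 2, chars.count)
        let pair = String(chars[i..<end]).replacingOccurrences(of: "\"", with: "")
        phrases.append("\"\(pair)\"")      // quoted phrase: tokens must be adjacent and in order
        i = end
    }
    return phrases.joined(separator: " ")  // whitespace-separated phrases are ANDed by FTS5
}

// Usage (conceptually): SELECT rowid FROM msgFts WHERE msgFts MATCH <result of fts5PhraseQuery>
```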
VerbatimTokenizer’s segmentation rules are shown below:
5.2 Word segmentation capability expansion
VerbatimTokenizer also implements five extended capabilities based on WeChat's actual business needs, to improve the fault tolerance of search:
1) Converting traditional Chinese characters to simplified ones during tokenization: users can then find simplified content with traditional input and traditional content with simplified input, avoiding missed results when users mix up visually similar simplified and traditional characters.
2) Unicode normalization: Unicode allows characters with the same glyph to have different encodings; for example, é encoded as \u00e9 and as \u0065\u0301 looks identical either way, which can make users search for something that looks the same and find nothing. Unicode normalization maps characters with the same glyph to the same encoding.
3) Optional symbol filtering: in most scenarios we do not need to index symbols, since symbols repeat heavily and users generally do not search with them; but contact search does need symbol search, because symbols are not uncommon in user nicknames.
4) Support for the Porter stemming algorithm on English words: stemming lets users find content even when the singular/plural form or tense of the search text differs from the indexed content. It also has drawbacks: for example, a user who wants to find "happyday" by typing "happy" as a prefix gets no result, because stemming turns "happyday" into "happydai" and "happy" into "happi", and the latter is no longer a prefix of the former. This kind of bad case occurs easily when the content is several English words concatenated together, which is common in contact nicknames, so stemming is disabled for the contact index and enabled in other business scenarios.
5) Converting all letters to lowercase: users can then search uppercase content with lowercase input, and vice versa.
These capabilities all transform the tokens of both the indexed content and the search content, and they could also be implemented at the business layer; Unicode normalization and traditional-to-simplified conversion were in fact previously done there.
But this has two drawbacks:
1) The business layer has to traverse the content once more for every transformation, introducing redundant computation;
2) The content written into the index is the transformed content, so the search results are also transformed and inconsistent with the original text, which makes it easy for the business layer to make mistakes when handling the results.
For these two reasons, we gathered all of these transformation capabilities inside VerbatimTokenizer itself.
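A sketch of a few of these per-token transforms, in Swift for brevity (WeChat implements them inside VerbatimTokenizer itself); the ICU "Traditional-Simplified" transform used here is an assumption, not necessarily how WeChat converts traditional characters:

```swift
import Foundation

/// Sketch: normalize one token before it is indexed or matched.
func normalizeToken(_ token: String, stripSymbols: Bool) -> String {
    // 1. Unicode normalization (NFC): "\u{65}\u{301}" and "\u{E9}" both become "é".
    var s = token.precomposedStringWithCanonicalMapping
    // 2. Lowercase all letters, so uppercase and lowercase input match each other.
    s = s.lowercased()
    // 3. Traditional -> simplified Chinese (assumption: ICU transliteration).
    if let simplified = s.applyingTransform(StringTransform(rawValue: "Traditional-Simplified"),
                                            reverse: false) {
        s = simplified
    }
    // 4. Optionally drop symbols (contact search keeps them, since nicknames often contain symbols).
    if stripSymbols {
        let symbolSet = CharacterSet.punctuationCharacters.union(.symbols)
        s = s.components(separatedBy: symbolSet).joined()
    }
    return s
}
```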
6. Engine layer optimization 3: index content supports multiple levels of delimiters
An SQLite FTS index table does not support adding new columns once it has been created, yet as the business evolves more and more attributes of the business data need to be searchable. How do we support searching these new attributes?
Especially in the contact search scenario, a contact supports many search fields.
An obvious idea is to concatenate the old and new attributes with delimiters and index the combined text.
However, this introduces a new problem: FTS5 matches against the whole field as a single piece of content, so if the tokens a user searches for hit different attributes, the row still matches. That is obviously not what the user wants, and it reduces the precision of search results.
What we need is to count a match only when there is no delimiter between the matched tokens, which guarantees that all matched tokens fall inside a single attribute. To support flexible business expansion we also need multiple levels of delimiters, and the search results should expose the level and position of the match as well as the original text of that segment and the matched words.
FTS5 does not provide this capability directly, but its custom auxiliary functions can obtain the position of every hit token in each matching row; from this information we can deduce whether there are delimiters between the hit tokens and which level the hit falls in. So we developed a new FTS5 auxiliary function, SubstringMatchInfo, to implement this capability.
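For illustration, a sketch of how a contact's attributes might be flattened with two levels of separators before being written into a single indexed column; the separator code points and the field layout are placeholders, not WeChat's actual ones:

```swift
import Foundation

/// Hypothetical separators: level 1 splits top-level attributes,
/// level 2 splits repeated sub-values inside one attribute.
let level1Separator = "\u{1}"
let level2Separator = "\u{2}"

struct Contact {
    var nickname: String
    var remark: String
    var tags: [String]              // user-defined labels
    var groupMemberNames: [String]  // names of members in shared group chats
}

/// Flatten all searchable attributes of a contact into one indexed column.
/// A custom auxiliary function (like the SubstringMatchInfo described above) can then use
/// token offsets to reject matches that cross a separator and to report the level/position of a hit.
func indexedText(for c: Contact) -> String {
    let fields = [
        c.nickname,
        c.remark,
        c.tags.joined(separator: level2Separator),
        c.groupMemberNames.joined(separator: level2Separator),
    ]
    return fields.joined(separator: level1Separator)
}
```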
The general execution flow of this function is as follows:
7. Application layer optimization 1: database table optimization
7.1 How to Save Non-text Search Contents
In practice, besides the text we want to search (which goes into the FTS index), we also need to store some content that does not take part in text matching: the ID of the corresponding business data, attributes used to sort results (typically the creation time of the business data), and anything else that must be read out directly along with the search results.
Depending on where this non-search content is stored, the FTS index table can be organized in two forms:
1) The first way:
The non-search content is stored in a separate ordinary table that maps the rowid of the FTS index to the non-search content, while each row of the FTS index stores only the searchable text.
The table looks something like this:
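A minimal sketch of this first form, with hypothetical table and column names:

```swift
// First form (sketch, hypothetical names): the FTS table holds only the searchable text;
// an ordinary table keyed by the FTS rowid holds everything else.
let createFirstForm = """
CREATE VIRTUAL TABLE favoritesFts USING fts5(
    content                              -- searchable text only
);
CREATE TABLE favoritesMeta(
    ftsRowid   INTEGER PRIMARY KEY,      -- rowid of the matching favoritesFts row
    businessId INTEGER,                  -- id of the corresponding business data
    createTime INTEGER                   -- attribute used for sorting
);
-- A search then needs a join:
-- SELECT m.businessId, m.createTime
--   FROM favoritesFts JOIN favoritesMeta m ON m.ftsRowid = favoritesFts.rowid
--  WHERE favoritesFts MATCH ?;
"""
```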
The advantages and disadvantages of this form are clear:
a) Advantages: the FTS index table stays very simple, so engineers unfamiliar with FTS index table configuration are less likely to make mistakes, and the ordinary table is easy to extend with new columns;
b) Disadvantages: a search must first obtain the rowid from the FTS index and then use it to read the other content from the ordinary table, so searching is slower; it also requires a join query, which makes the search SQL slightly more complicated.
2) The second way:
The non-text search content is directly stored in FTS index table together with searchable text content.
The table looks something like this:
The advantages and disadvantages of this approach are the opposite of the previous approach:
a) Advantages: searching is fast and simple;
b) Disadvantages: poorer extensibility, and the table needs more careful configuration.
Because WeChat for iOS already used the second form, WeChat's search business is stable and unlikely to change much, and we now care more about search speed, we continue to use the second form to store full-text search data.
7.2 Avoiding Redundant Index Contents
By default an FTS index table builds an inverted index for every column, and even numeric content is indexed as text. This means the non-search content stored in the FTS index table gets indexed too, which increases the index file size, the index update time and the search time; obviously not what we want.
FTS5 supports adding an UNINDEXED constraint to a column so that the column is not indexed, so adding this constraint to every column other than the searchable text avoids the redundant indexing.
7.3 Reducing the Index Size
As mentioned earlier, the inverted index mainly stores, for each token in the text, the rowid, the column number, and the position offset of each occurrence within the field. The rowid is assigned automatically by SQLite and the position offsets depend on the actual business content, so neither can be changed by us, but the column number can be adjusted.
In FTS5 indexes, a Token’s index content in a row looks like this:
As you can see, if we put the searchable text in the first column (and, when there are several searchable columns, put the ones with more content first), we can omit many column separators (0x01) and column numbers, which noticeably reduces the index file size.
So our final form looks like this:
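Putting 7.1 to 7.3 together, a sketch of what such a table definition could look like (hypothetical table, column and tokenizer names): the searchable text comes first and every other column is UNINDEXED.

```swift
// Final form (sketch, hypothetical names): second storage form, searchable text placed
// first, all non-search columns marked UNINDEXED so FTS5 builds no inverted index for them.
let createFinalForm = """
CREATE VIRTUAL TABLE chatMsgFts USING fts5(
    content,                    -- searchable text, first column to shrink the index
    businessId UNINDEXED,       -- id of the business data
    createTime UNINDEXED,       -- sort attribute
    tokenize = 'verbatim'       -- custom tokenizer (placeholder registration name)
);
"""
```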
7.4 Comparison of effects before and after optimization
Here is a comparison of the average index file size per user before and after the optimization in WeChat for iOS:
8. Application layer optimization 2: index update logic optimization
8.1 Overview
To decouple the full-text search logic from the business logic, the FTS indexes of WeChat for iOS are not stored in each business's own database but centrally in a dedicated full-text search database. After a business updates its data, it asynchronously notifies the full-text search module to update the index.
The overall process is as follows:
In this way, index updates do not slow down business data updates, index update errors or even index corruption cannot affect the business data, and the full-text search module stays fully independent.
8.2 Ensure the consistency of indexes and data
Separating business data from index data and synchronizing them asynchronously brings many benefits, but it is harder to implement.
The most difficult problem is how to ensure that the business data and index data are consistent, that is, to ensure that the business data and index data correspond to each other, no more, no less.
WeChat for iOS stumbled into many pitfalls here, and a pile of patches never solved the problem completely; we needed a more systematic approach.
To simplify the problem, we can split the consistency problem into two parts:
1) First, ensure that all business data is indexed, so that nothing is ever missing from the user's search results;
2) Second, ensure that every index entry corresponds to valid business data, so that users never find invalid results.
To ensure that all business data is indexed, the first step is to find or construct a monotonically increasing progress value that describes how far business data updates have gone, such that updating this progress and updating the business data is atomic, and such that any progress interval lets us retrieve exactly the business data updated within it. The index can then be updated by following this progress.
In the business of wechat, the progress data of different businesses are different:
1) Chat records use the rowid of the message;
2) Favorites use the updateSequence synchronized from the server;
3) Contacts have no naturally increasing progress value, so we mark the WeChat IDs of newly added or updated contacts in the contact database and use them as the index update progress.
The progress data is then used as follows:
No matter whether the business data is saved successfully, whether the update notification reaches the full-text search module, or whether the index data is saved successfully, this update logic guarantees that business data that was saved successfully will eventually be indexed.
One key point is that the data and the progress are updated in the same transaction and stored in the same database, which guarantees the atomicity of data and progress updates (databases created by WCDB use WAL mode and therefore cannot guarantee atomicity of transactions across different databases).
One operation is not drawn in the diagram: if, when WeChat starts, the business progress is found to be smaller than the index progress, it usually means the business data was damaged and reset; in that case the index should be deleted and the index progress reset.
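A simplified sketch of this progress-driven loop (table access is reduced to placeholder helpers; the essential points are that the business side commits data and progress atomically, and the index side advances its own progress in the same transaction as the index writes):

```swift
import Foundation
import SQLite3

/// Sketch of progress-driven index updating. `businessDB` stores the data and a
/// monotonically increasing progress value in the same database; `ftsDB` stores the
/// FTS index together with the progress it has already consumed.
final class IndexUpdater {
    let businessDB: OpaquePointer
    let ftsDB: OpaquePointer
    init(businessDB: OpaquePointer, ftsDB: OpaquePointer) {
        self.businessDB = businessDB
        self.ftsDB = ftsDB
    }

    func updateIndex() {
        let indexedUpTo = readIndexProgress()          // progress already in the FTS index
        let latest = readBusinessProgress()            // progress written by the business
        guard latest > indexedUpTo else {
            if latest < indexedUpTo { resetIndex() }   // business data was reset: rebuild
            return
        }
        let rows = readBusinessRows(after: indexedUpTo, upTo: latest)
        // Insert the batch and advance the index progress in ONE transaction on ftsDB,
        // so a crash can only leave the index behind (never ahead of) the progress.
        execute(ftsDB, "BEGIN;")
        for row in rows { insertIntoFTS(row) }
        writeIndexProgress(latest)
        execute(ftsDB, "COMMIT;")
    }

    // The helpers below stand in for real SQL; omitted for brevity.
    func readIndexProgress() -> Int64 { 0 }
    func readBusinessProgress() -> Int64 { 0 }
    func readBusinessRows(after: Int64, upTo: Int64) -> [(id: Int64, text: String)] { [] }
    func insertIntoFTS(_ row: (id: Int64, text: String)) {}
    func writeIndexProgress(_ p: Int64) {}
    func resetIndex() {}
    func execute(_ db: OpaquePointer, _ sql: String) { sqlite3_exec(db, sql, nil, nil, nil) }
}
```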
Making every index entry correspond to valid business data requires that the index be deleted after the business data is deleted. But business data deletion and index deletion are asynchronous, so an index entry may remain after its business data has already been deleted.
This situation leads to two problems:
1) First, redundant index entries slow down searching, but the probability of this is very small and the impact can be ignored;
2) Second, users may find invalid data in their search results, which must be avoided.
For point 2, completely deleting every invalid index entry is expensive, so we solve it with lazy checking: when a search result is about to be displayed to the user, we check whether the corresponding business data is still valid; if not, the result is not displayed and the corresponding index entry is deleted asynchronously. Because the user can only see a small amount of data on one screen, the performance cost of this check is negligible. Moreover, the check is not extra work: for flexibility of result display we already read the business data when displaying results, and that is when the validity check happens.
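A sketch of the lazy check at display time, under the assumption that only the results about to appear on screen are passed in (names are placeholders):

```swift
import Foundation

struct SearchHit { let businessId: Int64 }
struct BusinessItem { let businessId: Int64; let title: String }

/// Sketch: validate results lazily while building the visible list; invalid entries are
/// dropped from display and their index rows deleted on a background queue.
func visibleResults(from hits: [SearchHit],
                    loadBusinessData: (Int64) -> BusinessItem?,
                    deleteIndexAsync: (Int64) -> Void) -> [BusinessItem] {
    var shown: [BusinessItem] = []
    for hit in hits {
        if let item = loadBusinessData(hit.businessId) {
            shown.append(item)               // valid: the data we read is also what we display
        } else {
            deleteIndexAsync(hit.businessId) // stale index entry: clean it up asynchronously
        }
    }
    return shown
}
```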
8.3 Optimization of index Creation Speed
The index is only used for searching, so its update priority is lower than that of the business data: we can let business data accumulate and then build indexes for it in batches.
Bulk indexing has three benefits:
1) It reduces the number of disk writes and improves the average index-building speed;
2) Within a single transaction the parsed SQL statements can be reused, reducing the number of SQL parses and improving the average index-building speed;
3) It reduces the number of segments generated, which reduces the read/write cost of merging segments.
Of course, we should not let too much business data pile up without indexing, otherwise the index may not be built in time and users will get incomplete search results.
With the automatic segment merge mechanism described above, the index write speed is very controllable; as long as the batch size is kept in check, batch index building will not cause worrying delays.
Considering the index-building speed of low-end devices and the launch time of the search page, we set the maximum batch size for index building to 100 items.
At the same time, we cache in memory the unindexed business data generated while WeChat is running, and in extreme cases we fall back to searching this in-memory data so that results stay complete even for data that has not been indexed yet. Caching the unindexed data left over from the previous run would require extra disk I/O, so instead we trigger index building for any existing unindexed business data right after WeChat starts.
To summarize, there are three opportunities to trigger index building:
1) The number of unindexed business data items reaches 100;
2) The user enters the search interface;
3) WeChat starts.
8.4 Index Deletion Optimization
Index deletion speed is often overlooked when designing index update mechanisms, because the amount of business data to be deleted is easily underestimated and can be mistaken for a low-probability scenario.
However, in practice users may delete up to 50% of their business data, so this is a major scenario that cannot be ignored. And because SQLite does not support parallel writes, slow index deletion indirectly drags down index write speed, introducing uncontrollable factors into index updating.
Because indexes are deleted by the ID of the business data being deleted, there are two ways to speed up index deletion:
1) Build an ordinary index that maps the business data ID to the rowid of the FTS index;
2) Remove the UNINDEXED constraint from the business-data-ID column of the FTS index table, giving the business data ID an inverted index.
Inverted indexes are not as efficient as normal indexes for two reasons:
1) An inverted index carries a lot of extra information compared with an ordinary index, so looking it up is less efficient;
2) If several business fields are needed to identify a row, an inverted index cannot serve as a composite index: it can only match one of the fields, and the remaining fields must be matched by traversal, which is very inefficient.
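A sketch of the first approach, assuming a separate ordinary mapping table maintained alongside the FTS table (hypothetical names):

```swift
// Sketch of approach 1 (hypothetical names): keep an ordinary table that maps the
// business data id to the FTS rowid, with a normal B-tree index on the business id,
// so deletions never need to scan the FTS table.
let deletionSupportSQL = """
CREATE TABLE chatMsgFtsMap(
    businessId INTEGER,
    ftsRowid   INTEGER
);
CREATE INDEX chatMsgFtsMap_businessId ON chatMsgFtsMap(businessId);

-- deleting the index rows of one business item:
-- DELETE FROM chatMsgFts WHERE rowid IN
--        (SELECT ftsRowid FROM chatMsgFtsMap WHERE businessId = ?);
-- DELETE FROM chatMsgFtsMap WHERE businessId = ?;
"""
```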
8.5 Comparison of effects before and after optimization
The index performance data of chat records before and after optimization is as follows:
The index performance data of Favorites before and after optimization is as follows:
9. Application layer optimization 3: search logic optimization
9.1 The problem
When a user searches for content on the home page of WeChat for iOS, the interaction logic is as follows:
As shown in the figure above: After the user changes the contents of the search box, all search tasks are launched in parallel. After each search task is executed, the search results are returned to the main thread for display on the page. This logic continues to repeat as the user changes the search content.
9.2 A single search task should support parallel execution
Although different search tasks already run in parallel with each other, the data volume and search logic of different businesses vary greatly; a task with a large data volume or complex search logic still takes a long time, so the phone's parallel processing capability is not fully used.
We can also introduce parallel processing capabilities into a single search task in two ways:
1) For businesses with large data volumes (such as chat-record search): the index data can be sharded across multiple FTS index tables (taking care to split the data evenly so that no single table becomes a "short board" that drags the others down), so the search can query the tables in parallel and then merge and sort the per-table results (a sketch of this sharded search follows after this list). The number of shards should be neither too large nor too small: too many exceeds the phone's actual parallel processing capability and hurts other search tasks, while too few fails to exploit the parallelism. WeChat previously used ten FTS tables to store the chat-record index and now uses four.
2) For businesses with complex search logic (such as contact search): the independent parts of the search logic can be executed in parallel (for example, in the contact search task we run the searches over plain text, pinyin, labels, regions and group-chat members in parallel, then merge and sort the results). Why not shard the table here as well? Because in a scenario with few results, the search time is dominated by looking up the index, which is essentially a b-tree search; splitting one b-tree into several does not reduce the lookup time proportionally.
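A sketch of the sharded parallel search under these assumptions, with four placeholder shard tables and a caller-supplied per-table search function:

```swift
import Foundation

struct ChatHit { let businessId: Int64; let createTime: Int64 }

/// Sketch: search every shard table concurrently, then merge and sort once at the end.
func searchChatMessages(_ matchExpr: String,
                        shardTables: [String] = ["msgFts0", "msgFts1", "msgFts2", "msgFts3"],
                        searchOneTable: (String, String) -> [ChatHit]) -> [ChatHit] {
    var perShard = [[ChatHit]](repeating: [], count: shardTables.count)
    let lock = NSLock()
    DispatchQueue.concurrentPerform(iterations: shardTables.count) { i in
        let hits = searchOneTable(shardTables[i], matchExpr)   // one shard per worker thread
        lock.lock(); perShard[i] = hits; lock.unlock()
    }
    // Unified ordering after all shards have returned, e.g. newest first.
    return perShard.flatMap { $0 }.sorted { $0.createTime > $1.createTime }
}
```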
9.3 Search tasks should support interruption
While continuously typing in the search box, a user may trigger several search tasks in quick succession. If a new search task starts before the previous one finishes, the tasks degrade each other's performance.
This is especially likely when the user's input grows from short to long, because short search text produces more hits and therefore takes longer, making it more likely to overlap with subsequent search tasks. Running too many tasks at once can also make the phone heat up and exhaust its memory.
Therefore search tasks must support interruption at any time, so that launching a new search task can interrupt the previous one and avoid a pile-up of tasks.
Interruption is implemented by giving each search task a CancelFlag: while the search logic runs, it periodically checks whether the CancelFlag has been set and exits immediately if so, and external logic interrupts the task simply by setting its CancelFlag.
The logical process is shown in the figure below:
For a search task to be interrupted promptly, the CancelFlag checks need to be spread as evenly as possible through the search, which we achieve by not using an ORDER BY clause to sort results in the query. Because FTS5 does not support composite indexes, with ORDER BY SQLite collects and sorts all matching rows before returning the first one, so producing the first result takes almost as long as producing all of them, and the interruption logic becomes meaningless.
Not using ORDER BY places two constraints on the search logic:
1) Sort only after all results have been read from the database: we read the fields used for sorting together with each result, then sort everything once reading is complete. Since sorting takes only a small fraction of the total search time and the sorting algorithm performs the same either way, the effect on search speed is negligible.
2) Paged (segmented) queries cannot be used: in the full-text search scenario they are useless anyway, because paging requires sorted results and sorting requires traversing all results, so paging does not reduce the search time (unless pages are cut by the rowid of the FTS index, which carries no business meaning). A sketch that follows both rules is shown below.
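A sketch of a search loop that follows both rules: no ORDER BY in the SQL, a CancelFlag check on every row, only the business ID and sort key read during the scan, and an in-memory sort at the end (the table and column names are placeholders):

```swift
import Foundation
import SQLite3

let SQLITE_TRANSIENT = unsafeBitCast(-1, to: sqlite3_destructor_type.self)

struct ResultRow { let businessId: Int64; let createTime: Int64 }

/// Sketch: step through MATCH results without ORDER BY, checking a cancel flag per row,
/// then sort the collected rows in memory at the end.
func runSearch(db: OpaquePointer, matchExpr: String, isCancelled: () -> Bool) -> [ResultRow]? {
    let sql = "SELECT businessId, createTime FROM chatMsgFts WHERE chatMsgFts MATCH ?;"
    var stmt: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &stmt, nil) == SQLITE_OK else { return nil }
    defer { sqlite3_finalize(stmt) }
    sqlite3_bind_text(stmt, 1, matchExpr, -1, SQLITE_TRANSIENT)

    var rows: [ResultRow] = []
    while sqlite3_step(stmt) == SQLITE_ROW {
        if isCancelled() { return nil }              // CancelFlag set: abandon this task
        rows.append(ResultRow(businessId: sqlite3_column_int64(stmt, 0),
                              createTime: sqlite3_column_int64(stmt, 1)))
    }
    // Sorting is cheap relative to the scan, and the first row is never delayed by it.
    return rows.sorted { $0.createTime > $1.createTime }
}
```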
9.4 Search and read contents should be minimized
The amount of content read during a search is also a key factor in determining the duration of the search.
The FTS index table is actually composed of several SQLite common tables, some of which store the actual inverted index content, and one table stores all the original text saved by the user to the FTS index table. When reading something other than Rowid while searching, you need to use Rowid to read the contents of the table that holds the original text.
The internal execution of the index table output is as follows:
The less you read, the faster the output, and the more you read, the more memory you consume.
Our approach is to read only the business data ID and the attributes used for sorting during the search; after sorting, when results need to be displayed, we read the actual business content on demand by business data ID. This also extends well: what the search results display can be adjusted to each business's needs without changing what is stored.
One more thing to note: avoid reading highlight information during the search (SQLite's highlight function can provide it). Obtaining the highlighted field requires not only reading the original text but also tokenizing it again to locate the hit positions; when there are many results the cost of this tokenization is very noticeable.
So how do you get highlighted matches when displaying search results? The approach we adopted was to segment the user’s search text, then find the position of each Token in the display text when displaying the results, and then highlight that position (again, since the number of results the user sees on a screen is very small, the performance cost of the highlighting logic here is negligible).
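A sketch of this display-time highlighting, simplified to whitespace-separated, lowercased tokens (the real logic would reuse VerbatimTokenizer's segmentation):

```swift
import Foundation

/// Sketch: find each search token in the text being displayed and return the ranges
/// to highlight, instead of asking SQLite's highlight() for them during the search.
func highlightRanges(in displayText: String, searchText: String) -> [Range<String.Index>] {
    // Simplified tokenization; assumes lowercasing preserves string indices,
    // which holds for typical content.
    let tokens = searchText.lowercased().split(separator: " ").map(String.init)
    let haystack = displayText.lowercased()
    var ranges: [Range<String.Index>] = []
    for token in tokens where !token.isEmpty {
        var searchFrom = haystack.startIndex
        while let r = haystack.range(of: token, range: searchFrom..<haystack.endIndex) {
            ranges.append(r)                 // the same indices are applied to displayText
            searchFrom = r.upperBound
        }
    }
    return ranges
}

// The UI layer can then apply an attributed-string highlight to each returned range.
```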
Of course, when the search rules are complex it can be more convenient to read the highlight information directly during the search (for example, contact search uses the SubstringMatchInfo function mentioned earlier to read the highlighted content). There the main purpose is to obtain the level and position of the match for sorting, so re-tokenizing each result one by one is unavoidable anyway.
9.5 Comparison of effects before and after optimization
The following compares the search time of WeChat's search businesses before and after optimization:
10. Summary of this paper
At present, WeChat for iOS has fully applied this new full-text search scheme to the chat record, contact and Favorites search businesses.
With the new scheme, the full-text search index file takes up less space, index updates take less time and searches are faster; full-text search performance has improved across the board.
Open source IM framework source: github.com/JackJiang20…
(This article is simultaneously published at: www.52im.net/thread-3839…)