Overview

The reverse method of the StringBuilder class reverses the character sequence held in the StringBuilder. For example, the string "abcde" is reversed to "edcba".
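
As a quick check of the behavior described above, here is a minimal sketch. The class name ReverseBasics and the helper reversed are made up for illustration; only StringBuilder and its reverse method come from the JDK.

```java
class ReverseBasics {
    // Hypothetical helper: reverses the given text via StringBuilder.reverse().
    static String reversed(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abcde");
        // reverse() mutates the builder in place and returns the same instance,
        // which is why calls can be chained.
        StringBuilder result = sb.reverse();
        System.out.println(result);        // edcba
        System.out.println(result == sb);  // true
    }
}
```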

The source code

java.lang.StringBuilder#reverse:

    @Override
    public StringBuilder reverse() {
        super.reverse();
        return this;
    }

As we can see, this simply calls the reverse method of the parent class, AbstractStringBuilder.

java.lang.AbstractStringBuilder#reverse:

    public AbstractStringBuilder reverse() {
        boolean hasSurrogates = false;
        int n = count - 1;
        for (int j = (n-1) >> 1; j >= 0; j--) {
            int k = n - j;
            char cj = value[j];
            char ck = value[k];
            value[j] = ck;
            value[k] = cj;
            if (Character.isSurrogate(cj) ||
                Character.isSurrogate(ck)) {
                hasSurrogates = true;
            }
        }
        if (hasSurrogates) {
            reverseAllValidSurrogatePairs();
        }
        return this;
    }

Let's walk through the method step by step:

1. Initialize variables

Declare the boolean variable hasSurrogates, which records whether the string contains any surrogate chars (the halves of supplementary character pairs). Set n, the right boundary of the array, to count - 1, where count is a member variable of the class that holds the string length.

2. Iterate and swap elements

The loop starts at j = (n-1) >> 1, that is, at index (n-1)/2, and walks down to 0. For example, if n = 8, the string has 9 characters, and the fifth character (index 4) sits in the middle and does not need to be swapped. Starting at (n-1) >> 1 = 3 skips it, whereas starting at n/2 = 4 would add one wasted iteration that swaps the fifth character with itself. In each iteration, the elements at j and n-j are swapped, and hasSurrogates is set to true if either of them is a surrogate.
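
To make the effect of the start index concrete, the sketch below copies the swap loop onto a plain char array and counts iterations. MidpointDemo, reverseInPlace, and the startAtHalf flag are hypothetical names introduced for this comparison; the n / 2 variant is not something the JDK does, it is only here to show the extra iteration.

```java
class MidpointDemo {
    // Reverses the array in place using the same index arithmetic as the JDK loop
    // and returns how many loop iterations were executed.
    // startAtHalf = true switches to the hypothetical n / 2 start for comparison.
    static int reverseInPlace(char[] value, boolean startAtHalf) {
        int n = value.length - 1;
        int start = startAtHalf ? n / 2 : (n - 1) >> 1;
        int iterations = 0;
        for (int j = start; j >= 0; j--) {
            int k = n - j;          // mirror position of j
            char cj = value[j];
            value[j] = value[k];
            value[k] = cj;
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        char[] a = "abcdefghi".toCharArray();   // 9 chars, so n = 8
        System.out.println(reverseInPlace(a, false)); // 4
        System.out.println(new String(a));            // ihgfedcba

        char[] b = "abcdefghi".toCharArray();
        System.out.println(reverseInPlace(b, true));  // 5 (index 4 swapped with itself)
        System.out.println(new String(b));            // ihgfedcba
    }
}
```

Both variants produce the same result for this input; the (n-1) >> 1 start simply does one less swap when the length is odd.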

3. Processing supplementary character pairs

If hasSurrogates is true, the string may contain surrogate pairs, which need special handling after the swap.

4. Return the result

Supplementary characters and surrogate pairs

In Java, a char occupies 16 bits, so a single char can represent 2^16 = 65,536 distinct values. Unicode, however, defines more than 65,536 characters. To cover them, two consecutive chars are combined to represent one character. Such a character is called a supplementary character, and the pair of chars that encodes it is called a surrogate pair.

So, given a char, how do you tell whether it stands for a character on its own or is one half of a surrogate pair? The standard reserves U+D800 to U+DBFF (1,024 values) as high surrogates and U+DC00 to U+DFFF (1,024 values) as low surrogates. A high surrogate immediately followed by a low surrogate forms a surrogate pair.
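
The ranges above can be checked with the Character API. This is a small sketch; the class name SurrogateDemo and the choice of U+1F600 as the sample code point are my own, while Character.toChars, isHighSurrogate, isLowSurrogate, and String.codePointCount are standard JDK methods.

```java
class SurrogateDemo {
    // U+1F600 lies above U+FFFF, so it cannot fit in a single 16-bit char.
    static final int CODE_POINT = 0x1F600;

    // Returns the UTF-16 code units (chars) that encode CODE_POINT.
    static char[] units() {
        return Character.toChars(CODE_POINT);
    }

    public static void main(String[] args) {
        char[] units = units();
        System.out.println(units.length);                        // 2
        System.out.println(Character.isHighSurrogate(units[0])); // true
        System.out.println(Character.isLowSurrogate(units[1]));  // true

        String s = new String(units);
        System.out.println(s.length());                          // 2 (chars)
        System.out.println(s.codePointCount(0, s.length()));     // 1 (character)
    }
}
```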

During reversal, the char-by-char swap also flips the two halves of every surrogate pair: a high-low pair becomes low-high, which can no longer be decoded. So if the string contains surrogates, a second pass is needed after the initial reversal to swap each low-high pair back into the correct order. The code is as follows:

java.lang.AbstractStringBuilder#reverseAllValidSurrogatePairs

    private void reverseAllValidSurrogatePairs() {
        for (int i = 0; i < count - 1; i++) {
            char c2 = value[i];
            if (Character.isLowSurrogate(c2)) {
                char c1 = value[i + 1];
                if (Character.isHighSurrogate(c1)) {
                    value[i++] = c1;
                    value[i] = c2;
                }
            }
        }
    }
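
To see why the second pass matters, the sketch below compares a naive char-by-char reversal with StringBuilder.reverse on a string that contains one surrogate pair. SurrogateReverseDemo and naiveReverse are hypothetical names written for this illustration; they are not part of the JDK.

```java
class SurrogateReverseDemo {
    // Hypothetical naive reversal: swaps chars only, with no surrogate repair pass.
    static String naiveReverse(String s) {
        char[] v = s.toCharArray();
        for (int i = 0, j = v.length - 1; i < j; i++, j--) {
            char t = v[i];
            v[i] = v[j];
            v[j] = t;
        }
        return new String(v);
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 encoded as the pair D83D DE00, then 'b'.
        String input = "a\uD83D\uDE00b";

        // The naive version leaves the pair in low-high order, which is invalid.
        String naive = naiveReverse(input);
        System.out.println(naive.equals("b\uDE00\uD83Da"));   // true

        // StringBuilder.reverse() keeps the pair in high-low order.
        String fixed = new StringBuilder(input).reverse().toString();
        System.out.println(fixed.equals("b\uD83D\uDE00a"));   // true
        System.out.println(fixed.codePointAt(1) == 0x1F600);  // true
    }
}
```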

Conclusion

At this point, we have read through the full source code of StringBuilder's reverse method, which internally calls the reverse method of its parent class. It is worth mentioning that StringBuffer's reverse method calls the same parent method, so the logic is reused. In the reversal loop, note that the start index is (n-1)>>1, which avoids a wasted iteration on the middle element. We also learned how surrogate pairs are designed and why, when they are present, an additional loop is needed to repair the pairs that the reversal left in the wrong order.

Writing original content takes effort; I hope you got something out of reading this.

If you repost this article, please credit the source.