Preface

As a Java developer, the type I use most is almost certainly String. I believe many people know the String API well but have never looked at how it is implemented. Personally, I think that when you start using an API, reading its official documentation is enough. As you gain development experience, though, you should try reading the source code as well, which is crucial for improving your skills. Once you understand the source, using the API becomes much easier!

Note: the following notes are based on JDK 8.

String is just a class

String is really just a class, and we can roughly analyze it from the following angles:

  1. Class inheritance relationships
  2. Class member variables
  3. Class constructors
  4. Class member methods
  5. Dependent static methods

Inheritance relationships

The UML class diagram of String, exported with an IDEA plugin, is as follows:

String implements three interfaces: Serializable, Comparable, and CharSequence.

  • Serializable: A class that implements this interface can be serialized. The interface declares no methods; it only serves as a marker.
  • Comparable: A class that implements this interface defines an ordering for its instances. Lists (and arrays) of such objects can be sorted automatically by the static sort method of the Collections class (and of the Arrays class for arrays).
  • CharSequence: A unified interface for character sequences. It provides common, mostly read-only operations on a sequence of characters. Many character-related classes implement it, such as String, StringBuffer, and so on.

The String class is defined as follows:

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence { ... }

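The final modifier means that the String class cannot be inherited. This supports immutability, because no subclass can override its behavior, but final on the class is not what makes String immutable by itself; that comes from how its fields are declared and used.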
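
As a quick illustration of what two of these interfaces give you in practice, the sketch below sorts strings through Comparable and passes both a String and a StringBuilder to a method that only knows about CharSequence. The class StringInterfacesDemo and its helper methods are made up for this example and are not part of the JDK; Serializable is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, written only for this illustration.
class StringInterfacesDemo {

    // Accepts any CharSequence (String, StringBuilder, StringBuffer, ...)
    // and counts vowels using only the read-only methods length() and charAt().
    static int countVowels(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if ("aeiouAEIOU".indexOf(text.charAt(i)) >= 0) {
                count++;
            }
        }
        return count;
    }

    // Sorts a copy of the input using String's natural order (Comparable).
    static List<String> sortedCopy(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy); // relies on String.compareTo
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedCopy(Arrays.asList("pear", "apple", "fig"))); // [apple, fig, pear]
        System.out.println(countVowels("source"));                  // 3
        System.out.println(countVowels(new StringBuilder("code"))); // 2
    }
}
```

Because countVowels is declared against CharSequence, it works unchanged for every class that implements the interface.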

Class member variables

Here we mainly introduce the most critical member variable value[], which is defined as follows:

    /** The value is used for character storage. */
    private final char value[];

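A string is a sequence of char values, so a String is really a wrapper around a character array, which is held in value[]. Note that value[] is final: once it has been assigned in a constructor, the field can never point to a different array. The final keyword alone does not protect the contents of the array. String stays immutable because the array is also private and none of its methods modifies the array or hands it out.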
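
The difference between a final reference and immutable contents can be seen in a small sketch. FinalArrayDemo and its method names are invented for this illustration; the only real API used is the String(char[]) constructor.

```java
// Hypothetical demo class, written only for this illustration.
class FinalArrayDemo {

    // A final array variable fixes the reference, not the contents.
    static String mutateFinalArray() {
        final char[] letters = {'a', 'b', 'c'};
        letters[0] = 'x';          // legal: an element is overwritten
        // letters = new char[3];  // would not compile: the reference is final
        return new String(letters);
    }

    // A String built from a char[] keeps its own copy of the data.
    static String stringKeepsItsOwnCopy() {
        char[] source = {'j', 'd', 'k'};
        String s = new String(source); // the constructor copies the array
        source[0] = 'J';               // changing the source afterwards...
        return s;                      // ...does not change the String
    }

    public static void main(String[] args) {
        System.out.println(mutateFinalArray());      // xbc
        System.out.println(stringKeepsItsOwnCopy()); // jdk
    }
}
```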

Class constructors

There are a number of overloaded String constructors, as described below:

  1. The no-argument constructor initializes a String instance that represents the empty string. In theory you never need this constructor, because the literal "" already gives you an empty String object. The constructor simply reuses the value[] of that empty literal. The source code is as follows:
    public String() {
        this.value = "".value;
    }
  2. Constructs a String object from another String. It assigns the argument's value and hash to the new instance. Note that this copies only the reference to the character array, not the array itself, so this is a shallow copy and both objects share one array. That is safe because the array is never modified. The source code is as follows:
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
  3. Builds a new String object from a character array. It uses the Arrays.copyOf method to copy the array, so the String does not share storage with the caller's array:
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
  4. Builds a new String object from part of the source character array, given an offset (the starting position) and a count (the number of characters).
    public String(char value[], int offset, int count) {
        // If offset is negative, throw an out-of-bounds exception
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            // If count is negative, throw an out-of-bounds exception
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            // count is 0: if offset is within the array, the result is the empty string
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        // If offset is greater than the array length minus count, throw an out-of-bounds exception
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        // Arrays.copyOfRange copies the chars from offset (inclusive) to offset + count (exclusive)
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
  5. Builds a new String object from part of a source int array, given an offset (the starting position) and a count (the number of elements). Each int in the array is a Unicode code point (not just an ASCII value), so one element may turn into two chars.
    public String(int[] codePoints, int offset, int count) {
        // If offset is negative, throw an out-of-bounds exception
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            // If count is negative, throw an out-of-bounds exception
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            // count is 0: if offset is within the array, the result is the empty string
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        // If offset is greater than the array length minus count, throw an out-of-bounds exception
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: compute the exact size n of the char array. A BMP code point
        // needs one char, a valid supplementary code point needs two, and
        // anything else is rejected as invalid.
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Allocate the array with the size calculated in the previous step
        final char[] v = new char[n];

        // Pass 2: fill the char array
        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }
        // Assign the array to the current instance
        this.value = v;
    }
  6. Initializes a String instance by decoding length bytes of the source byte array, starting at offset, with the character encoding named by charsetName.
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        // If the charset name is null, throw a NullPointerException
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        // Static helper that checks offset and length against the bounds of the byte array
        checkBounds(bytes, offset, length);
        // StringCoding.decode decodes length bytes, starting at offset, into a char array
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }
  7. Similar to the sixth constructor, except that this overload takes the encoding as a Charset object.
    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charset, bytes, offset, length);
    }
  8. Constructs a String from the whole source byte array with the named character encoding. It delegates to the sixth constructor, starting at position 0 and taking the full length of the byte array.
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }
  9. Constructs a String from the whole source byte array with the given Charset. It delegates to the seventh constructor, starting at position 0 and taking the full length of the byte array.
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }
  10. Initializes a String instance by decoding length bytes of the source byte array, starting at offset. Unlike the sixth constructor, it uses the system default character encoding.
    public String(byte bytes[], int offset, int length) {
        // Check whether offset and length are out of bounds
        checkBounds(bytes, offset, length);
        // Decode the bytes into a char array using the system default character encoding
        this.value = StringCoding.decode(bytes, offset, length);
    }
  11. Constructs a String from the whole source byte array with the system default encoding. It delegates to the 10th constructor, starting at position 0 and taking the full length of the byte array.
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }
  12. Builds a new String from a StringBuffer. What is special about this constructor is its synchronized block: it locks the buffer so that only one thread at a time can work on it while the String is being built, which makes the copy thread-safe.
    public String(StringBuffer buffer) {
        // Lock the StringBuffer that was passed in
        synchronized(buffer) {
            // Copy the StringBuffer's character array into the character array of the current instance
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }
  13. Builds a new String from a StringBuilder. Unlike the 12th constructor, this one takes no lock, so it is not thread-safe.
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
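
To make a few of these constructors more concrete, here is a minimal sketch that calls them from ordinary user code. StringConstructorsDemo and its method names are hypothetical; the sample values (the word "source", U+1D56B, and the UTF-8 bytes of U+4E2D) were chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, written only for this illustration.
class StringConstructorsDemo {

    // char[] + offset + count: takes count chars starting at offset.
    static String fromCharRange() {
        char[] chars = {'s', 'o', 'u', 'r', 'c', 'e'};
        return new String(chars, 1, 3); // chars at index 1, 2, 3
    }

    // int[] of code points: a supplementary code point becomes two chars.
    static String fromCodePoints() {
        int[] codePoints = {0x41, 0x1D56B}; // 'A' and U+1D56B
        return new String(codePoints, 0, codePoints.length);
    }

    // byte[] + explicit Charset: avoids depending on the platform default.
    static String fromUtf8Bytes() {
        byte[] utf8 = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD}; // U+4E2D in UTF-8
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // An offset/count pair that runs past the end of the array is rejected.
    static boolean rejectsBadRange() {
        try {
            new String(new char[] {'a', 'b'}, 1, 5);
            return false;
        } catch (StringIndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromCharRange());           // our
        System.out.println(fromCodePoints().length()); // 3
        System.out.println(fromUtf8Bytes().length());  // 1
        System.out.println(rejectsBadRange());         // true
    }
}
```

Passing an explicit Charset, as in fromUtf8Bytes, is usually the safer choice, since the constructors that rely on the system default encoding can behave differently from one machine to another.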

Class member methods

  • Gets the length of the string, which is simply the length of the character array

    public int length() {
        return value.length;
    }
  • Checks whether the string is empty, which only requires checking whether the length of the character array is zero

    public boolean isEmpty() {
        return value.length == 0;
    }
  • Gets the char at the given index

    public char charAt(int index) {
        // If index is negative, or not less than the array length, throw an out-of-bounds exception
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        // Return the char at the index position
        return value[index];
    }
  • Gets the Unicode code point (an int) of the character at the given index

    public int codePointAt(int index) {
        // If index is negative, or not less than the array length, throw an out-of-bounds exception
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        // Return the code point (int) at the index position
        return Character.codePointAtImpl(value, index, value.length);
    }
  • Returns the Unicode code point (an int) of the character just before the index position.

    public int codePointBefore(int index) {
        // Index of the char just before index
        int i = index - 1;
        // Check whether that index is out of bounds
        if ((i < 0) || (i >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBeforeImpl(value, index, 0);
    }
  • The codePointCount() method returns the number of Unicode code points in the given range, that is, the actual number of characters, which makes it similar to length(). For an ordinary string the two agree, since both give the number of characters. They differ when the string contains supplementary characters (code points above U+FFFF), each of which is stored as two chars. For example, for String str = "\uD835\uDD6B" (the mathematical double-struck 'z', U+1D56B), length() = 2 and codePointCount(0, str.length()) = 1

    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
    }
  • Returns the index within the string that is codePointOffset code points away from the given index. For example, with codePointOffset = 5 the result is the index reached after moving forward five code points from index.

    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > value.length) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, value.length,
                index, codePointOffset);
    }
  • This method is meant for internal use only. It has no access modifier, so strictly speaking it is package-private rather than private, and only classes in the same package can call it. Parameters: dst[] is the destination array, and dstBegin is the offset in the destination array at which copying starts. It copies the entire value of the String into the dst character array

    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }
  • Copies characters of the string into a target character array. Parameters: srcBegin is the index of the first character to copy; srcEnd is the index after the last character to copy (the copied range does not include srcEnd); dst[] is the target character array; dstBegin is the offset in the target array. The copied characters overwrite the target array starting at dstBegin.

    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        // The start of the range must not be negative
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        // The end of the range must not go past the character array
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        // The range must not be reversed
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }
  • Gets the byte array of the string, encoding the string with the character encoding named by charsetName

    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }
  • Gets the byte array of the string, encoding the string with the given Charset object

    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }
  • Gets the byte array of the string, encoding the string with the system default character encoding

    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }
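
The code point methods are easiest to understand on a string that mixes ordinary characters with one supplementary character. The sketch below is only an illustration: CodePointDemo and the sample text are made up, and the expected values in the comments follow from the char layout described at the top of the class.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, written only for this illustration.
class CodePointDemo {

    // Layout: index 0 = 'a', index 1-2 = surrogate pair for U+1D56B, index 3 = 'b'.
    static final String TEXT = "a\uD835\uDD6Bb";

    public static void main(String[] args) {
        // length() counts chars, codePointCount() counts code points.
        System.out.println(TEXT.length());                          // 4
        System.out.println(TEXT.codePointCount(0, TEXT.length()));  // 3

        // Both calls see the whole surrogate pair as one code point.
        System.out.println(Integer.toHexString(TEXT.codePointAt(1)));     // 1d56b
        System.out.println(Integer.toHexString(TEXT.codePointBefore(3))); // 1d56b

        // Moving two code points forward from index 0 lands on index 3.
        System.out.println(TEXT.offsetByCodePoints(0, 2));          // 3

        // getBytes encodes chars to bytes: 1 + 4 + 1 bytes in UTF-8.
        System.out.println(TEXT.getBytes(StandardCharsets.UTF_8).length); // 6
    }
}
```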

Simple summary

  • String is marked final, so it cannot be inherited, and it is designed as an immutable class
  • String implements the Serializable interface, so it can be serialized
  • String implements the Comparable interface, so strings can be compared for ordering
  • String implements the CharSequence interface, which represents an ordered sequence of characters and provides the general character-sequence methods
  • String is a sequence of characters, and its internal data structure is really a character array around which all operations are performed.
  • String makes frequent use of the arraycopy method of the System class to copy character arrays
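
The last point can be observed from the outside through the public getChars method, which, as its source shows, is a thin wrapper around System.arraycopy. The following sketch compares the two; GetCharsDemo, its method names, and the dash-filled buffer are assumptions made for this illustration, and toCharArray() stands in for the private value[] field, which user code cannot reach.

```java
import java.util.Arrays;

// Hypothetical demo class, written only for this illustration.
class GetCharsDemo {

    // Copies text[srcBegin, srcEnd) into a dash-filled buffer at dstBegin
    // by calling String.getChars.
    static String copyWithGetChars(String text, int srcBegin, int srcEnd,
                                   int dstBegin, int bufferSize) {
        char[] buffer = new char[bufferSize];
        Arrays.fill(buffer, '-');
        text.getChars(srcBegin, srcEnd, buffer, dstBegin);
        return new String(buffer);
    }

    // Performs the same copy by hand with System.arraycopy on a copy of
    // the string's chars.
    static String copyWithArraycopy(String text, int srcBegin, int srcEnd,
                                    int dstBegin, int bufferSize) {
        char[] value = text.toCharArray();
        char[] buffer = new char[bufferSize];
        Arrays.fill(buffer, '-');
        System.arraycopy(value, srcBegin, buffer, dstBegin, srcEnd - srcBegin);
        return new String(buffer);
    }

    public static void main(String[] args) {
        System.out.println(copyWithGetChars("source", 1, 4, 2, 8));  // --our---
        System.out.println(copyWithArraycopy("source", 1, 4, 2, 8)); // --our---
    }
}
```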

Finally

For reasons of space, this first part of the String summary ends here, and the rest will be covered in a follow-up post. It will be pushed to the official account [Zhang Shaolin] as soon as possible. You are welcome to follow!