about

  • My blog: Louis Blog
  • SF column: Louis Front end In-depth class
  • JavaScript string all API full decryption

It takes 10 minutes to read the article.

As a basic bridge for communication, strings are supported by almost every programming language (C is a notable exception: it has no built-in string type, only character arrays). Most developers work with strings almost daily, and a language's built-in string facilities greatly improve their productivity. JavaScript automatically boxes string literals into String objects, so literals naturally inherit every method on String.prototype, which makes strings even simpler to use.

As of ES6, strings have 31 standard API methods. Some appear so often that you need to understand how they work; some are so similar that they need to be carefully distinguished; and some are inefficient and should be used sparingly. Let's start with the methods of the String constructor and work through them one by one.

String constructor methods

fromCharCode

The fromCharCode() method returns a string created from the specified sequence of Unicode values: you pass in the numeric values and get back the corresponding string.

Syntax: String.fromCharCode(num1, num2, …), where every argument is a number.

Here's a simple example that returns "ABC", "abc", "*", "+", "-", and "/":

String.fromCharCode(65, 66, 67); // "ABC"
String.fromCharCode(97, 98, 99); // "abc"
String.fromCharCode(42); // "*"
String.fromCharCode(43); // "+"
String.fromCharCode(45); // "-"
String.fromCharCode(47); // "/"

fromCharCode seems to do the job, but it has a limitation rooted in the language's design: JavaScript strings are sequences of 16-bit code units (UCS-2 style), and fromCharCode truncates each argument to 16 bits. A character that needs 4 bytes (a code point above 0xFFFF) therefore cannot be produced from a single number with this method. ES6 added the fromCodePoint method to make up for this, as described below.
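As a hedged illustration of this limitation: a 4-byte character can still be built with fromCharCode if you pass its two UTF-16 surrogate code units yourself. The helper name toSurrogatePair below is hypothetical and not part of any API; it is only a sketch of the arithmetic.

```javascript
// Hypothetical helper: split a code point above 0xFFFF into its
// UTF-16 surrogate pair (high unit, low unit).
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(119558);          // [0xD834, 0xDF06]
const viaUnits = String.fromCharCode(...pair); // "𝌆"
const truncated = String.fromCharCode(119558); // "팆": only the low 16 bits are kept
```

In practice fromCodePoint does this work for you, which is why it is the better choice for characters outside the 16-bit range.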

fromCodePoint(ES6)

The fromCodePoint() method was introduced in ECMAScript 2015 (ES6). Its purpose and syntax match those of fromCharCode; the main difference is added support for 4-byte characters.

// "𝌆" is a 4-byte character. Let's look at its numeric form first
"𝌆".codePointAt(); // 119558
// fromCharCode cannot handle it and returns a garbled character
String.fromCharCode(119558); // "팆"
// fromCodePoint handles it correctly
String.fromCodePoint(119558); // "𝌆"

Besides 4-byte support, fromCodePoint also specifies error handling: it throws RangeError: Invalid code point … for any argument that is not a valid code point. That means any value that is not an integer in the Unicode range of 0 to 1,114,111 (Unicode has 1,114,112 code points in total) throws an error.

String.fromCodePoint('abc'); // RangeError: Invalid code point NaN
String.fromCodePoint(Infinity); // RangeError: Invalid code point Infinity
String.fromCodePoint(1.23); // RangeError: Invalid code point 1.23

Note: To learn more about Unicode, I recommend this article: Text Encoding and Unicode. To use this method in older browsers, refer to Polyfill.

raw(ES6)

The raw() method was introduced in ECMAScript 2015 (ES6). It is a tag function for template strings, similar to the r string prefix in Python or the @ string prefix in C#, and is used to get the raw, unescaped text of a template string.

Syntax: String.raw(callSite, ...substitutions). callSite is the "call site object" of the template string, and ...substitutions are the values of the interpolated expressions. That is rather hard to follow, so the examples below explain it in plainer terms.

Here is the Python r string prefix:

# Prevent special characters from being escaped
print(r'a\nb\tc')  # still prints "a\nb\tc"
# Use it for regular expressions in Python
regExp = r'(?<=123)[a-z]+'

Here's how to use String.raw as a prefix:

// Prevent special characters from being escaped
String.raw`a\nb\tc`; // outputs "a\nb\tc"
// Interpolated expressions are supported
let name = "louis";
String.raw`Hello \n ${name}`; // "Hello \n louis"
// Interpolated expressions are also evaluated
String.raw`1 + 2 = ${1+2}, 2 * 3 = ${2*3}`; // "1 + 2 = 3, 2 * 3 = 6"

String.raw is rarely called as an ordinary function. Here is how that works:

// When the object's raw property is a string, the arguments from the second one onward are inserted after the characters at indexes 0, 1, 2, ..., n
String.raw({raw: 'abcd'}, 1, 2, 3); // "a1b2c3d"
// When the object's raw property is an array, the arguments from the second one onward are inserted after the elements at indexes 0, 1, 2, ..., n
String.raw({raw: ['a', 'b', 'c', 'd']}, 1, 2, 3); // "a1b2c3d"

How should we understand this index-by-index insertion in String.raw? MDN has the following description:

In most cases, String.raw() is used with template strings. The first syntax mentioned above is only rarely used, because the JavaScript engine will call this with proper arguments for you, just like with other tag functions.

In other words, String.raw called as a function behaves essentially like an ES6 tagged template. For example:

// A simple implementation of a tag function
function tag(){
  const array = arguments[0];
  return array.reduce((p, v, i) => p + (arguments[i] || '') + v);
}
// A simple tagged template
tag`Hello ${ 2 + 3 } world ${ 2 * 3 }`; // "Hello 5 world 6"
// which is equivalent to this call
tag(['Hello ', ' world ', ''], 5, 6); // "Hello 5 world 6"

So when String.raw is called as a function, it makes no difference whether the object's raw property is a string or an array: there is a slot after each of the elements at indexes 0, 1, 2, …, n, except the last one, and the substitutions fill those slots in order. In effect, it is equivalent to a tag function like this:

function tag(){
  const raw = arguments[0].raw;
  if(raw === undefined || raw === null) { // same as raw == undefined
    throw new TypeError('Cannot convert undefined or null to object');
  }
  const array = Array.from(raw); // accepts a string as well as an array
  return array.reduce((p, v, i) => p + (arguments[i] || '') + v);
}

In fact, when String.raw is called as a function, it throws a TypeError if the first argument is not an object in the expected format.

String.raw({123: 'abcd'}, 1, 2, 3); // TypeError: Cannot convert undefined or null to object

At the time of writing, only Chrome 41+ and Firefox 34+ implement this method.

String.prototype

As with any other object, all methods of a String instance come from String.prototype. The String.prototype property itself has the following attributes:

writable false
enumerable false
configurable false

As you can see, the String.prototype property is read-only: an attempt to reassign it is silently ignored, or throws a TypeError in strict mode.
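These attributes can be checked directly. The following is a minimal sketch using Object.getOwnPropertyDescriptor; the helper name tryReassign is made up for this illustration.

```javascript
// Inspect the attributes of the String.prototype property itself.
const descriptor = Object.getOwnPropertyDescriptor(String, 'prototype');
const { writable, enumerable, configurable } = descriptor;
// writable, enumerable, and configurable are all false

// Hypothetical helper: in strict mode, reassigning the property throws a TypeError.
function tryReassign() {
  'use strict';
  try {
    String.prototype = {};
    return 'no error';
  } catch (err) {
    return err.name;
  }
}
const reassignResult = tryReassign(); // "TypeError"
```

Note that this only concerns reassigning the property. Adding a method to the prototype object itself, as polyfills do, still works.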

Properties

String.prototype has two properties, as follows:

  • String.prototype.constructor points to the constructor, String()
  • String.prototype.length indicates the length of the string

Methods

String prototype methods fall into two groups: those unrelated to HTML and those related to HTML. Let's look at the first group. Keep in mind that however powerful a string method is, it can never change the original string: strings are immutable, and every method returns a new value.
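A short sketch of what immutability means in practice (the variable names are illustrative only):

```javascript
const original = 'hello';
const upper = original.toUpperCase(); // "HELLO", a new string
// original is still "hello"

// Writing to an index is ignored in sloppy mode and throws in strict mode;
// either way the string is unchanged.
const unchanged = original;
try {
  unchanged[0] = 'J';
} catch (err) {
  // TypeError in strict mode
}
```

To "modify" a string, you always assign the returned value to a variable.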

HTML-independent methods

The common methods are charAt, charCodeAt, concat, indexOf, lastIndexOf, localeCompare, match, replace, search, slice, split, substr, substring, toLocaleLowerCase, toLocaleUpperCase, toLowerCase, toString, toUpperCase, trim, and valueOf. ES6 adds codePointAt, includes (originally named contains), endsWith, normalize, repeat, and startsWith. There are also nonstandard methods such as quote, toSource, trimLeft, and trimRight.

We'll give an example of how each method is used. Unless otherwise noted, a method is supported by all current major browsers.

charAt

The charAt() method returns the character at the specified position in the string.

Syntax: str.charAt(index)

index is the position in the string (from 0 to length-1). If it falls outside this range, an empty string is returned.

console.log("Hello, World".charAt(8)); // "o", the character at index 8

charCodeAt

The charCodeAt() method returns the Unicode value (UTF-16 code unit) of the character at the specified index.

Syntax: str.charCodeAt(index)

index is an integer from 0 to length-1. If it is not a number, it defaults to 0; if it is less than 0 or not less than the length of the string, NaN is returned.

Unicode code points range from 0 to 1,114,111. The first 128 Unicode code points are the same as the ASCII character encoding.

charCodeAt() always returns a value less than 65,536. Because a code point above that range is represented by a pair of code units (a surrogate pair), you need to look at both charCodeAt(i) and charCodeAt(i+1) to recover the complete character. See fixedCharCodeAt for more information.
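Combining the two code units can be sketched as follows. The function name codePointFromUnits is hypothetical, and this is a simplified illustration rather than the fixedCharCodeAt implementation mentioned above.

```javascript
// Hypothetical helper: read the full code point that starts at index i,
// combining a surrogate pair when one is present.
function codePointFromUnits(str, i) {
  const high = str.charCodeAt(i);
  const low = str.charCodeAt(i + 1);
  const isPair =
    high >= 0xD800 && high <= 0xDBFF && low >= 0xDC00 && low <= 0xDFFF;
  if (!isPair) return high;
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

const units = ['𝌆'.charCodeAt(0), '𝌆'.charCodeAt(1)]; // [55348, 57094]
const full = codePointFromUnits('𝌆', 0);               // 119558
const ascii = codePointFromUnits('A', 0);               // 65
```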

console.log("Hello, World".charCodeAt(8)); // 111
console.log("前端工程师".charCodeAt(2)); // 24037, the Unicode value of the Chinese character "工"

concat

The concat() method concatenates one or more strings together to form a new string and returns it.

Syntax: str.concat(string2, string3…)

console.log("早".concat("上", "好")); // "早上好" (good morning)

However, concat performs poorly, and the + or += operators are strongly recommended instead. According to the referenced performance test, the + operator is tens of times faster than concat.

indexOf
lastIndexOf

The indexOf() method finds the first occurrence of a substring in the string and returns -1 if there is none. It is strictly case-sensitive and searches from left to right. lastIndexOf searches from right to left; everything else is the same.

Syntax: str.indexOf(searchValue [, fromIndex = 0]), str.lastIndexOf(searchValue [, fromIndex = +Infinity])

fromIndex is the position at which the search starts. For indexOf it defaults to 0. If it is less than 0, the entire string is searched. If it is greater than or equal to the length of the string, the method returns -1, unless the search string is empty, in which case the length of the string is returned.
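The fromIndex rules are easier to see with a few concrete calls. This is a small sketch; the sample string is arbitrary.

```javascript
const text = 'abcabc';

const fromNegative = text.indexOf('b', -5);  // 1: a negative fromIndex is treated as 0
const pastEnd = text.indexOf('a', 10);       // -1: nothing to find past the end
const emptyPastEnd = text.indexOf('', 10);   // 6: an empty search string returns the length
const lastBefore = text.lastIndexOf('b', 3); // 1: searches backward starting at index 3
```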

console.log("".indexOf("", 100)); // 0
console.log("IT改变世界".indexOf("世界")); // 4 ("IT changes the world", searching for "the world")
console.log("IT改变世界".lastIndexOf("世界")); // 4

localeCompare

The localeCompare() method compares two strings in the locale's sort order. It returns a negative number if the original string sorts before the specified string, a positive number if it sorts after, and 0 if the two are equivalent. The result depends on the implementation's locale data and may differ between locales.

Syntax: str.localeCompare(str2 [, locales [, options]])

var str = "apple";
var str2 = "orange";
console.log(str.localeCompare(str2)); // -1
console.log(str.localeCompare("123")); // 1

At the time of writing, Safari does not support it, but Chrome 24+, Firefox 29+, Internet Explorer 11+, and Opera 15+ all implement it.

match

The match() method matches a string against a regular expression. If the argument is not a regular expression object, it is implicitly converted to one with new RegExp(obj).

Syntax: str.match(regexp)

This method returns an array of matches, or null if there are no matches.

Description

  • If the regular expression does not have the g flag, the result is the same as regexp.exec(str). The returned array has an additional input property, which contains the original string, and an index property, which is the index (starting at 0) of the match in the original string.
  • If the regular expression has the g flag, the method returns an array of all matches, or null if no matches are found.

Related RegExp methods

  • To test whether the string matches the regular expression, see regexp.test(str).
  • If you only need the first match, see regexp.exec(str).

var str = "World Internet Conference";
console.log(str.match(/[a-d]/i)); // ["d", index: 4, input: "World Internet Conference"]
console.log(str.match(/[a-d]/gi)); // ["d", "C", "c"]
// RegExp methods
console.log(/[a-d]/gi.test(str)); // true
console.log(/[a-d]/gi.exec(str)); // ["d", index: 4, input: "World Internet Conference"]

regexp.test(str) returns true as soon as a single match is found. Called on a fresh regular expression, regexp.exec(str) returns the first match with or without the g flag, and its result is the same as str.match(regexp) without the g flag.
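One caveat is worth a sketch: a regular expression object with the g flag remembers where it stopped in its lastIndex property, so repeated exec calls on the same object walk through the matches instead of returning the first one each time.

```javascript
const sentence = 'World Internet Conference';
const globalRe = /[a-d]/gi;

// Each exec call on a global regex resumes from lastIndex.
const first = globalRe.exec(sentence);  // "d" at index 4
const second = globalRe.exec(sentence); // "C" at index 15
const third = globalRe.exec(sentence);  // "c" at index 23
const done = globalRe.exec(sentence);   // null, and lastIndex resets to 0
```

The same statefulness applies to test, which is why reusing one global regex across different strings can give surprising results.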

replace

This method has been covered before, see String.prototype.replace Advanced Skills for details.

search

The search() method tests whether a string contains a match for a regular expression. It is similar to the regular expression's test method and is faster than match(). search() returns the index of the first match in the string, or -1 if there is no match.

Compared with indexOf, the only difference is that search converts its argument into a regular expression by default, whereas indexOf does not; indexOf cannot handle regular expressions.
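The implicit conversion matters as soon as the argument contains a regular expression metacharacter. A minimal sketch:

```javascript
const dotted = 'a.b';

const viaSearch = dotted.search('.');   // 0: "." becomes the regex /./, which matches any character
const viaIndexOf = dotted.indexOf('.'); // 1: "." is matched literally
const escaped = dotted.search(/\./);    // 1: an escaped dot matches literally
```

When the thing you are looking for is plain text, indexOf avoids this kind of surprise.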

Syntax: str.search(regexp)

var str = "abcdefg";
console.log(str.search(/[d-g]/)); // 3, the first match is "d", whose index in the original string is 3

The search() method does not support global matching (the g flag in the regular expression is ignored), as follows:

console.log(str.search(/[d-g]/g)); // 3, the same as without g

slice

The slice() method extracts a portion of the string and returns it as a new string. It is somewhat similar to the Array.prototype.slice method.

Syntax: str.slice(start, end)

The end parameter is optional. start can be either positive or negative.

If start is positive, the extraction runs from the start index up to, but not including, the end index (or to the end of the string if end is omitted).

If start is negative, the extraction begins at index length+start and runs up to the end index.
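A few short calls make the index rules concrete. This is only a sketch with an arbitrary sample string.

```javascript
const word = 'abcdef';

const middle = word.slice(1, 4);        // "bcd": indexes 1 to 3
const tail = word.slice(-3);            // "def": starts at length + (-3) = 3
const trimmedTail = word.slice(-3, -1); // "de": end is also counted from the end
const empty = word.slice(4, 1);         // "": slice does not swap its arguments
```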

var str = "It is our choices that show what we truly are, far more than our abilities.";
console.log(str.slice(0, -30)); // It is our choices that show what we truly are
console.log(str.slice(-30)); // , far more than our abilities.

split

The split() method splits the string into an array of substrings and returns that array.

Syntax: str.split(separator, limit)

separator is optional and may be a string or a regular expression. If separator is omitted, the returned array contains a single element, the original string. If separator is an empty string, str is split into an array of its individual characters. limit caps the length of the returned array: only the first limit elements are kept.

var str = "today is a sunny day";
console.log(str.split()); // ["today is a sunny day"]
console.log(str.split("")); // ["t", "o", "d", "a", "y", " ", "i", "s", " ", "a", " ", "s", "u", "n", "n", "y", " ", "d", "a", "y"]
console.log(str.split(" ")); // ["today", "is", "a", "sunny", "day"]

Use limit to restrict the size of the returned array, as follows:

console.log(str.split(" ", 1)); // ["today"]

Use a regular expression as the separator, as shown below:

console.log(str.split(/\s*is\s*/)); // ["today", "a sunny day"]

If the regular expression separator contains capturing parentheses, the text matched by the parentheses is included in the returned array.

console.log(str.split(/(\s*is\s*)/)); // ["today", " is ", "a sunny day"]

substr

The substr() method returns a specified number of characters starting at a specified position in the string.

Syntax: str.substr(start[, length])

start is the position where extraction begins. It can be positive or negative: a positive value is the index of the start position, and a negative value counts back from the end, that is, index length+start.

length is the number of characters to extract.

var str = "Yesterday is history. Tomorrow is mystery. But today is a gift.";
console.log(str.substr(47)); // today is a gift.
console.log(str.substr(-16)); // today is a gift.

At the time of writing, Microsoft's JScript does not support a negative start argument. To support it in IE, refer to the Polyfill.

substring

The substring() method returns a substring between two indexes of the string.

Syntax: str.substring(indexA[, indexB])

indexA and indexB are indexes into the string, and indexB is optional. If it is omitted, the substring from indexA to the end of the string is returned.

Description

substring extracts the characters from indexA up to, but not including, indexB, according to the following rules:

  • If indexA == indexB, an empty string is returned;
  • If indexB is omitted, the characters are extracted until the end of the string.
  • If any argument is less than 0 or NaN, it is treated as 0;
  • If any parameter is greater than length, it is treated as length.

If indexA > indexB, substring behaves as if the two arguments were swapped; for example, str.substring(0, 1) == str.substring(1, 0).
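The rules above can be checked with a few calls. This is a brief sketch using an arbitrary sample string.

```javascript
const letters = 'abcdef';

const forward = letters.substring(1, 4);   // "bcd"
const swapped = letters.substring(4, 1);   // "bcd": the arguments are swapped
const clamped = letters.substring(-2, 99); // "abcdef": -2 becomes 0, 99 becomes length
const same = letters.substring(2, 2);      // "": equal indexes give an empty string
```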

var str = "Get outside every day. Miracles are waiting everywhere.";
console.log(str.substring(1, 1)); // ""
console.log(str.substring(0)); // Get outside every day. Miracles are waiting everywhere.
console.log(str.substring(-1)); // Get outside every day. Miracles are waiting everywhere.
console.log(str.substring(0, 100)); // Get outside every day. Miracles are waiting everywhere.
console.log(str.substring(22, NaN)); // Get outside every day.

toLocaleLowerCase
toLocaleUpperCase

The toLocaleLowerCase() method returns the calling string converted to lowercase according to locale-specific case mappings. The toLocaleUpperCase() method converts to uppercase in the same way.

Syntax: str.toLocaleLowerCase(), str.toLocaleUpperCase()

console.log('ABCDEFG'.toLocaleLowerCase()); // abcdefg
console.log('abcdefg'.toLocaleUpperCase()); // ABCDEFG

toLowerCase
toUpperCase

These two methods convert the string to lowercase and uppercase, respectively, and return the result. For example:

console.log('ABCDEFG'.toLowerCase()); // abcdefg
console.log('abcdefg'.toUpperCase()); // ABCDEFG

toString
valueOf

Both methods return the string itself.

Syntax: str.toString(), str.valueOf()

var str = "abc";
console.log(str.toString()); // abc
console.log(str.toString()==str.valueOf()); // true

For objects, toString and valueOf are also very similar, with subtle differences. Try running the following code:

var x = {
    toString: function () { return "test"; },
    valueOf: function () { return 123; }
};
console.log(String(x)); // "test"
console.log("x=" + x); // "x=123"
console.log(x + "=x"); // "123=x"
console.log(x + "1"); // "1231"
console.log(x + 1); // 124
console.log(["x=", x].join("")); // "x=test"

With the + operator, the object x is converted with the default hint, so valueOf is called first, whatever the other operand is. When the array's join method is called, x is converted to a string, so toString is called first.
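The same distinction shows up in other conversions. The sketch below uses an illustrative object, obj, with both methods defined.

```javascript
const obj = {
  toString() { return 'test'; },
  valueOf() { return 123; },
};

const concatenated = 'x=' + obj; // "x=123": + calls valueOf first
const templated = `x=${obj}`;    // "x=test": template literals convert to string
const explicit = String(obj);    // "test"
const numeric = obj * 2;         // 246: arithmetic calls valueOf
```

If you need the string form inside a concatenation, a template literal or an explicit String() call makes the intent clear.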

trim

The trim() method removes whitespace from both ends of the string and returns the result.

Syntax: str.trim()

console.log(" a b c ".trim()); // "a b c"

The trim() method was added in ECMAScript 5.1 and is not supported in IE versions below 9. It can be polyfilled as follows:

if(!String.prototype.trim) {
  String.prototype.trim = function () {
    return this.replace(/^\s+|\s+$/g, '');
  };
}

codePointAt(ES6)

The codePointAt() method was introduced in ECMAScript 2015 (ES6). It returns a non-negative integer: the Unicode code point value at the given position of the UTF-16 encoded string.

Syntax: str.codePointAt(position)

console.log("a".codePointAt(0)); // 97
console.log("\u4f60\u597d".codePointAt(0)); // 20320

To use this method in older browsers, refer to Polyfill.
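The difference from charCodeAt only appears with 4-byte characters. The sketch below reuses the "𝌆" character from the fromCodePoint section.

```javascript
const glyph = '𝌆';

const whole = glyph.codePointAt(0);    // 119558: the full code point
const trailing = glyph.codePointAt(1); // 57094: the lone low surrogate
const unitCount = glyph.length;        // 2: length counts UTF-16 code units

// for...of iterates by code point, so each character is visited once.
const points = [];
for (const ch of 'a𝌆b') {
  points.push(ch.codePointAt(0));
}
// points is [97, 119558, 98]
```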

includes(ES6)

The includes() method was introduced in ECMAScript 2015 (ES6). It determines whether one string contains another, returning true if so and false otherwise.

Syntax: str.includes(subString [, position])

subString is the string to search for. position is the index in the current string at which to start searching; it defaults to 0.

var str = "Practice makes perfect.";
console.log(str.includes("perfect")); // true
console.log(str.includes("perfect", 100)); // false

In fact, Firefox 18 to 39 shipped this method under the name contains; it was renamed to includes() because of bug 1102219. At the time of writing it is only implemented in Chrome 41+ and Firefox 40+; to use it in other browsers, refer to the Polyfill.

endsWith(ES6)

The endsWith() method was introduced in ECMAScript 2015 (ES6). It is basically the same as includes(), except that it determines whether a string is the end of the original string. It returns true if so, false otherwise.

Syntax: str.endsWith(substring [, position])

Unlike the includes method, the default value of the position parameter is the string length: the original string is treated as if it ended at position.

var str = "Learn and live.";
console.log(str.endsWith("live.")); // true
console.log(str.endsWith("Learn", 5)); // true

At the time of writing, only Firefox 17+ implements this method. For other browsers, see the Polyfill.

normalize(ES6)

The normalize() method was introduced in ECMAScript 2015 (ES6). It normalizes the string to the specified Unicode normalization form.

Syntax: str.normalize([form])

The form parameter can be omitted. There are four Unicode normalization forms: "NFC", "NFD", "NFKC", and "NFKD". The default value of form is "NFC". If form is given an invalid value, a RangeError is thrown.

var str = "\u4f60\u597d";
console.log(str.normalize()); // 你好
console.log(str.normalize("NFC")); // 你好
console.log(str.normalize("NFD")); // 你好
console.log(str.normalize("NFKC")); // 你好
console.log(str.normalize("NFKD")); // 你好
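The string in that example looks the same in every form, so the effect is hard to see. As a supplementary sketch, characters with accents or ligatures show where the forms actually differ.

```javascript
const composed = '\u00e9';    // "é" as a single code point
const decomposed = 'e\u0301'; // "e" followed by a combining acute accent

const looksEqual = composed === decomposed;                // false: different code units
const nfcEqual = composed === decomposed.normalize('NFC'); // true: NFC composes the pair
const nfdLength = composed.normalize('NFD').length;        // 2: NFD decomposes it
const ligature = '\ufb01'.normalize('NFKC');               // "fi": the compatibility form expands the ligature
```

Normalizing both sides to the same form before comparing is the usual way to treat such strings as equal.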

At the time of writing, it is only implemented in Chrome 34+ and Firefox 31+.

repeat(ES6)

The repeat() method was introduced in ECMAScript 2015 (ES6). It returns a new string made by repeating the original string a given number of times.

Syntax: str.repeat(count)

count must be a number greater than or equal to 0. A non-integer is truncated to an integer. A negative number or Infinity throws a RangeError.

var str = "A still tongue makes a wise head.";
console.log(str.repeat(0)); // ""
console.log(str.repeat(1)); // A still tongue makes a wise head.
console.log(str.repeat(1.5)); // A still tongue makes a wise head.
console.log(str.repeat(-1)); // RangeError: Invalid count value

At the time of writing, this method is only implemented in Chrome 41+, Firefox 24+, and Safari 9+. For other browsers, see the Polyfill.

startsWith(ES6)

The startsWith() method was introduced in ECMAScript 2015 (ES6). It determines whether the current string starts with the given string, returning true if it does and false otherwise.

Syntax: str.startsWith(subString [, position])

var str = "Where there is a will, there is a way.";
console.log(str.startsWith("Where")); // true
console.log(str.startsWith("there", 6)); // true

At the time of writing, this method is implemented in the following browsers; for other browsers, see the Polyfill.

Chrome   Firefox   Edge   Opera   Safari
41+      17+       ✔      28+     9+

The other nonstandard methods are not covered here; see the methods flagged with an exclamation mark on the String.prototype page.

HTML-related methods

Methods such as big, blink, bold, fixed, fontcolor, fontsize, italics, small, strike, sub, and sup have all been deprecated.

The anchor and link methods are covered next; the other deprecated methods are not.

anchor

The anchor() method creates an anchor tag.

Syntax: str.anchor(name)

name specifies the name attribute of the a tag that is created. An anchor created with this method becomes an element of the document.anchors array.

var str = "this is a anchor tag";
document.body.innerHTML = document.body.innerHTML + str.anchor("anchor1"); // <a name="anchor1">this is a anchor tag</a>

link

The link() method also creates an a tag.

Syntax: str.link(url)

url specifies the href attribute of the created a tag. If the URL contains special characters, they are escaped automatically; for example, " is escaped as &quot;. An a tag created with this method becomes an element of the document.links array.

var str = "Baidu";
document.write(str.link("https://www.baidu.com")); // <a href="https://www.baidu.com">Baidu</a>
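Both methods simply return an HTML string, so their output can be inspected without a document. In this sketch the names and the example.com URL are placeholders chosen for illustration.

```javascript
const anchorHtml = 'top'.anchor('section-1');
// '<a name="section-1">top</a>'

const linkHtml = 'Docs'.link('https://example.com/?q="x"');
// '<a href="https://example.com/?q=&quot;x&quot;">Docs</a>'
```

Only the double quote in the attribute value is escaped; the text of the string itself is inserted as-is.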

Summary

There are a lot of similarities between some of the string methods, and it is important to distinguish their functionality and usage scenarios. Such as:

  • substr and substring both take two arguments and both extract part of a string. Their first arguments mean the same thing but behave differently: for substr it may be negative, while substring treats a negative or NaN value as 0. Their second arguments differ: for substr it is the length of the extracted string, while for substring it is the end index. If substring's first argument is greater than its second, the result is the same as with the two swapped.
  • search is similar to indexOf: both return the index of the first occurrence of a substring if it is found, and -1 otherwise. The only difference is that search converts its argument into a regular expression by default, while indexOf does not.
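A side-by-side sketch of the extraction methods, including slice for comparison, on the same arbitrary sample string:

```javascript
const sample = 'abcdef';

// Two positive arguments
const bySubstr = sample.substr(1, 3);       // "bcd": start 1, length 3
const bySubstring = sample.substring(1, 3); // "bc": from index 1 up to, not including, 3
const bySlice = sample.slice(1, 3);         // "bc"

// A negative first argument
const substrNeg = sample.substr(-2);        // "ef"
const substringNeg = sample.substring(-2);  // "abcdef": -2 is treated as 0
const sliceNeg = sample.slice(-2);          // "ef"
```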

Also, remember that concat is not recommended for efficiency reasons.

In general, the commonly used string methods are charAt, indexOf, lastIndexOf, match, replace, search, slice, split, substr, substring, toLowerCase, toUpperCase, trim, and valueOf. Get familiar with their syntax and you will be able to handle strings with ease.


That’s all for this article. If you have any questions or good ideas, please feel free to leave a comment below.

Louis

Link to this article: louiszhai.github.io/2016/01/12/…

References

  • ECMAScript ® 2015 Language Specification
  • String.prototype
  • valueOf() vs. toString() in Javascript