
Strings

String concatenation can cause surprising performance problems

String concatenation methods

Method           Example                                  Description
The + operator   str = "a" + "b" + "c"                    Addition operator
The += operator  str = "a"; str += "b"; str += "c"        Addition assignment operator
array.join()     str = ["a", "b", "c"].join("")           Joins array items into a string
string.concat()  str = "a"; str = str.concat("b", "c")    String concat method

These methods are fast when concatenating a small number of strings.
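
As a quick sanity check, the four approaches from the table can be run side by side. This is only a minimal sketch; the variable names are illustrative and not from the book.

```javascript
// Four ways to build the same string "abc".
const viaPlus = "a" + "b" + "c";

let viaPlusEquals = "a";
viaPlusEquals += "b";
viaPlusEquals += "c";

// join() uses "," as its default separator, so pass "" explicitly.
const viaJoin = ["a", "b", "c"].join("");

const viaConcat = "a".concat("b", "c");

console.log(viaPlus, viaPlusEquals, viaJoin, viaConcat);
```

Note that calling join() without an argument would produce "a,b,c" rather than "abc".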


The plus (+) and plus-equals (+=) operators

var str = ''

str += 'one' + 'two'

Operation steps:

  1. A temporary string is created in memory
  2. The concatenated string “onetwo” is assigned to the temporary string
  3. The temporary string is concatenated with the current value of str
  4. The result is assigned to str

str = str + 'one' + 'two'

According to the book, this form avoids creating the temporary string, but is it really more efficient?
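
The difference between the two forms can be spelled out by making the temporary string explicit. The sketch below only models the evaluation order described above; `temp` is a hypothetical name, and real engines are free to optimize either form differently.

```javascript
// Form 1: str += 'one' + 'two'
// The right-hand side is evaluated first, producing a temporary string.
let str1 = 'zero';
const temp = 'one' + 'two';   // temporary "onetwo"
str1 = str1 + temp;

// Form 2: str = str + 'one' + 'two'
// + is left-associative, so each piece is appended to str in turn.
let str2 = 'zero';
str2 = (str2 + 'one') + 'two';

console.log(str1 === str2); // both forms hold the same text
```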



My actual tests disagree with the book

I ran both forms in a loop many times and found that str += 'one' + 'two' is actually faster. Is the test code I used inaccurate?

What you learn on paper always feels shallow; to truly understand something, you have to practice it yourself!

Test code:

(function () {
    var str = ' ';
    console.time('str += "one" + "two"')
    for (var i = 0; i < 10000000; i++) {
        str += 'one' + 'two';
    }
    console.timeEnd('str += "one" + "two"')
})();
(function () {
    var atr = ' ';
    console.time('atr = atr + "one" + "two"')
    for (var i = 0; i < 10000000; i++) {
        atr = atr + 'one' + 'two';
    }
    console.timeEnd('atr = atr + "one" + "two"')
})()

[Figure: console timing results in Chrome, Internet Explorer, and Firefox]

Performance test: plus (+) versus plus-equals (+=)

Code:

(function () {
    var atr = ' ';
    console.time("atr = atr + 'one'")
    for (var i = 0; i < 10000000; i++) {
        atr = atr + 'one';
    }
    console.timeEnd("atr = atr + 'one'")
})();
(function () {
    var str = ' ';
    console.time("str += 'one'")
    for (var i = 0; i < 10000000; i++) {
        str += 'one' ;
    }
    console.timeEnd("str += 'one'")
})();

[Figure: console timing results in Chrome and Firefox]

In Firefox, the += operator performs much better than the + operator when the number of loop iterations is large

Array.join()

When it comes to merging array items, the book's claims and my actual tests do not seem to agree

In my tests, merging an array of 1,000,000 'one' strings (['one', 'one', ...]) with join was much more efficient than concatenating 1,000,000 'one' strings with the + operator

Test code:

(function () {
    var atr = ' ';
    console.time("atr = atr + 'one'")
    for (var i = 0; i < 1000000; i++) {
        atr = atr + 'one';
    }
    console.timeEnd("atr = atr + 'one'")
})();
(function () {
    var str = ' ', arr = [];
    console.time("arr.join")
    for(var i = 0; i < 1000000; i++){
        arr.push('one');
    }
    str = arr.join("");
    console.timeEnd("arr.join")
})();

[Figure: console timing results in Chrome, Firefox, and Internet Explorer]

Times have changed

It seems times have changed. After repeated testing with a large number of loop iterations, join turned out to be the fastest way to merge an array into a string, while the + operator was almost as slow as concat

I don’t know whether browser engines have been optimized or whether array.join() itself has become faster

If you don’t believe it, you can test it yourself

If you think my test code is wrong, please tell me right away, because I have my doubts too

(function () {
    let str = ' ';
    console.time("+ operation")
    for (let i = 0; i < 1000000; i++) {
        str = str + 'one';
    }
    console.timeEnd("+ operation")
})();
(function () {
    let str = ' ';
    console.time("+= operator")
    for (let i = 0; i < 1000000; i++) {
        str += 'one';
    }
    console.timeEnd("+= operator")
})(); 
(function () {
    let str = ' ', arr = [];
    
    for(let i = 0; i < 1000000; i++){
        arr.push('one');
    }
    console.time("Arr.join")
    str = arr.join("");
    console.timeEnd("Arr.join")
})();   
(function () {
    let str = ' ';
    console.time("Concat")
    for(let i = 0; i < 1000000; i++){
        str = str.concat('one');
    }
    console.timeEnd("Concat")
})();  
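
One thing worth noting about the test above: the join timer starts only after the array has been filled, while the other timers include the whole loop. A possible way to make the comparison more even is to time the complete job for every method. The sketch below is my own, not from the book: `timeIt`, `now`, `N`, and `tests` are hypothetical names, and it assumes a global `performance.now()` with a fallback to `Date.now()`. Timings will vary by engine and from run to run.

```javascript
// Use a high-resolution clock when available (assumption: browsers and recent Node.js).
const now = (typeof performance !== 'undefined')
    ? () => performance.now()
    : () => Date.now();

// Hypothetical helper: run fn once and return [elapsed milliseconds, result].
function timeIt(fn) {
    const start = now();
    const result = fn();
    return [now() - start, result];
}

const N = 100000; // smaller than the tests above so the sketch finishes quickly

const tests = {
    '+ operator': () => {
        let s = '';
        for (let i = 0; i < N; i++) s = s + 'one';
        return s;
    },
    '+= operator': () => {
        let s = '';
        for (let i = 0; i < N; i++) s += 'one';
        return s;
    },
    // Building the array is part of the job, so it is timed too.
    'push + join': () => {
        const arr = [];
        for (let i = 0; i < N; i++) arr.push('one');
        return arr.join('');
    },
    'concat': () => {
        let s = '';
        for (let i = 0; i < N; i++) s = s.concat('one');
        return s;
    },
};

const lengths = [];
for (const [name, fn] of Object.entries(tests)) {
    const [ms, result] = timeIt(fn);
    lengths.push(result.length); // every method should build the same string
    console.log(name + ': ' + ms.toFixed(2) + ' ms');
}
```

A single run is still only a rough indication; repeating each test several times and comparing the spread gives a more trustworthy picture.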






Regular expressions

Here’s the regular expression manual: gitee.com/thinkyoung/…

Of course, this manual is not the original source, but it is fairly comprehensive

How regular expressions work

Processing steps:

  1. Compile

    • A regular expression object is created, the browser validates the expression, and then converts it into a native code program.
    • If you assign the regular object to a variable, you can avoid repeating this step
  2. Set the starting position

    • When a regular expression is put to use, the engine first determines where in the target string the search should begin. This is either the start of the string or the position given by the regular expression's lastIndex property
    • When the engine returns here from step 4 after a failed attempt, the starting position moves to the character after the one where the previous attempt began
  3. Match each regular expression token

    • Once the starting position is known from step 2, the engine walks through the text and the regular expression pattern one token at a time
    • When a particular token fails to match, the engine backtracks to the last decision point and tries the other possible paths
  4. The match succeeded or failed

    • If an exact match is found at the current position of the string, a match is declared
    • If none of the possible paths match, the regular expression engine returns to step 2 and tries again from the next character in the string
    • If all characters in the string have gone through this process and have not yet succeeded, the match is declared failed
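
Steps 1 and 2 can be observed directly. In this small sketch (the pattern and text are made up for illustration), the regular expression is compiled once by storing it in a variable, and `lastIndex` on a global regex decides where the next search begins.

```javascript
// Step 1: compile once and keep the object in a variable for reuse.
const re = /o/g;
const text = "foo boo";

// Step 2: for a global regex, exec() starts searching at lastIndex.
re.lastIndex = 3;               // skip past "foo"
const first = re.exec(text);    // finds the "o" at index 5
const second = re.exec(text);   // continues from lastIndex, finds index 6
const third = re.exec(text);    // nothing left: returns null, lastIndex resets to 0

console.log(first.index, second.index, third, re.lastIndex);
```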

Branching and backtracking

Example:

let str = "hello people, happy life";
let reg = /h(ello|appy) life/g;   
console.log(str.match(reg))

The matching result is: ["happy life"]

Analyze:

The goal of the regular expression reg: to match “hello life” or “happy life” in the string

Start matching:

  1. First, look for the letter h. The string str starts with h. Lucky
  2. Next, the group (ello|appy) is an alternation that offers two choices: find “ello” or “appy”. Working from left to right, the engine checks whether “ello” follows the current h. It does, giving “hello”. Lucky again
  3. In the target string str, “hello” is followed by “ people” rather than “ life”, so this path is a dead end. Backtrack to the alternation and try “appy” instead
  4. The current h is not followed by “appy” either, so this branch fails as well. Start over!
  5. The search resumes from the second character of the string and eventually finds the second h, after the comma
  6. Try the branches “ello” and “appy” again: “ello” fails, so backtrack to the alternation; “appy” matches, giving “happy”
  7. d=( ̄▽ ̄*)b Good: “ life” follows, the match succeeds, and the regular expression returns “happy life”
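
The walk-through can be checked with `exec`, which also reports where the match was found and which branch of the group succeeded. This is just a small sketch around the example above.

```javascript
const str = "hello people, happy life";
const reg = /h(ello|appy) life/g;

const m = reg.exec(str);
console.log(m[0]);          // "happy life" (the full match)
console.log(m[1]);          // "appy" (the branch captured by the group)
console.log(m.index);       // 14 (position of the second h)
console.log(reg.exec(str)); // null (no further match in the string)
```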




Improve regular expression efficiency

Because regular expression performance can make a big difference depending on the target string, there is no easy way to test regular expression performance

I can only summarize, in general terms, the methods given in the book “High Performance JavaScript”:

  • Avoid backtracking runaway

    • Runaway backtracking happens when a regular expression that should match quickly instead runs very slowly, or even crashes, because of how it behaves on certain strings.
    • How to avoid it: make adjacent tokens mutually exclusive, avoid nested quantifiers that can match the same part of the string in multiple ways, and remove unnecessary backtracking by emulating atomic groups with lookahead
  • Focus on making failed matches fail faster

    • A regular expression is usually slow because failing to match is slow, so reduce the amount of backtracking needed before a match attempt fails
  • Start with a simple, required character

    • Ideally, the leading token of a regular expression should test and rule out obvious non-matches as quickly as possible
  • Make quantified patterns and the token that follows them mutually exclusive

  • Reduce the number and scope of alternations. Alternation with the vertical bar | can often be avoided by using character classes and optional components instead

    Before replacement    After replacement
    cat|bat               [cb]at
    red|read              rea?d
    red|raw               r(?:ed|aw)
    (.|\r|\n)             [\s\S]
  • Use non-capture groups

  • Capture only the text of interest to reduce post-processing

  • Expose required tokens

  • Use appropriate quantifiers

  • Assign regular expressions to variables and reuse them

  • Break complex regular expressions into simpler pieces
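
The alternation rewrites listed in the table above can be checked for equivalence on a few sample inputs. The sketch below is illustrative only: the sample strings are my own, the patterns are anchored with ^ and $ so that whole strings are compared, and it compares results rather than speed.

```javascript
// Each pair: [original alternation, suggested rewrite].
const pairs = [
    [/^(?:cat|bat)$/, /^[cb]at$/],
    [/^(?:red|read)$/, /^rea?d$/],
    [/^(?:red|raw)$/, /^r(?:ed|aw)$/],
    [/^(?:.|\r|\n)$/, /^[\s\S]$/],
];

// Hypothetical sample inputs, including some that should not match.
const samples = ["cat", "bat", "rat", "red", "read", "raw", "row", "a", "\n", "\r"];

const mismatches = [];
for (const [before, after] of pairs) {
    for (const s of samples) {
        // Record any input on which the two patterns disagree.
        if (before.test(s) !== after.test(s)) {
            mismatches.push([before.source, after.source, s]);
        }
    }
}

console.log(mismatches.length === 0 ? "all rewrites agree" : mismatches);
```

Agreement on a handful of samples is not a proof of equivalence, but it is a cheap check to run before swapping one pattern for another.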