This is the 12th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021.

Preface

ECMAScript 6.0, or ES6 for short, is the new generation of the JavaScript standard. It was released in June 2015 and is officially named ECMAScript 2015 (ES2015). More broadly, "ES6" usually refers to every version of the standard after 5.1, covering ES2015, ES2016, ES2017, ES2018, ES2019, ES2020, ES2021, and so on.

Let’s learn about enhanced strings.

String traversal problem

As you all know, you can use an index to retrieve the corresponding character of a string.

var text = "abc";
for(var i = 0; i < text.length; i++){
    console.log(text[i]);
}
// a
// b
// c

At first glance this looks fine, but remember that JS stores strings as UTF-16. A character whose code point is greater than 0xFFFF takes four bytes (two 16-bit code units), so it counts as 2 in the length property.

"𢂘".length // 2
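
To make the mismatch concrete, here is a minimal sketch that compares the code point of 𢂘 with its length. The character is written as the escape `\u{22098}` (the code point mentioned later in this article) so the example does not depend on how the file is encoded; the variable names are my own.

```javascript
const ch = "\u{22098}";              // the character 𢂘
const codePoint = ch.codePointAt(0); // full code point of the first character

console.log(codePoint.toString(16)); // 22098
console.log(codePoint > 0xffff);     // true
console.log(ch.length);              // 2 (two UTF-16 code units)
console.log("a".length);             // 1 (one UTF-16 code unit)
```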

Let’s try the for loop again

var text = "𢂘";
for(var i = 0; i < text.length; i++){
    console.log(text[i]);
}
// �
// �

Nani, what on earth is this? In the Unicode table, U+D800 through U+DFFF are surrogate code points that are not printable on their own, so each one is displayed as �.

"𢂘".charAt(0) // '\uD848'
"𢂘".charAt(1) // '\uDC98'
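
As a quick check of the claim about U+D800 through U+DFFF, the sketch below reads each code unit with charCodeAt and tests whether it falls in that range. The helper `isSurrogate` is a hypothetical name of mine, not a built-in.

```javascript
// Hypothetical helper: true when a UTF-16 code unit lies in U+D800..U+DFFF.
function isSurrogate(codeUnit) {
  return codeUnit >= 0xd800 && codeUnit <= 0xdfff;
}

const text = "\u{22098}"; // the character 𢂘
const units = [];
for (let i = 0; i < text.length; i++) {
  units.push(text.charCodeAt(i)); // one 16-bit code unit per index
}

console.log(units.map((u) => u.toString(16))); // [ 'd848', 'dc98' ]
console.log(units.every(isSurrogate));         // true
console.log(isSurrogate("a".charCodeAt(0)));   // false
```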

Why charAt on a character with code point 0x22098 returns '\uD848' and '\uDC98' will be discussed in a separate article.

ES6 String traversal

Fortunately, ES6 takes this into account and implements Symbol.iterator on String.prototype, so you can iterate over a string's characters with for of.

var text = "𢂘";
for(let v of text){
    console.log(v);
}
// 𢂘
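
To see what for of relies on, here is a small sketch that drives the string iterator by hand. The sample string and variable names are assumptions of mine for illustration.

```javascript
const text = "\u{22098}a"; // "𢂘a"

// for of calls this method behind the scenes.
const it = text[Symbol.iterator]();
const first = it.next();
const second = it.next();
const third = it.next();

console.log(first.value.length); // 2 (the whole surrogate pair comes back as one value)
console.log(second.value);       // a
console.log(third.done);         // true
console.log(typeof String.prototype[Symbol.iterator]); // function
```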

As you can see, this gives you the exact value, not the garbled value. Ha ha, 666.

Traversal without for of

We know that if a code point is greater than 0xFFFF, the value at the next index is only the second half of the same character and is not usable on its own. So let's just skip it.

var text = "𢂘a𠮷人";
var code;
var skip;
for (var i = 0; i < text.length; i++) {
  // Skip the second code unit of a character that was already printed.
  if (skip) {
    skip = false;
    continue;
  }
  code = text.codePointAt(i);
  console.log(String.fromCodePoint(code));
  // Code points above 0xFFFF take two code units, so skip the next index.
  if (code > 0xffff) {
    skip = true;
  }
}
// 𢂘
// a
// 𠮷
// 人

That doesn't look good. Let's encapsulate it.

To encapsulate it, we need to track the character index separately, because it no longer matches the loop counter i.

function strforEach(str, callback) {
  var code;
  var skip;
  var index = 0; // character index, counted in code points
  for (var i = 0; i < str.length; i++) {
    // Skip the second code unit of a character that was already handled.
    if (skip) {
      skip = false;
      continue;
    }
    code = str.codePointAt(i);
    callback(String.fromCodePoint(code), index, str);
    index++;

    if (code > 0xffff) {
      skip = true;
    }
  }
}

Test the results:

var text = "𢂘a𠮷人";
strforEach(text, function (ch, index, str) {
  console.log(ch, index, str);
});
// 𢂘 0 𢂘a𠮷人
// a 1 𢂘a𠮷人
// 𠮷 2 𢂘a𠮷人
// 人 3 𢂘a𠮷人
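
For comparison, here is a slightly more compact sketch of the same idea: instead of a skip flag, it advances the index by 2 whenever the code point is above 0xFFFF. The helper `toCodePointChars` is a hypothetical name of mine, and this is only one possible way to write it.

```javascript
// Hypothetical helper: collects characters by code point without for of.
function toCodePointChars(str) {
  const chars = [];
  let i = 0;
  while (i < str.length) {
    const code = str.codePointAt(i);
    chars.push(String.fromCodePoint(code));
    // Code points above 0xFFFF occupy two UTF-16 code units.
    i += code > 0xffff ? 2 : 1;
  }
  return chars;
}

const sample = "\u{22098}a\u{20BB7}"; // "𢂘a𠮷"
const chars = toCodePointChars(sample);

console.log(chars.length);              // 3
console.log(chars.join("") === sample); // true
```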

Of course, it’s dangerous to change the string itself while iterating.

Question time

Can I use Object.values() to get all the values of a string?

Object.values("𢂘a𠮷人")
(6) ['\uD848', '\uDC98', 'a', '\uD842', '\uDFB7', '人']

The answer is clearly no: Object.values() returns each UTF-16 code unit separately, so the surrogate pairs are split.
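
A small sketch of why this happens: Object.values() walks the string's own index properties, and there is one index per code unit. The variable name is my own.

```javascript
const text = "\u{22098}"; // the character 𢂘

// The string exposes one enumerable index per UTF-16 code unit.
console.log(Object.keys(text));          // [ '0', '1' ]
console.log(Object.values(text).length); // 2
console.log(Object.values(text)[0] === text.charAt(0)); // true
```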

Can I use the spread operator to get all the values?

Of course you can, because strings implement Symbol.iterator.

[..."𢂘a𠮷人"]
(4) ['𢂘', 'a', '𠮷', '人']
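
Array.from() goes through the same iterator, so it should give the same result as the spread operator. Here is a minimal sketch; the sample string and variable names are my own.

```javascript
const text = "\u{22098}a\u{20BB7}"; // "𢂘a𠮷"

const viaSpread = [...text];
const viaArrayFrom = Array.from(text);
console.log(viaSpread.length);    // 3
console.log(viaArrayFrom.length); // 3

// Array.from also accepts a mapping function, here yielding hex code points.
const hex = Array.from(text, (ch) => ch.codePointAt(0).toString(16));
console.log(hex); // [ '22098', '61', '20bb7' ]
```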

That’s convenient.

Summary

So the easiest way, of course, is the spread operator. Did you learn anything today?

Coming up next

Next time, let's unravel this mystery:

"𢂘".charAt(0) // '\uD848' 
"𢂘".charAt(1) // '\uDC98'