This is the 12th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021.
Preface
ECMAScript 6.0, or ES6 for short, is the next-generation JavaScript standard. It was released in June 2015 and is officially known as ECMAScript 2015. More broadly, "ES6" often refers to every version of the standard after 5.1, covering ES2015, ES2016, ES2017, ES2018, ES2019, ES2020, ES2021 and so on.
Let’s learn about enhanced strings.
String traversal problem
As you all know, strings can use index values to retrieve corresponding characters.
var text = "abc";
for(var i = 0; i < text.length; i++){
  console.log(text[i]);
}
// a
// b
// c
At first glance, this looks fine, but remember that JS strings are stored as UTF-16. A character whose code point is greater than 0xFFFF takes two 16-bit code units (four bytes), so its length property is 2.
"𢂘".length // 2
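To make the difference between code points and code units concrete, here is a small sketch. The helper name `codeUnitInfo` is made up for illustration and is not part of any standard API; it only relies on `codePointAt` and `length`.

```javascript
// Hypothetical helper: describes how a single character is stored in UTF-16.
function codeUnitInfo(ch) {
  const codePoint = ch.codePointAt(0);
  return {
    // Code point formatted as U+XXXX (at least four hex digits)
    codePoint: "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
    // Number of 16-bit code units, which is what `length` reports
    codeUnits: ch.length,
    // Characters above 0xFFFF need two code units (a surrogate pair)
    needsSurrogatePair: codePoint > 0xffff,
  };
}

console.log(codeUnitInfo("a"));
console.log(codeUnitInfo("𢂘"));
```

For "a" the helper reports one code unit, while for "𢂘" (U+22098) it reports two, which is why the plain index loop sees two entries.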
Let's try the for loop again:
var text = "𢂘";
for(var i = 0; i < text.length; i++){
  console.log(text[i]);
}
// �
// �
Wait, what is this? According to the Unicode table, U+D800 through U+DFFF are surrogate code points. On their own they are not printable characters, so they are displayed as �.
"𢂘".charAt(0) // '\uD848'
"𢂘".charAt(1) // '\uDC98'
Why charAt on a character with code point 0x22098 returns '\uD848' and '\uDC98' will be discussed in a separate article.
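The surrogate range mentioned above can be checked directly. This is only a sketch: `isSurrogateAt` is a hypothetical helper name, built on the standard `charCodeAt` method.

```javascript
// Hypothetical helper (not from the article): true when the UTF-16 code unit
// at `index` falls in the surrogate range U+D800..U+DFFF.
function isSurrogateAt(str, index) {
  const unit = str.charCodeAt(index);
  return unit >= 0xd800 && unit <= 0xdfff;
}

const sample = "𢂘a";
console.log(isSurrogateAt(sample, 0)); // true  (first half of 𢂘)
console.log(isSurrogateAt(sample, 1)); // true  (second half of 𢂘)
console.log(isSurrogateAt(sample, 2)); // false ("a")
```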
ES6 String traversal
Fortunately, ES6 takes this into account: String.prototype implements Symbol.iterator, so a string can be traversed character by character (by code point) using for...of.
var text = "𢂘";
for(let v of text){
  console.log(v);
}
// 𢂘
As you can see, this gives you the actual character instead of garbled output. Nice.
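To see what for...of relies on, the string iterator can also be driven by hand. The sketch below is illustrative only; `collectWithIterator` is a made-up name, while `Symbol.iterator` and `next()` are the standard iterator protocol.

```javascript
// Hypothetical helper: walks a string with its built-in iterator by hand,
// which is roughly what for...of does behind the scenes.
function collectWithIterator(str) {
  const iterator = str[Symbol.iterator]();
  const chars = [];
  let step = iterator.next();
  while (!step.done) {
    chars.push(step.value); // one whole code point per step
    step = iterator.next();
  }
  return chars;
}

console.log(collectWithIterator("𢂘a")); // two entries: "𢂘" and "a"
```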
Traversal without for...of
We know that when a character's code point is greater than 0xFFFF, the value at the next index is the second half of the same character and is not usable on its own. So let's just skip it.
var text = "𢂘a𠮷人";
var code;
var skip = false;
for (var i = 0; i < text.length; i++) {
  // Skip the second code unit of a character above 0xFFFF
  if (skip) {
    skip = false;
    continue;
  }
  code = text.codePointAt(i);
  console.log(String.fromCodePoint(code));
  if (code > 0xffff) {
    skip = true;
  }
}
// 𢂘
// a
// 𠮷
// 人
That doesn't look good. Let's encapsulate it.
To encapsulate it, the index needs some extra handling: the callback should receive the character's position rather than the raw loop index.
function strforEach(str, callback) {
  var code;
  var skip = false;
  var index = 0; // position counted in characters, not code units
  for (var i = 0; i < str.length; i++) {
    if (skip) {
      skip = false;
      continue;
    }
    code = str.codePointAt(i);
    callback(String.fromCodePoint(code), index, str);
    index++;
    if (code > 0xffff) {
      skip = true;
    }
  }
}
Test the results:
var text = "𢂘a𠮷人";
strforEach(text, function (ch, index, str) {
  console.log(ch, index, str);
});
// 𢂘 0 𢂘a𠮷人
// a 1 𢂘a𠮷人
// 𠮷 2 𢂘a𠮷人
// 人 3 𢂘a𠮷人
Of course, it’s dangerous to change the string itself while iterating.
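As an alternative sketch of the same idea, the skip flag can be dropped by advancing the loop index by the length of the character just read. The name `forEachCodePoint` is hypothetical; this is one possible variant, not the only way to write it.

```javascript
// Hypothetical variant: same callback contract (character, position, string),
// but the loop index jumps by 1 or 2 code units instead of using a flag.
function forEachCodePoint(str, callback) {
  let index = 0; // position counted in characters
  let i = 0;     // offset counted in UTF-16 code units
  while (i < str.length) {
    const ch = String.fromCodePoint(str.codePointAt(i));
    callback(ch, index, str);
    index++;
    i += ch.length; // 1 for code points up to 0xFFFF, 2 above that
  }
}

forEachCodePoint("𢂘a𠮷人", function (ch, index) {
  console.log(index, ch);
});
```

Because `ch.length` is 2 only for characters above 0xFFFF, the second code unit of such a character is never visited on its own.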
Q&A
Can I use Object.values() to get all the values of a string?
Object.values("𢂘a𠮷人")
// (6) ['\uD848', '\uDC98', 'a', '\uD842', '\uDFB7', '人']
The answer is clearly no: Object.values() returns the string's individual UTF-16 code units.
Can I use the spread operator to get all the values?
Of course you can, because strings implement Symbol.iterator.
[..."𢂘a𠮷人"]
// (4) ['𢂘', 'a', '𠮷', '人']
That’s convenient.
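Since spreading goes through the string iterator, it can also back a couple of small utilities. The helper names below are invented for this sketch, and they work per code point only (they do not try to handle combined characters such as emoji sequences).

```javascript
// Hypothetical helpers built on the spread operator.

// Counts characters by code point instead of by UTF-16 code unit.
function countCodePoints(str) {
  return [...str].length;
}

// Reverses a string without splitting characters above 0xFFFF in half.
function reverseByCodePoint(str) {
  return [...str].reverse().join("");
}

console.log("𢂘a𠮷人".length);            // 6 code units
console.log(countCodePoints("𢂘a𠮷人"));  // 4 characters
console.log(reverseByCodePoint("𢂘a"));   // a𢂘
```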
Summary
So the easiest way, of course, is the spread operator. Did you learn anything today?
Next up
Next time, let's solve this mystery:
"𢂘".charAt(0) // '\uD848'
"𢂘".charAt(1) // '\uDC98'