Template literals
Background knowledge
1. V8 turns source code into an AST using the scanner and the parser (plus the pre-parser). After parsing, the interpreter generates bytecode from the AST.
2. V8's compilation starts with lexical analysis in the scanner: Initialize in the Scanner class starts the scan and reads the first token.
3. Lexical analysis produces tokens, and the parser then builds the AST from those tokens.
4. BytecodeGenerator converts the AST into bytecode.
Lexical analysis stage - Scanner
In the lexical analysis stage, the scanner reads one token at a time from the JS source (this is ScanSingleToken). When ScanSingleToken finds that the token for the current character is TEMPLATE_SPAN, it calls ScanTemplateSpan(). The corresponding code:
    V8_INLINE Token::Value Scanner::ScanSingleToken() {
      Token::Value token;
      do {
        next().location.beg_pos = source_pos();

        if (V8_LIKELY(static_cast<unsigned>(c0_) <= kMaxAscii)) {
          token = one_char_tokens[c0_];

          switch (token) {
            case Token::LPAREN:
            // ...
            case Token::TEMPLATE_SPAN:
              Advance();
              return ScanTemplateSpan();
            // omit...
            default:
              UNREACHABLE();
          }
        }
        // Continue scanning for tokens as long as we're just skipping whitespace.
      } while (token == Token::WHITESPACE);

      return token;
    }
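The lookup through one_char_tokens can be pictured with a small standalone sketch. This is not V8 code: the Tok enum, BuildOneCharTokens, ClassifyChar and FirstToken are names made up for illustration, and the table only knows a handful of characters. It shows the general idea of classifying an ASCII character with one array access and looping while only whitespace is seen.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token kinds; V8's real Token::Value enum is much larger.
enum class Tok { kIllegal, kWhitespace, kLParen, kRParen, kTemplateSpan,
                 kEndOfInput };

constexpr unsigned kMaxAscii = 127;

// Build a table indexed by ASCII code. Unlisted characters stay kIllegal.
std::array<Tok, kMaxAscii + 1> BuildOneCharTokens() {
  std::array<Tok, kMaxAscii + 1> table;
  table.fill(Tok::kIllegal);
  table[' '] = Tok::kWhitespace;
  table['\t'] = Tok::kWhitespace;
  table['\n'] = Tok::kWhitespace;
  table['('] = Tok::kLParen;
  table[')'] = Tok::kRParen;
  table['`'] = Tok::kTemplateSpan;  // a backtick opens a template literal
  return table;
}

// ASCII characters go through the table; anything else is kIllegal here.
Tok ClassifyChar(unsigned c) {
  static const std::array<Tok, kMaxAscii + 1> table = BuildOneCharTokens();
  return c <= kMaxAscii ? table[c] : Tok::kIllegal;
}

// Same do/while shape as above: keep going while only whitespace is seen.
Tok FirstToken(const std::string& src) {
  std::size_t i = 0;
  Tok tok = Tok::kIllegal;
  do {
    if (i >= src.size()) return Tok::kEndOfInput;
    tok = ClassifyChar(static_cast<unsigned char>(src[i++]));
  } while (tok == Tok::kWhitespace);
  return tok;
}
```

In this sketch, FirstToken on a string that starts with spaces followed by a backtick reports kTemplateSpan, which is the point where the real scanner would hand over to ScanTemplateSpan().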
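ScanTemplateSpan scans one character at a time. A piece that starts with ` or } and ends with ${ is a TEMPLATE_SPAN; a piece that starts with ` or } and ends with ` is a TEMPLATE_TAIL. Along the way the scanner records the end position of the token within the template literal and handles special characters: \ escapes, line continuations, and the line terminators \r and \r\n, which are normalized to \n.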
    Token::Value Scanner::ScanTemplateSpan() {
      // When scanning a TemplateSpan, we are looking for the following construct:
      // TEMPLATE_SPAN ::
      //     ` LiteralChars* ${
      //   | } LiteralChars* ${
      //
      // TEMPLATE_TAIL ::
      //     ` LiteralChars* `
      //   | } LiteralChar* `
      //
      // A TEMPLATE_SPAN should always be followed by an Expression, while a
      // TEMPLATE_TAIL terminates a TemplateLiteral and does not need to be
      // followed by an Expression.

      // These scoped helpers save and restore the original error state, so that we
      // can specially treat invalid escape sequences in templates (which are
      // handled by the parser).
      ErrorState scanner_error_state(&scanner_error_, &scanner_error_location_);
      ErrorState octal_error_state(&octal_message_, &octal_pos_);

      Token::Value result = Token::TEMPLATE_SPAN;
      next().literal_chars.Start();
      next().raw_literal_chars.Start();
      const bool capture_raw = true;
      while (true) {
        base::uc32 c = c0_;
        if (c == '`') {
          Advance();  // Consume '`'
          result = Token::TEMPLATE_TAIL;
          break;
        } else if (c == '$' && Peek() == '{') {
          Advance();  // Consume '$'
          Advance();  // Consume '{'
          break;
        } else if (c == '\\') {
          Advance();  // Consume '\\'
          DCHECK(!unibrow::IsLineTerminator(kEndOfInput));
          if (capture_raw) AddRawLiteralChar('\\');
          if (unibrow::IsLineTerminator(c0_)) {
            // The TV of LineContinuation :: \ LineTerminatorSequence is the empty
            // code unit sequence.
            base::uc32 lastChar = c0_;
            Advance();
            if (lastChar == '\r') {
              // Also skip \n.
              if (c0_ == '\n') Advance();
              lastChar = '\n';
            }
            if (capture_raw) AddRawLiteralChar(lastChar);
          } else {
            bool success = ScanEscape<capture_raw>();
            USE(success);
            DCHECK_EQ(!success, has_error());
            // For templates, invalid escape sequence checking is handled in the
            // parser.
            scanner_error_state.MoveErrorTo(next_);
            octal_error_state.MoveErrorTo(next_);
          }
        } else if (c == kEndOfInput) {
          // Unterminated template literal
          break;
        } else {
          Advance();  // Consume c.
          // The TRV of LineTerminatorSequence :: <CR> is the CV 0x000A.
          // The TRV of LineTerminatorSequence :: <CR><LF> is the sequence
          // consisting of the CV 0x000A.
          if (c == '\r') {
            if (c0_ == '\n') Advance();  // Consume '\n'
            c = '\n';
          }
          if (capture_raw) AddRawLiteralChar(c);
          AddLiteralChar(c);
        }
      }
      next().location.end_pos = source_pos();
      next().token = result;
      return result;
    }
These are the steps by which the scanner produces the tokens of a template literal.
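To make the control flow easier to follow outside of V8, here is a simplified standalone sketch of the same loop. It is an assumption-laden toy, not the real scanner: SpanKind, Span and ScanSpan are hypothetical names, input is a plain std::string, and only the \n and \t escapes are interpreted (every other escaped character is copied as-is, and there is no error handling for invalid escapes).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class SpanKind { kTemplateSpan, kTemplateTail, kUnterminated };

struct Span {
  SpanKind kind;
  std::string cooked;  // escapes interpreted
  std::string raw;     // source text, with CR and CRLF normalized to LF
  std::size_t next;    // index of the first character after the span
};

// Scans from `pos`, which must point just past a '`' or a '}'.
Span ScanSpan(const std::string& src, std::size_t pos) {
  assert(pos <= src.size());
  Span out{SpanKind::kUnterminated, "", "", pos};
  std::size_t i = pos;
  while (i < src.size()) {
    char c = src[i];
    if (c == '`') {  // closing backtick: this piece is a tail
      out.kind = SpanKind::kTemplateTail;
      ++i;
      break;
    }
    if (c == '$' && i + 1 < src.size() && src[i + 1] == '{') {
      out.kind = SpanKind::kTemplateSpan;  // an expression follows
      i += 2;
      break;
    }
    if (c == '\\') {
      out.raw += '\\';
      ++i;
      if (i >= src.size()) break;  // dangling backslash: unterminated
      char e = src[i++];
      if (e == '\r' || e == '\n') {
        // Line continuation: adds nothing to the cooked value.
        if (e == '\r' && i < src.size() && src[i] == '\n') ++i;
        out.raw += '\n';
      } else {
        out.raw += e;
        out.cooked += (e == 'n') ? '\n' : (e == 't') ? '\t' : e;
      }
      continue;
    }
    ++i;
    if (c == '\r') {  // CR and CRLF both become LF
      if (i < src.size() && src[i] == '\n') ++i;
      c = '\n';
    }
    out.raw += c;
    out.cooked += c;
  }
  out.next = i;
  return out;
}
```

For the source `a${x}b\n` (written with a literal backslash and n), scanning from index 1 yields a kTemplateSpan with cooked text "a", and scanning again from just past the } yields a kTemplateTail whose raw text keeps the backslash while the cooked text contains a real newline.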
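Syntax analysis stage - Parser
The Parser class has a group of methods dedicated to template literals. They open a TemplateLiteral state, add each string span (raw, and cooked when requested) and each expression to it, and finally close it into an AST node: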
    Parser::TemplateLiteralState Parser::OpenTemplateLiteral(int pos) {
      return zone()->New<TemplateLiteral>(zone(), pos);
    }

    void Parser::AddTemplateSpan(TemplateLiteralState* state, bool should_cook,
                                 bool tail) {
      int end = scanner()->location().end_pos - (tail ? 1 : 2);
      const AstRawString* raw = scanner()->CurrentRawSymbol(ast_value_factory());
      if (should_cook) {
        const AstRawString* cooked = scanner()->CurrentSymbol(ast_value_factory());
        (*state)->AddTemplateSpan(cooked, raw, end, zone());
      } else {
        (*state)->AddTemplateSpan(nullptr, raw, end, zone());
      }
    }

    void Parser::AddTemplateExpression(TemplateLiteralState* state,
                                       Expression* expression) {
      (*state)->AddExpression(expression, zone());
    }

    Expression* Parser::CloseTemplateLiteral(TemplateLiteralState* state, int start,
                                             Expression* tag) {
      TemplateLiteral* lit = *state;
      int pos = lit->position();
      const ZonePtrList<const AstRawString>* cooked_strings = lit->cooked();
      const ZonePtrList<const AstRawString>* raw_strings = lit->raw();
      const ZonePtrList<Expression>* expressions = lit->expressions();
      DCHECK_EQ(cooked_strings->length(), raw_strings->length());
      DCHECK_EQ(cooked_strings->length(), expressions->length() + 1);

      if (!tag) {
        if (cooked_strings->length() == 1) {
          return factory()->NewStringLiteral(cooked_strings->first(), pos);
        }
        return factory()->NewTemplateLiteral(cooked_strings, expressions, pos);
      } else {
        // GetTemplateObject
        Expression* template_object =
            factory()->NewGetTemplateObject(cooked_strings, raw_strings, pos);

        // Call TagFn
        ScopedPtrList<Expression> call_args(pointer_buffer());
        call_args.Add(template_object);
        call_args.AddAll(expressions->ToConstVector());
        return factory()->NewTaggedTemplate(tag, call_args, pos);
      }
    }
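The decision made in CloseTemplateLiteral can be summarized with a small model. The sketch below is only an illustration under simplifying assumptions: TemplateState, AddSpan, AddExpression, Node and Close are hypothetical names, strings stand in for AST nodes, and the tag is reduced to a boolean. It keeps the two invariants checked by the DCHECKs above and the three possible outcomes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's template literal state.
struct TemplateState {
  std::vector<std::string> cooked;
  std::vector<std::string> raw;
  std::vector<std::string> expressions;  // expression source text, unparsed
};

void AddSpan(TemplateState& s, const std::string& cooked,
             const std::string& raw) {
  s.cooked.push_back(cooked);
  s.raw.push_back(raw);
}

void AddExpression(TemplateState& s, const std::string& expr) {
  s.expressions.push_back(expr);
}

enum class NodeKind { kStringLiteral, kTemplateLiteral, kTaggedTemplate };

struct Node {
  NodeKind kind;
  std::size_t argument_count;  // call arguments; 0 unless kTaggedTemplate
};

Node Close(const TemplateState& s, bool has_tag) {
  // Same invariants as the DCHECK_EQ lines: one raw string per cooked
  // string, and one more string than there are expressions.
  assert(s.cooked.size() == s.raw.size());
  assert(s.cooked.size() == s.expressions.size() + 1);
  if (!has_tag) {
    // No substitutions at all: an ordinary string literal is enough.
    if (s.cooked.size() == 1) return {NodeKind::kStringLiteral, 0};
    return {NodeKind::kTemplateLiteral, 0};
  }
  // Tagged: the tag function receives the template object first, then
  // every substitution expression.
  return {NodeKind::kTaggedTemplate, 1 + s.expressions.size()};
}
```

With this model, a template without substitutions closes into a string literal, one with a single ${x} closes into a template literal, and the same template with a tag becomes a call with two arguments (the template object and x).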
Bytecode generation stage - BytecodeGenerator
AstVisitor is the abstract class for walking the syntax tree (it follows the Visitor design pattern). BytecodeGenerator handles TemplateLiteral nodes in VisitTemplateLiteral and emits the bytecode for them through its builder. Two variables are worth noting here. One is substitutions, the list of placeholder expressions (the ${...} parts) stored in the TemplateLiteral. The other is string_parts, the literal string pieces of the TemplateLiteral that are not substituted.
    void BytecodeGenerator::VisitTemplateLiteral(TemplateLiteral* expr) {
      const ZonePtrList<const AstRawString>& parts = *expr->string_parts();
      const ZonePtrList<Expression>& substitutions = *expr->substitutions();
      // Template strings with no substitutions are turned into StringLiterals.
      DCHECK_GT(substitutions.length(), 0);
      DCHECK_EQ(parts.length(), substitutions.length() + 1);

      // Generate string concatenation
      // TODO(caitp): Don't generate feedback slot if it's not used --- introduce
      // a simple, concise, reusable mechanism to lazily create reusable slots.
      FeedbackSlot slot = feedback_spec()->AddBinaryOpICSlot();
      Register last_part = register_allocator()->NewRegister();
      bool last_part_valid = false;

      builder()->SetExpressionPosition(expr);
      for (int i = 0; i < substitutions.length(); ++i) {
        if (i != 0) {
          builder()->StoreAccumulatorInRegister(last_part);
          last_part_valid = true;
        }

        if (!parts[i]->IsEmpty()) {
          builder()->LoadLiteral(parts[i]);
          if (last_part_valid) {
            builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
          }
          builder()->StoreAccumulatorInRegister(last_part);
          last_part_valid = true;
        }

        TypeHint type_hint = VisitForAccumulatorValue(substitutions[i]);
        if (type_hint != TypeHint::kString) {
          builder()->ToString();
        }
        if (last_part_valid) {
          builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
        }
        last_part_valid = false;
      }

      if (!parts.last()->IsEmpty()) {
        builder()->StoreAccumulatorInRegister(last_part);
        builder()->LoadLiteral(parts.last());
        builder()->BinaryOperation(Token::ADD, last_part, feedback_index(slot));
      }
    }
Test case
That completes V8's handling of template literals. The following code can also be found among the V8 test cases (the test folder in the V8 source code).
    TEST(TemplateLiterals) {
      InitializedIgnitionHandleScope scope;
      BytecodeExpectationsPrinter printer(CcTest::isolate());

      const char* snippets[] = {
          "var a = 1;\n"
          "var b = 2;\n"
          "return `${a}${b}string`;\n",

          "var a = 1;\n"
          "var b = 2;\n"
          "return `string${a}${b}`;\n",

          "var a = 1;\n"
          "var b = 2;\n"
          "return `${a}string${b}`;\n",

          "var a = 1;\n"
          "var b = 2;\n"
          "return `foo${a}bar${b}baz${1}`;\n",

          "var a = 1;\n"
          "var b = 2;\n"
          "return `${a}string` + `string${b}`;\n",

          "var a = 1;\n"
          "var b = 2;\n"
          "function foo(a, b) { };\n"
          "return `string${foo(a, b)}${a}${b}`;\n",
      };

      CHECK(CompareTexts(BuildActual(printer, snippets),
                         LoadGolden("TemplateLiterals.golden")));
    }
BytecodeGenerator generates the final bytecode in GenerateBytecodeBody.