100 lines of code to write your own template engine

A diagram illustrating how the EJS template engine works

The diagram above outlines the principles of a simple template engine (in this case, EJS). This article describes how such an engine works, covering the key implementation steps and the thinking behind them.

This is basically how EJS works, but the ideas are universal. If you read the source code of Vue's template compiler, you will find the same ideas and methods applied there.

Basic API Design

We will implement a simplified version of EJS that supports these tags:

<% script %> – Script execution. Generally used for control statements; it outputs no value. For example:
```
<% if (user) { %>
  <div>some thing</div>
<% } %>
```
<%= expression %> – Outputs the value of the expression, with HTML escaped:
```
<title><%= title %></title>
```
<%- expression %> – Same as <%= expression %>, except that HTML is not escaped
<%% and %%> – Tag escapes: <%% is output as <% and %%> is output as %>
<%# comment %> – A comment; outputs nothing

Here is a complete template example. The rest of the article is explained in terms of it:

<html>
  <head><%= title %></head>
  <body>
    <%% escape %%>
    <%# here is the comment %>
    <%- before %>
    <% if (show) { %>
      <div>root</div>
    <% } %>
  </body>
</html>

Basic API Design

We put the Template parsing and rendering logic into a Template class with the following basic interface:

export default class Template {
  public template: string;
  private tokens: string[] = [];
  private source: string = "";
  private state?: State;
  private fn?: Function;

  public constructor(template: string) {
    this.template = template;
  }

  /** Compile the template */
  public compile() {
    this.parseTemplateText();
    this.transformTokens();
    this.wrapit();
  }

  /** Render method: the user passes in an object and gets the rendered string back */
  public render(local: object) { }

  /**
   * Token parsing.
   * For example, <% if (condition) { %> is parsed into the token array
   * ['<%', ' if (condition) { ', '%>']
   */
  private parseTemplateText() {}
  /** Convert the tokens into JavaScript statements */
  private transformTokens() {}
  /** Wrap the JavaScript statements from the previous step in a render function */
  private wrapit() {}
}

Token parsing

The first step is to parse all the start tags and end tags. We expect the parsing result to look like this:

[
  "\n<html>\n  <head>", "<%=", " title ", "%>",
  "</head>\n  <body>\n    ", "<%%", " escape ", "%%>",
  "\n    ", "<%#", " here is the comment ", "%>",
  "\n    ", "<%-", " before ", "%>",
  "\n    ", "<%", " if (show) { ", "%>",
  "\n      <div>root</div>\n    ", "<%", " } ", "%>",
  "\n  </body>\n</html>\n"
]

Because our template syntax is very simple, we do not need to parse it into an abstract syntax tree (AST) at all. The tags can be extracted directly with a regular expression.

Start by defining regular expressions that match all of our supported tags:

// <%% and %%>  tag escapes
// <%   script
// <%=  outputs the script value, escaped
// <%-  outputs the script value, unescaped
// <%#  comment
// %>   closing tag
const REGEXP = /(<%%|%%>|<%=|<%-|<%#|<%|%>)/;

Use the regular expression to match the string piece by piece and split it up. The code is simple:

  parseTemplateText() {
    let str = this.template;
    const arr = this.tokens;
    // exec returns the match (including its position), or null if nothing matches
    let res = REGEXP.exec(str);
    let index;

    while (res) {
      index = res.index;
      // Text before the tag
      if (index !== 0) {
        arr.push(str.substring(0, index));
        str = str.slice(index);
      }

      arr.push(res[0]);
      // Cut off the matched tag and continue matching
      str = str.slice(res[0].length);
      res = REGEXP.exec(str);
    }

    // Remaining text after the last tag
    if (str) {
      arr.push(str);
    }
  }
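As a cross-check of the loop above, the same token array can also be produced with `String.prototype.split`, which keeps the delimiters when the pattern has a capture group. This is only a sketch of an alternative, not the approach the article's class uses; `tokenize` and `TAG_PATTERN` are hypothetical names.

```typescript
// Same tag pattern as REGEXP in the article (note the capture group).
const TAG_PATTERN = /(<%%|%%>|<%=|<%-|<%#|<%|%>)/;

// Hypothetical helper, not part of the Template class.
function tokenize(template: string): string[] {
  // split() with a capture group returns the text pieces and the matched tags.
  // Empty strings appear when a tag sits at the very start or end of the
  // template, or when two tags are adjacent, so they are filtered out.
  return template.split(TAG_PATTERN).filter((tok) => tok !== "");
}

const tokens = tokenize("<h1><%= title %></h1>");
console.log(tokens);
// ["<h1>", "<%=", " title ", "%>", "</h1>"]
```

The order of the alternatives matters in both versions: `<%%` and `<%=` have to come before the plain `<%`, otherwise the shorter tag would always win.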

Simple syntax check

Once the tags have been parsed out, we can start converting them into the render function.

First, do a simple syntax check to see if the tag is closed:

const start = "<%";           // Start tag
const end = "%>";             // End tag
const escpStart = "<%%";      // Escaped start tag
const escpEnd = "%%>";        // Escaped end tag
const escpoutStart = "<%=";   // Escaped expression output
const unescpoutStart = "<%-"; // Unescaped expression output
const comtStart = "<%#";      // Comment

// Inside the token loop: every start tag must be followed by content and then an end tag
if (tok.includes(start) && !tok.includes(escpStart)) {
  const closing = this.tokens[idx + 2];
  if (closing == null || !closing.includes(end)) {
    throw new Error(`Could not find the matching closing tag for "${tok}"`);
  }
}
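To see the check in isolation, here is a minimal standalone sketch that runs it over a plain token array. `assertTagsClosed`, `OPENERS` and `CLOSER` are hypothetical names introduced for illustration; the logic mirrors the "token at idx + 2 must be the end tag" rule above.

```typescript
const OPENERS = ["<%", "<%=", "<%-", "<%#"];
const CLOSER = "%>";

// Hypothetical helper: throws if an opening tag is not followed by
// one content token and then the closing tag.
function assertTagsClosed(tokens: string[]): void {
  tokens.forEach((tok, idx) => {
    if (OPENERS.indexOf(tok) === -1) return; // not an opening tag
    const closing = tokens[idx + 2];
    if (closing !== CLOSER) {
      throw new Error(`Could not find the closing tag for "${tok}"`);
    }
  });
}

// Tokens of "<p><%= title %></p>": passes silently.
assertTagsClosed(["<p>", "<%=", " title ", "%>", "</p>"]);

// Tokens of "<p><%= title </p>": the closing tag is missing.
let message = "";
try {
  assertTagsClosed(["<p>", "<%=", " title </p>"]);
} catch (err) {
  message = (err as Error).message;
}
console.log(message); // Could not find the closing tag for "<%="
```

One limitation of this rule is that an empty tag such as `<%=%>` is also rejected, because its closing tag sits at `idx + 1` rather than `idx + 2`.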

Conversion

Now we start iterating through the tokens. The conversion logic can be described with a finite-state machine (FSM).

A state machine is a mathematical model that represents a finite number of states and behaviors such as transitions and actions between these states. In simple terms, a finite state machine consists of a set of states, an initial state, inputs, and transition functions based on the inputs and the existing state to the next state. It has three characteristics:

The total number of states is finite.
It is in exactly one state at any given time.
It moves from one state to another under certain conditions.
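These properties can be made concrete with a small transition function. This is only a sketch: the `Mode` names and `nextMode` are hypothetical, and the `"text"` mode stands for "outside any tag", which the code later in this article represents as an undefined state.

```typescript
// A finite set of states (property 1).
type Mode = "text" | "eval" | "escaped" | "raw" | "comment" | "literal";

// Transition function: current state + input token -> next state (property 3).
function nextMode(current: Mode, tok: string): Mode {
  switch (tok) {
    case "<%":
      return "eval";
    case "<%=":
      return "escaped";
    case "<%-":
      return "raw";
    case "<%#":
      return "comment";
    case "<%%":
    case "%%>":
      return "literal";
    case "%>":
      return "text";
    default:
      return current; // content tokens never change the state
  }
}

// The machine holds exactly one state at a time (property 2).
let mode: Mode = "text";
const seen = ["<%=", " title ", "%>", "</h1>"].map((tok) => (mode = nextMode(mode, tok)));
console.log(seen);
// ["escaped", "escaped", "text", "text"]
```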

After a little analysis, the state transition diagram of our template engine looks like this:

The following states can be extracted from the figure above:

enum State {
  EVAL,    // Script execution
  ESCAPED, // Expression output, escaped
  RAW,     // Expression output, not escaped
  COMMENT, // Comment
  LITERAL  // Literal, output as is
}

Ok, now start iterating through tokens:

this.tokens.forEach((tok, idx) => {
  // ...
  switch (tok) {

    /** Tag recognition */

    case start:
      // Script starts
      this.state = State.EVAL;
      break;
    case escpoutStart:
      // Escaped output
      this.state = State.ESCAPED;
      break;
    case unescpoutStart:
      // Unescaped output
      this.state = State.RAW;
      break;
    case comtStart:
      // Comment
      this.state = State.COMMENT;
      break;
    case escpStart:
      // Tag escape
      this.state = State.LITERAL;
      this.source += `;__append('<%');\n`;
      break;
    case escpEnd:
      this.state = State.LITERAL;
      this.source += `;__append('%>');\n`;
      break;
    case end:
      // Back to the initial state
      this.state = undefined;
      break;
    default:

      /** Convert the content token into output code */

      if (this.state != null) {
        switch (this.state) {
          case State.EVAL:
            // Code: emitted as is
            this.source += `;${tok}\n`;
            break;
          case State.ESCAPED:
            // stripSemi removes extra semicolons
            this.source += `;__append(escapeFn(${stripSemi(tok)}));\n`;
            break;
          case State.RAW:
            this.source += `;__append(${stripSemi(tok)});\n`;
            break;
          case State.LITERAL:
            // The string is wrapped in single quotes, so transformString
            // escapes single quotes, newlines and backslashes in tok
            this.source += `;__append('${transformString(tok)}');\n`;
            break;
          case State.COMMENT:
            // Do nothing
            break;
        }
      } else {
        // Literal text outside any tag
        this.source += `;__append('${transformString(tok)}');\n`;
      }
  }
});
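The loop above relies on two helpers, `stripSemi` and `transformString`, whose bodies are not shown. Going only by the comments in the code, a minimal sketch could look like the following; the exact behavior of the original helpers is an assumption.

```typescript
// Assumed behavior: drop trailing semicolons and whitespace so that the
// expression can be placed inside __append(...).
function stripSemi(code: string): string {
  return code.replace(/[;\s]+$/, "");
}

// Assumed behavior: make arbitrary text safe to embed between single quotes
// in generated JavaScript. Backslashes must be escaped first, otherwise the
// backslashes added by the later replacements would be escaped again.
function transformString(text: string): string {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'")
    .replace(/\r/g, "\\r")
    .replace(/\n/g, "\\n");
}

console.log(JSON.stringify(stripSemi(" title; "))); // " title"

// Round trip: the generated string literal evaluates back to the original text.
const original = "it's\na \\ test";
const literal = "'" + transformString(original) + "'";
const roundTrip = new Function("return " + literal + ";")();
console.log(roundTrip === original); // true
```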

After the above transformation, we can get the result like this:

;__append('\n<html>\n  <head>');
;__append(escapeFn( title ));
;__append('</head>\n  <body>\n    ');
;__append('<%');
;__append(' escape ');
;__append('%>');
;__append('\n    ');
;__append('\n    ');
;__append( before );
;__append('\n    ');
; if (show) { 
;__append('\n      <div>root</div>\n    ');
; } 
;__append('\n  </body>\n</html>\n');

The last step is to generate the function

Now we wrap the transformation result in a function:

wrapit() {
    this.source = `\
const __out = [];
const __append = __out.push.bind(__out);
with(local||{}) {
${this.source}
}
return __out.join('');\
`;
    this.fn = new Function("local", "escapeFn", this.source);
  }

The with statement wraps the generated code so that properties of the local object can be accessed without a qualifying prefix, for example title instead of local.title.
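The effect of `with` can be seen in a tiny standalone sketch (the function `greet` is hypothetical). Note that `with` is a syntax error in strict mode, and therefore in ES modules and class bodies; it works here because the body passed to `new Function` is compiled as non-strict code.

```typescript
// The function body is a plain string, compiled as non-strict JavaScript,
// so the `with` statement is allowed inside it.
const greet = new Function(
  "local",
  "with (local || {}) { return 'Hello, ' + title + '!'; }"
);

// `title` resolves to local.title without any prefix.
console.log(greet({ title: "EJS" })); // Hello, EJS!

// A name that is neither on `local` nor global fails at render time.
let errorName = "";
try {
  greet({});
} catch (err) {
  errorName = (err as Error).name;
}
console.log(errorName); // ReferenceError
```

The second call shows the trade-off: a template that references a value missing from the object throws when rendered, not when compiled.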

The render method is simple, calling the function wrapped above directly:

  render(local: object) {
    // `escape` is assumed to be an HTML-escaping helper in scope, not the global escape()
    return this.fn.call(null, local, escape);
  }
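The `escape` function passed as `escapeFn` is not shown in the article. Since `<%= %>` is meant to escape HTML, a minimal sketch of such a helper might look like this; the name `escapeHtml` and the exact set of escaped characters are assumptions.

```typescript
// Characters that are unsafe in HTML text and attribute values.
const HTML_ENTITIES: { [ch: string]: string } = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: converts any value to a string and replaces the
// unsafe characters with their entities.
function escapeHtml(value: unknown): string {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<a href="x">T&J</a>'));
// &lt;a href=&quot;x&quot;&gt;T&amp;J&lt;/a&gt;
```

The built-in global `escape()` is not a substitute: it percent-encodes characters for URLs and does not produce HTML entities.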

Run

const temp = new Template(`
<html>
  <head><%= title %></head>
  <body>
    <%% escape %%>
    <%# here is the comment %>
    <%- before %>
    <% if (show) { %>
      <div>root</div>
    <% } %>
  </body>
</html>
`);

temp.compile();
temp.render({ show: true, title: "hello", before: "<div>xx</div>" });
// <html>
//   <head>hello</head>
//   <body>
//     <% escape %>
//
//     <div>xx</div>
//
//       <div>root</div>
//
//   </body>
// </html>

You can run the complete code in CodeSandbox:

Conclusion

This article was inspired by the-super-tiny-compiler and implements a minimalist template engine. A template engine is, in essence, a compiler. As the steps above show, compiling a template takes three stages:

Parsing: parses the template code into an abstract representation. Complex compilers split this into lexical analysis and syntactic analysis.

Lexical analysis: the process above of parsing the template content into tokens can be regarded as lexical analysis. It splits the source code into an array of tokens, small units that each represent an independent syntax fragment.

Syntactic analysis: the parser takes the token array and restructures it into an abstract syntax tree (AST), which describes the syntax units and the relationships between them. Syntax errors can be detected at this stage.

(Photo credit: ruslanspivak.com/lsbasi-part…)

The template engine described in this article does not need an AST as an intermediate representation because its syntax is so simple; it converts the tokens directly.
Transformation: converts the abstract representation from the previous step into the form the compiler wants. The template engine above, for example, converts tokens into statements of the target language. More complex compilers transform the AST, that is, they add, delete and modify its nodes, typically traversing them with the visitor pattern.
Code generation: turns the transformed representation into new code. The template engine above, for example, wraps the statements in a render function in this last step. More complex compilers convert the AST into target code.
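The three stages can be put side by side in one compact function. This is a condensed sketch, not the article's `Template` class: `compileTemplate` and `escapeHtml` are hypothetical names, there is no syntax check, and `JSON.stringify` is swapped in to produce string literals in place of a hand-written escaping helper.

```typescript
type Render = (local?: object) => string;

const TAGS = /(<%%|%%>|<%=|<%-|<%#|<%|%>)/;

// Hypothetical minimal HTML escaper used for <%= %>.
function escapeHtml(value: unknown): string {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function compileTemplate(template: string): Render {
  // 1. Parsing: split the template into tag tokens and content tokens.
  const tokens = template.split(TAGS).filter((tok) => tok !== "");

  // 2. Transformation: turn each content token into a JavaScript statement,
  //    depending on the tag that precedes it.
  let mode = "text";
  let body = "";
  for (const tok of tokens) {
    switch (tok) {
      case "<%": mode = "eval"; break;
      case "<%=": mode = "escaped"; break;
      case "<%-": mode = "raw"; break;
      case "<%#": mode = "comment"; break;
      case "%>": mode = "text"; break;
      case "<%%": mode = "text"; body += ";__append('<%');\n"; break;
      case "%%>": mode = "text"; body += ";__append('%>');\n"; break;
      default:
        if (mode === "eval") body += ";" + tok + "\n";
        else if (mode === "escaped") body += ";__append(escapeFn(" + tok + "));\n";
        else if (mode === "raw") body += ";__append(" + tok + ");\n";
        else if (mode === "text") body += ";__append(" + JSON.stringify(tok) + ");\n";
        // mode === "comment": emit nothing
    }
  }

  // 3. Code generation: wrap the statements in a function.
  const source =
    "const __out = [];\n" +
    "const __append = (s) => __out.push(s);\n" +
    "with (local || {}) {\n" + body + "}\n" +
    "return __out.join('');";
  const fn = new Function("local", "escapeFn", source);
  return (local) => fn(local, escapeHtml);
}

const render = compileTemplate(
  "<h1><%= title %></h1><% if (show) { %><p><%- body %></p><% } %>"
);
console.log(render({ title: "a<b", show: true, body: "<i>x</i>" }));
// <h1>a&lt;b</h1><p><i>x</i></p>
```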

Compilers are really interesting. When I get the chance, I will write about how to build a Babel plugin.

Further reading

EJS source code
the-super-tiny-compiler
Let’s Build A Simple Interpreter. Part 7: Abstract Syntax Trees
