Recently, in my day-to-day work, I received a requirement to implement a code editor on a web page. The editor has to support the govaluate syntax (see the govaluate documentation for an introduction) and provide the basic interactions of a code editor: code completion, keyword highlighting, error detection, hover hints, automatic formatting, and so on. Code editors on the web are commonly built on the Monaco Editor, which has built-in support for major programming languages such as JS, Java, and Go. However, the language in this requirement is the expression syntax of a Go library, which Monaco does not support, so we need Monaco's custom language capability to meet the requirement.
After some research, and after looking at code editors that implement their own languages in a similar way, I learned about ANTLR, a library that can be used to build a custom language for Monaco.
About ANTLR
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator written in Java. It was introduced in 1989 by Dr. Terence Parr of the University of San Francisco and others, and it is now in its fourth generation, hence the name ANTLR4. The tool itself is written in Java, but the parsers it generates can target mainstream programming languages, including JS and TS, which makes ANTLR4 one of the most widely used parser generators.
For a more detailed article on ANTLR and its usage, interested readers can refer to "The practice of compiling technology on the front end (II) – ANTLR and its application".
Note
This article does not go into detail about using the Monaco Editor itself. Instead, it shows how to use ANTLR to implement a custom Monaco language. For Monaco usage, see the Monaco Editor website.
Technology selection
React + TypeScript + antlr4ts + react-monaco-editor.
Initialize the project and install dependencies
npm i react-monaco-editor
npm i antlr4ts
npm i antlr4ts-cli -D
Using the Monaco Editor
import React from 'react';
import './App.css';
import MonacoEditor from 'react-monaco-editor';

function App(): JSX.Element {
  return (
    <div className="App">
      <MonacoEditor
        width={800}
        height={600}
        options={{
          fontSize: 20,
        }}
        language="javascript"
        theme="vs-dark"
      />
    </div>
  );
}

export default App;
To use the Monaco Editor, you also need to install monaco-editor-webpack-plugin.
Write a G4 file to generate a parser
Because this is a small demo, we use the simplest possible syntax: addition, subtraction, multiplication, and division. In a G4 file it looks something like this
See the article on ANTLR mentioned above for details on how to write a G4 file. In short, the file defines the lexical rules and the grammar rules. The lexical rules cover addition, subtraction, multiplication, division, the equals sign, parentheses, and numbers. The grammar rules cover parenthesized expressions, addition and subtraction, and multiplication and division. After writing the G4 file, use antlr4ts-cli to generate the parser
npx antlr4ts -visitor src/parser/calc.g4
After running this command, you can see that several files have been generated
You can look through these files to get a sense of what each one does.
Implement keyword highlighting
Monaco implements highlighting through the monaco.languages.setTokensProvider API. We only need to find the position of each token in the text and assemble the result into the data structure Monaco expects, so a lexer is all that is required. For our calculation-expression syntax, we only highlight numbers and operators.
Implement the TokensProvider class
Start by declaring the class that Monaco needs for highlighting
import * as monaco from 'monaco-editor/esm/vs/editor/editor.api';

function getTokens(input: string): monaco.languages.IToken[] {
  // Placeholder for now; the analysis logic is filled in later
  return [];
}

function tokenForLine(input: string) {
  const tokens = getTokens(input);
  return { tokens, endState: new State() };
}

// Every line is tokenized independently, so all states are treated as equal
class State implements monaco.languages.IState {
  clone(): monaco.languages.IState {
    return new State();
  }
  equals(other: State): boolean {
    return true;
  }
}

export class TokensProviders implements monaco.languages.TokensProvider {
  tokenize(line: string, state: State): monaco.languages.ILineTokens {
    return tokenForLine(line);
  }
  getInitialState(): monaco.languages.IState {
    return new State();
  }
}
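The provider only takes effect once it is registered for a language id. Below is a minimal sketch of that wiring. The language id `calc` and the function name `registerCalcTokens` are names chosen here for illustration, and the code is written against a small structural interface (an assumption that mirrors the two `monaco.languages` calls it uses) so it can be exercised without a browser.

```typescript
// Minimal structural view of the two monaco.languages calls used below.
interface LanguagesApi<P> {
  register(language: { id: string }): void;
  setTokensProvider(languageId: string, provider: P): unknown;
}

// 'calc' is a hypothetical language id chosen for this sketch.
const CALC_LANGUAGE_ID = 'calc';

function registerCalcTokens<P>(languages: LanguagesApi<P>, provider: P): void {
  // Register the language id first, then attach the tokens provider to it.
  languages.register({ id: CALC_LANGUAGE_ID });
  languages.setTokensProvider(CALC_LANGUAGE_ID, provider);
}
```

In the app this would presumably be called once as `registerCalcTokens(monaco.languages, new TokensProviders())`, with `language="calc"` passed to the MonacoEditor component.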
The main analysis logic lives in the getTokens function. For the format it must return, see the IToken reference documentation. First, we need to work out which positions in the incoming text match the lexical rules we configured. We use the calcLexer class to get the token stream for the text.
import { CharStreams } from 'antlr4ts';
import { calcLexer } from '../parser/calcLexer';
// Initialize the lexer
const chars = CharStreams.fromString(input);
const lexer = new calcLexer(chars);
lexer.removeErrorListeners();
// Get the token stream
const tokens = lexer.getAllTokens();
console.log(tokens)
Let's type 1+1=2 in the editor and see what is printed
It prints an array of tokens. Let's expand the first token and see what it contains
It holds the position of each lexical unit in the input along with its type. The type is a numeric index and needs to be converted to a rule name
const type = lexer.ruleNames[token.type - 1];
This tells us that the first token has the type NUMBER. Since addition, subtraction, and so on are all operators, we map them all to the same type before passing them to Monaco
export const TokenMap: Record<string, string> = {
  ADD: 'operator',
  SUB: 'operator',
  DIV: 'operator',
  MUL: 'operator',
  EQUAL: 'operator',
  OpenParen: 'operator',
  CloseParen: 'operator',
  NUMBER: 'keyword',
  UnexpectedCharacter: '',
};
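To make the index conversion concrete, here is a small self-contained sketch. The `ruleNames` array is an assumption about the order in which the grammar declares its lexer rules; the real order comes from the generated calcLexer. ANTLR token types start at 1 while the array is zero-based, hence the `- 1`. The names `scopeByRule` and `scopeForTokenType` are hypothetical.

```typescript
// Assumed rule order; the real one comes from the generated calcLexer.
const ruleNames: string[] = [
  'ADD', 'SUB', 'MUL', 'DIV', 'EQUAL',
  'OpenParen', 'CloseParen', 'NUMBER', 'UnexpectedCharacter',
];

// Same idea as TokenMap: several rule names collapse into one scope.
const scopeByRule: Record<string, string> = {
  ADD: 'operator',
  SUB: 'operator',
  MUL: 'operator',
  DIV: 'operator',
  EQUAL: 'operator',
  OpenParen: 'operator',
  CloseParen: 'operator',
  NUMBER: 'keyword',
};

// Convert a 1-based ANTLR token type into the scope name handed to Monaco.
function scopeForTokenType(tokenType: number): string {
  const ruleName = ruleNames[tokenType - 1];
  if (ruleName === undefined) {
    return ''; // EOF (-1) or any unknown type gets no scope
  }
  return scopeByRule[ruleName] ?? '';
}
```

The `- 1` trick relies on every lexer rule producing a token type in declaration order. If that stops being true for your grammar, `lexer.vocabulary.getSymbolicName(token.type)` is likely the more robust lookup.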
We can also catch characters that match none of the configured lexical rules and mark them in red
const errors: number[] = [];
lexer.addErrorListener({
  syntaxError(_1, _2, _3, charPositionInLine: number) {
    errors.push(charPositionInLine);
  },
});
Finally, we configure a Monaco theme with colors for these token types to see the highlighting
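The theme configuration itself is not shown, so here is a minimal sketch of what it might look like. The theme name `calcTheme` and the colors are placeholders, and the two interfaces are local mirrors of the theme data fields used here (an assumption that keeps the sketch free of imports). The `token` values match the scopes returned by the tokens provider: `operator`, `keyword`, and `error`.

```typescript
// Local mirror of the theme data fields used in this sketch.
interface ThemeRule {
  token: string;
  foreground?: string;
  fontStyle?: string;
}
interface ThemeData {
  base: 'vs' | 'vs-dark' | 'hc-black';
  inherit: boolean;
  rules: ThemeRule[];
  colors: Record<string, string>;
}

// Hypothetical theme name and placeholder colors.
const CALC_THEME_NAME = 'calcTheme';

const calcTheme: ThemeData = {
  base: 'vs-dark',
  inherit: true,
  rules: [
    { token: 'operator', foreground: 'C586C0' },
    { token: 'keyword', foreground: 'B5CEA8' },
    { token: 'error', foreground: 'FF0000', fontStyle: 'underline' },
  ],
  colors: {},
};
```

With Monaco loaded, this would be registered with `monaco.editor.defineTheme(CALC_THEME_NAME, calcTheme)` and selected by passing `theme="calcTheme"` to the editor component.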
Complete getTokens code
function getTokens(input: string) {
  const lexer = createLexer(input); // Lexer initialization wrapped in a helper function
  // Catch lexical errors
  const errors: number[] = [];
  lexer.removeErrorListeners();
  lexer.addErrorListener({
    syntaxError(_1, _2, _3, charPositionInLine: number) {
      errors.push(charPositionInLine);
    },
  });
  // Get the token stream
  const tokens = lexer.getAllTokens();
  console.log(tokens);
  const res: monaco.languages.IToken[] = tokens.map(token => {
    const type = lexer.ruleNames[token.type - 1];
    const typeName = TokenMap[type] || TokenMap.UnexpectedCharacter;
    return {
      scopes: typeName,
      startIndex: token.charPositionInLine,
    };
  });
  // Add the caught errors to the result
  errors.forEach(point => res.push({ scopes: 'error', startIndex: point }));
  return res;
}
This completes keyword highlighting using the lexer. Real requirements can be handled more flexibly, for example by treating a word followed by an opening parenthesis as a function.
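One detail of getTokens is worth a closer look: error tokens are appended after the regular tokens, so the resulting array is not necessarily ordered by column. I am assuming here that Monaco expects line tokens in ascending `startIndex` order, which is worth verifying against the version you use. A minimal sketch of a merge step that keeps the order, with `LineToken` as a local mirror of the IToken shape and `mergeTokens` as a hypothetical helper name:

```typescript
// Local mirror of the IToken fields used here.
interface LineToken {
  startIndex: number;
  scopes: string;
}

// Merge regular tokens with error positions and order the result by column.
function mergeTokens(tokens: LineToken[], errorColumns: number[]): LineToken[] {
  const errorTokens = errorColumns.map(column => ({ startIndex: column, scopes: 'error' }));
  // The sort is stable, so on equal columns the error token comes after the regular one.
  return [...tokens, ...errorTokens].sort((a, b) => a.startIndex - b.startIndex);
}
```

Inside getTokens, the final `errors.forEach(...)` step could then be replaced by `return mergeTokens(res, errors);`.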
Implement hover hints
Hover hints are implemented with the parser. As before, start by implementing the hover class
Implement HoverProvider class
export class HoverProvider implements monaco.languages.HoverProvider {
  provideHover(model: monaco.editor.IModel, position: monaco.Position) {
    return {
      contents: [],
    };
  }
}
For the format that provideHover must return, see the Monaco documentation. We use the parser to turn the text into an AST (syntax tree) and then use the corresponding method to find out which token the mouse is over. First, generate the AST
export const getParser = (input: string) => {
  const lexer = createLexer(input); // Initialize the lexer
  const tokenStream = new CommonTokenStream(lexer);
  const parser = new calcParser(tokenStream);
  parser.removeErrorListeners();
  lexer.removeErrorListeners();
  return parser;
};

export const getAST = (input: string) => {
  const parser = getParser(input);
  const ast = parser.start();
  return ast;
};
How do we analyze the generated AST? We use the ParseTreeWalker provided by antlr4ts
import { ParseTreeWalker } from 'antlr4ts/tree/ParseTreeWalker';
ParseTreeWalker.DEFAULT.walk(finder, AST); // Walk the AST
The finder is simply a callback class that implements the calcListener interface. Whenever the walker enters a grammar rule, the corresponding callback is invoked.
class HoverFinder implements calcListener {
  result?: {
    range: monaco.Range;
    type: string;
    name?: string;
  };
  private position: monaco.Position;
  constructor(position: monaco.Position) {
    this.position = position;
  }
  enterNumber(ctx: NumberContext) {
    console.log(ctx);
  }
}
Let's print ctx and see what it contains
We can get the token through the start property, and from it the location of the keyword. Then we use monaco.Range.containsPosition to check whether the mouse position falls inside that range.
const getRangeFromToken = (input: Token) => {
  const startLineNumber = input.line;
  const startColumn = input.charPositionInLine + 1;
  const length = input.text?.length || 1;
  return new monaco.Range(startLineNumber, startColumn, startLineNumber, startColumn + length);
};

// Inside the HoverFinder class
enterNumber(ctx: NumberContext) {
  if (!this.result) {
    console.log(ctx);
    const range = getRangeFromToken(ctx.start);
    const matched = monaco.Range.containsPosition(range, this.position);
    if (matched) {
      this.result = {
        range,
        type: 'number',
        name: ctx.start.text,
      };
    }
  }
}
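The coordinate conversion is easy to get wrong, so here is a self-contained sketch of the same arithmetic. ANTLR reports 1-based lines and 0-based columns, while Monaco uses 1-based values for both, hence the `+ 1`. The interfaces are local stand-ins for the ANTLR token and the Monaco range and position, and `contains` is a simplified single-line version of the containment check that I assume is inclusive at both ends, as `monaco.Range.containsPosition` appears to be.

```typescript
interface TokenLike { line: number; charPositionInLine: number; text?: string }
interface RangeLike { startLineNumber: number; startColumn: number; endLineNumber: number; endColumn: number }
interface PositionLike { lineNumber: number; column: number }

// Convert an ANTLR token position into a 1-based, single-line range.
function rangeFromToken(token: TokenLike): RangeLike {
  const startColumn = token.charPositionInLine + 1;
  const length = token.text?.length || 1;
  return {
    startLineNumber: token.line,
    startColumn,
    endLineNumber: token.line,
    endColumn: startColumn + length,
  };
}

// Simplified containment check for single-line ranges, inclusive at both ends.
function contains(range: RangeLike, position: PositionLike): boolean {
  return (
    position.lineNumber === range.startLineNumber &&
    position.column >= range.startColumn &&
    position.column <= range.endColumn
  );
}
```

For the input `1+12`, the token `12` starts at charPositionInLine 2, so it covers columns 3 to 5, and hovering anywhere in that span matches.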
The hover popup is then triggered depending on whether the finder has a result. The complete code is
import { Token } from 'antlr4ts';
import { ParseTreeWalker } from 'antlr4ts/tree/ParseTreeWalker';
import * as monaco from 'monaco-editor/esm/vs/editor/editor.api';
import { getAST } from '../common';
import { calcListener } from '../parser/calcListener';
import { NumberContext } from '../parser/calcParser';

export class HoverProvider implements monaco.languages.HoverProvider {
  provideHover(model: monaco.editor.IModel, position: monaco.Position) {
    const content = model.getValue();
    const AST = getAST(content || '');
    const finder = new HoverFinder(position);
    ParseTreeWalker.DEFAULT.walk(finder, AST); // Traverse the AST
    const { result } = finder;
    // result stays undefined when the mouse is not over a number
    if (result?.type === 'number') {
      return {
        contents: [{ value: `Number ${result.name}` }],
        range: result.range,
      };
    }
    return {
      contents: [],
    };
  }
}

const getRangeFromToken = (input: Token) => {
  const startLineNumber = input.line;
  const startColumn = input.charPositionInLine + 1;
  const length = input.text?.length || 1;
  return new monaco.Range(startLineNumber, startColumn, startLineNumber, startColumn + length);
};

class HoverFinder implements calcListener {
  result?: {
    range: monaco.Range;
    type: string;
    name?: string;
  };
  private position: monaco.Position;
  constructor(position: monaco.Position) {
    this.position = position;
  }
  enterNumber(ctx: NumberContext) {
    if (!this.result) {
      console.log(ctx);
      const range = getRangeFromToken(ctx.start);
      const matched = monaco.Range.containsPosition(range, this.position);
      if (matched) {
        this.result = {
          range,
          type: 'number',
          name: ctx.start.text,
        };
      }
    }
  }
  visitErrorNode() {
    // Present only so that the TypeScript types check out
  }
}
The effect
Implement error capture
Error detection uses the monaco.editor.setModelMarkers API, and errors need to be checked in real time as the text changes. We implement a validate function that is called whenever the text changes. It returns an array describing the location and message of each error, and we use setModelMarkers to mark them in the editor. Both lexical and syntax errors are detected. The code is as follows
import { CommonTokenStream, Token } from 'antlr4ts';
import * as monaco from 'monaco-editor/esm/vs/editor/editor.api';
import { createLexer } from '../common';
import { calcParser } from '../parser/calcParser';

const getPositionByToken = (token: Token) => ({
  startLineNumber: token.line,
  startColumn: token.charPositionInLine + 1,
  endLineNumber: token.line,
  endColumn: token.charPositionInLine + (token.text?.length || 0) + 1,
});

export const validate = async (model: monaco.editor.IModel) => {
  let content = '';
  try {
    content = model.getValue();
    console.log(content);
  } catch {
    monaco.editor.setModelMarkers(model, 'ruleLint', []);
    return;
  }
  if (!content.trim()) {
    monaco.editor.setModelMarkers(model, 'ruleLint', []);
    return;
  }
  const lexer = createLexer(content);
  const tokenStream = new CommonTokenStream(lexer);
  const parser = new calcParser(tokenStream);
  lexer.removeErrorListeners();
  parser.removeErrorListeners();
  const errors: monaco.editor.IMarkerData[] = [];
  // Collect lexical and syntax errors
  lexer.addErrorListener({
    syntaxError(_1, _2, line, charPositionInLine, msg, _6) {
      errors.push({
        message: msg,
        severity: monaco.MarkerSeverity.Error,
        source: 'validator',
        startLineNumber: line,
        startColumn: charPositionInLine + 1,
        endLineNumber: line,
        endColumn: charPositionInLine + 2,
        code: 'lexer',
      });
    },
  });
  parser.addErrorListener({
    syntaxError(_1, offendingSymbol, _3, _4, msg, _6) {
      if (offendingSymbol) {
        errors.push({
          message: msg,
          severity: monaco.MarkerSeverity.Error,
          source: 'validator',
          code: 'parser',
          ...getPositionByToken(offendingSymbol),
        });
      }
    },
  });
  parser.start();
  return errors;
};
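The code above produces the markers, but the part that calls it when the text changes is not shown. Here is a minimal sketch of that wiring. To stay self-contained it assumes a synchronous validator that always returns an array, and it takes the marker-setting function as a parameter; `attachValidation` and the two interfaces are hypothetical names, with `onDidChangeContent` mirroring the change event of a Monaco model.

```typescript
interface Disposable { dispose(): void }
interface ChangeSource {
  onDidChangeContent(listener: () => void): Disposable;
}

// Re-run validation on every content change and publish the resulting markers.
function attachValidation<M extends ChangeSource, K>(
  model: M,
  validate: (model: M) => K[],
  setMarkers: (model: M, owner: string, markers: K[]) => void,
  owner = 'ruleLint',
): Disposable {
  const run = () => setMarkers(model, owner, validate(model));
  run(); // Validate the initial content once
  return model.onDidChangeContent(run);
}
```

With the async validate from this article, the body of `run` would instead be something like `validate(model).then(markers => monaco.editor.setModelMarkers(model, 'ruleLint', markers || []))`. For large inputs, debouncing the listener is probably worthwhile.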
Of course, you can also build on the parser-based approach used for hover above to report errors specific to your own language, such as an undefined variable or a wrong number of function arguments.
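As an illustration of such a language-specific check, here is a sketch of a hypothetical rule for the calc demo: flagging division by a literal zero. To stay self-contained it scans a flat list of token-like objects instead of the parse tree; in the real project the same check would more naturally live in a listener callback. All names here (`CalcToken`, `MarkerLike`, `findDivisionByZero`) are made up for the example.

```typescript
interface CalcToken { type: string; text: string; line: number; charPositionInLine: number }
interface MarkerLike {
  message: string;
  startLineNumber: number;
  startColumn: number;
  endLineNumber: number;
  endColumn: number;
}

// Hypothetical semantic rule: report a literal zero directly after a DIV token.
function findDivisionByZero(tokens: CalcToken[]): MarkerLike[] {
  const markers: MarkerLike[] = [];
  for (let i = 1; i < tokens.length; i++) {
    const previous = tokens[i - 1];
    const current = tokens[i];
    if (previous.type === 'DIV' && current.type === 'NUMBER' && Number(current.text) === 0) {
      markers.push({
        message: 'Division by zero',
        startLineNumber: current.line,
        // ANTLR columns are 0-based, marker columns are 1-based
        startColumn: current.charPositionInLine + 1,
        endLineNumber: current.line,
        endColumn: current.charPositionInLine + current.text.length + 1,
      });
    }
  }
  return markers;
}
```

The returned objects have the same position fields as the markers built in validate, so they could be given a severity and pushed into the same errors array.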
Conclusion
With this parser we can do much more; in every case the work amounts to assembling the parse results into the format Monaco expects. I won't demonstrate the other features here, but if you're interested, you can explore them yourself. I believe ANTLR will play a big role in front-end development.