Starting to zhihu: zhuanlan.zhihu.com/p/427309936
preface
This is a Chinese chess program with TypeScript type operations. I live in Bengbu, so I can directly take out my own TypeScript type gymnastics ceiling and implement a Lisp interpreter with TypeScript types.
I declare an acc(x) function that increments recursively from x to 0. For example, acc(3) is 3 + 2 + 1 + 0 = 6.
type _Test = EvalProgram<` (def zero 0) (def acc (x) (if (== x zero) x (+ x (acc (- x 1))) ) ) (acc 3) `>
Copy the code
We already know that (acc 3) is equal to 6, so let’s verify that we use the type to calculate it. It is really 6:
You can just click on the link below to play it online, try it out and come back to see how it works: TypeScript Playground
The basic structure
We build a simple Lisp interpreter in two main steps:
- Parsers – Parses strings into abstract syntax trees (AST)
-
- Lexical analysis – Parsing string into an array of tokens (can be interpreted as breaking up into separate words)
- Parsing — Parses the above Token array to generate an abstract syntax tree (AST)
- Evaluator – Evaluates the abstract syntax tree to the final result
-
- instruction
-
- Declare a variable
- Declare functions
- evaluation
-
- Evaluation of basic operational expressions, such as:
+
、-
、= =
、! =
- Conditional expression evaluation:
if
- Function call expression evaluation, such as:
(acc 3)
- Evaluation of basic operational expressions, such as:
So the code that we interpret in Lisp is actually broken up into these chunks, and you can look at them as you look at them.
Basic knowledge supplement
Before starting to implement our Lisp interpreter, we need to add some basic functional language knowledge and skills. Although the front end is always functional, there are very few front ends with pure functional languages, so it is necessary to supplement the basic skills.
TypeScript types are a separate “pure functional programming language”
- Q: Why can we do so many things with TypeScript Type? For example, in this article, write a Lisp interpreter?
- A: Because TypeScript types are A separate “pure functional programming language.” So we need to look at TypeScript types the way we look at a programming language to understand why they do so much.
Here’s a very simple example to rediscover TypeScript types. The following JavaScript code does the same thing and gets the same results as TypeScript type code:
/ / javascript implementation
const foo = (a, b) = > a + b;
const result = bar(1.2) / / = > 3
// typescript type implementation
type Foo<A extends number,B extends number>
= Add<A, B>
type Result = Foo<1.2> / / = > 3
Copy the code
Try the example above online: TS Playground
Type and value
A type can be thought of simply as a collection of values that contain only one value, such as the literal type 1. Hello may have an infinite number of values, such as STRNG, number, and so on. When we do type operations, for example, Add<1, 2>, we expect the result to be 3 instead of number, so we separate the “type” and “value” from the type when we do type operations:
For example, here is a comparison of common examples:
Value (type containing only one element) | Type (a type that contains more than one element) |
---|---|
type A = 1 | Type AType = number, 1 |
type A = ‘hello’ | Type AType = string, ‘foo’ |
type A = [1, 2, ‘hello’] | AType = number[], [number, string] |
type A = { hello: ‘world’ } | type AType = { hello: string } |
Plus two common types:
- Any – complete
- Never — an empty set
Extends and branching conditions
Finally, there is an extends operation. A extends B? The meaning of C: D is to determine whether set A is A subset of set B, if so then return C otherwise return D.
Extends has two main uses when doing an operation:
< value > extends < value >? :
To determine whether two values are equal, you can use javascript analogies< value > === < value >? :
< value > extends < type >? :
To determine whether a value belongs to a type, the analogy isTypeof < value > === < type >? :
Here’s an example:
type Func<T> =
T / * * / extends string / * type * /? (
T / * * / extends 'foo' / * * / ? 'is string foo'
: T / * * / extends 'bar' / * * / ? 'is string bar'
: 'unexpected string value'
)
: T / * * / extends number / * type * /? (
T / * * / extends 0 / * * /? 'is number 0'
: T / * * / extends 1 / * * /? 'is number 1'
: 'unexpected number value'
)
: 'unexpected value type';
Copy the code
Online experience: TypeScript Playground
Variable declarations
- through
type T = ...
Declare global variables (TS file scope) - through
A extends infer B
Declare local variables, analogous to those in purely functional languageslet B = A in xxx
.
Here’s an example:
type A = 'hello'; // Declare global variables
type B = [A] extends infer T ? (
T // => < span style = "max-width: 100%; clear: both;
) : never // Declare local variables
Copy the code
Function declarations and calls
Thinking of TypeScript generics as function declarations and calls is an analogy to JavaScript function declarations:
type Func<A extends number, B extends string = 'hello'> = [A, B]
// ↑ and ↑ are used
// Function name Parameter Type Parameter Type Default Value Function body
type Test = Func<10.'world'> // => [10, 'world']
function Func(A: number, B: string = 'hello') { return [A, B] }
const Test = Func(10.'world') // => [10, 'world']
Copy the code
The absence of higher-order functions does not affect expressiveness
It’s not accurate to think of TypeScript types as a purely functional programming language. TypeScript types, for example, lack the signature ability of first-class-function, which means they have no higher-order functions, but that doesn’t affect their expressiveness. If we pass an environment (closure) as an argument to a function, that means we don’t need higher-order functions to achieve the closure effect.
How expensive is it to generate a JS function in modern browsers? Does the React hooks design affect performance by generating new functions frequently? 237 Agree · 9 Comment Answered
Pattern matching
We know that extends can be followed by a infer that can be used to extract variables. Normally, this is analogous to JavaScript deconstruction. In fact, this is called pattern matching. Here are a few common techniques:
Used to match an array and the first element of a string. Often used to recurse each element in an array:
// Handle common array matches recursively
type Test0 = [0.1.2] extends [infer Head, ...infer Tail]
? { head: Head, tail: Tail }
: never;
// Handle common string matches recursively
type Test1 = `abcdefg` extends `${infer Head}${infer Tail}`
? { head: Head, tail: Tail }
: never;
Copy the code
Matches multiple variables at once by constructing a new structure:
type Func<A, B> = [A, B] extends [1.2]?'match 1 2'
: 'not match';
Copy the code
Avoid type loss of the Infer variable by constructing a new structure with type:
type OnlyNumber<N extends number> = N;
type Tmp<A extends number> = { a: A };
type Test0 = { a: 10 } extends Tmp<infer A> Infer A can only be of type number
? OnlyNumber<A> / / is not an error
: never;
Copy the code
Recursion and tail recursion
I covered some basic grammar and techniques in the previous section, but here’s the problem. Where did the loop go? Here’s an anti-common sense concept:
Recursion is the same thing as a loop! So in purely functional programming languages recursion is often used instead of looping.
And the following inferences can be obtained:
- Any loop can be implemented recursively
- Any recursion can be implemented in a loop
Ordinary recursive
Without further elaboration, use the following two examples to illustrate the implementation of TypeScript recursive operations:
// go through the number group
type ArrayStuct<Head extends string, Tail extends string[]> = [Head, ...Tail];
type Join<Arr extends string[], Sp extends string>
= Arr extends[]?' '
: Arr extends ArrayStuct<infer Head, []> ? Head
: Arr extends ArrayStuct<infer Head, infer Tail> ? `${Head}${Sp}${Join<Tail, Sp>}`
: never;
type Test = Join<['foo'.'bar'.'hello'].', '>
// => type Test = "foo,bar,hello"
Copy the code
/ / traversal tree
type NodeType = string | NodeType[];
type NodeArrayStruct<Head extends NodeType, Tail extends NodeType[]> = [Head, ...Tail]
type NodeArrayToString<NodeArray extends NodeType[]>
= NodeArray extends NodeArrayStruct<infer Head, []> ? NodeToString<Head>
: NodeArray extends NodeArrayStruct<infer Head, infer Tail> ? `${NodeToString<Head>}.${NodeArrayToString<Tail>}`
: ' ';
type NodeToString<Node extends NodeType>
= Node extends string ? Node
: Node extends NodeType[] ? ` (${NodeArrayToString<Node>}) `
: ' ';
type Test = NodeToString<['123'['456'.'789'], ['1112']] >// => type Test = "(123, (456, 789), (1112))"
Copy the code
Tail recursion & Tail recursion loop & general recursion loop
In purely functional programming languages, there is no way to replace loops with recursion, but there is a very embarrassing problem of “stack bursting”, so functional programming uses tail recursion (tail call) to solve this problem.
TypeScript type operations have only 50 layers in the stack, making it hard to do any complex operations. However, TypeScript supports tail-recursive optimization of type operations in 4.5-Beta, and tail-recursive writing limits can reach 1000 layers. Far more than the original 50 floors. Announcing the TypeScript 4.5 Beta
Why tail recursion can be optimized
General recursive loop method
Loop to tail recursion
As discussed in the tail Recursion section, recursion and loops are actually equivalent, and how to convert recursion/tail recursion into loops. Since this is equivalent, there must be a way to convert the loop to tail recursion.
What exactly is a cycle? It is nothing more than a piece of code that is continuously executed. As in the higher-order function summary above, we give a definition to the loop:
We can code the above definition to get a general way to convert a loop function to a tail recursion. For example, we want to convert the following loop to a tail recursion:
function join(arr) {
const l = arr.length;
let i = 0
let result = ' '
while (i < l) {
result += arr[i]
if(i ! = l -1) {
result += ', '
}
i++
}
return result
}
Copy the code
Write the formula above to convert the above loop to the following tail recursion:
// Loop
const loop = (test, process, env) = > test(env)
? loop(test, process, process(env))
: env
function join(arr) {
const env = {
arr,
l: arr.length,
i: 0.result: ' ',}const newEnv = loop(
env= > env.i < env.l, // Test
env= > { // Process
constnewEnv = {... env} newEnv.result += newEnv.arr[newEnv.i]if(newEnv.i ! = newEnv.l -1) {
newEnv.result += ', '
}
newEnv.i++
return newEnv
},
env // Env
)
return newEnv.result
}
Copy the code
We can convert any Loop to a tail-recursive form, but because TypeScript types don’t support higher-order functions, we can only couple Loop/Process and Test and simplify the Environment. For example, in the tail-recursive version of Join, Loop/Process/Test and Environment are not explicitly defined, but are actually coupled in the function:
// go through the number group
type ArrayStuct<Word extends string, Tail extends string[]> = [Word, ...Tail];
type Join<Arr extends string[], Sp extends string, Result extends string = ' '>
= Arr extends[]? Result : Arrextends ArrayStuct<infer Word, []> ? `${Result}${Word}`
: Arr extends ArrayStuct<infer Word, infer Tail> ? Join<Tail, Sp, `${Result}${Word}${Sp}`>
: never;
type Test = Join<['foo'.'bar'.'hello'].', '>
// => type Test = "foo,bar,hello"
Copy the code
Tail recursion traverses the tree
How do you use tail recursion to traverse a data structure like a tree? Combine what we learned in the last two sections:
- Recursive traversal tree –(general recursive loop)–> loop traversal tree
- Loop through the tree –(loop to tail recursion)–> tail recursion traverses the tree
To reemphasize the point, when iterating through a tree, two dimensions of information need to be recorded in order to know where I am iterating now and where I need to iterate next:
- Vertical (shown in green) : Indicates the depth of my traversal, usually represented by stacks.
- Horizontal (shown in red) : indicates the number of children I have traversed. There are many ways to do this as long as you can mark which children have been traversed and which ones have not.
Here is an example of a simple operation adding and subtracting an expression tree. Although this example has a simpler solution (see prefixed expression operations), here is a more general implementation that preserves more context on the stack:
type OperatorType = '+' | The '-';
type NodeType = number | OpNodeType;
type OpNodeType = [OperatorType, NodeType, NodeType];
type UnitOpNode<Op extends OperatorType, Left extends number, Right extends number>
= OpNode<Op, Left, Right>;
type OpNode<Op extends OperatorType, Left extends NodeType, Right extends NodeType>
= [Op, Left, Right];
type FrameType = NodeType;
type StackType = FrameType[];
type PushStack<Frame extends NodeType, Stack extends StackType> = [Frame, ...Stack];
type Eval<Stack extends StackType>
= Stack extends PushStack<infer Top, infer TailStack> ? (
[Top, TailStack] extends [number, []]? Top// Return the result
: Top extends UnitOpNode<'+', infer L, infer R> ? Eval<PushStack<Add<L, R>, TailStack>>
: Top extends UnitOpNode<The '-', infer L, infer R> ? Eval<PushStack<Sub<L, R>, TailStack>>
: Top extends number ? Eval<Return<Top, TailStack>>
: Top extends OpNode<OperatorType, OpNodeType, NodeType> ? Eval<PushStack<Top[1], Stack>>
: Top extends OpNode<OperatorType, number, OpNodeType> ? Eval<PushStack<Top[2], Stack>>
: never
)
: never;
type Return<Value extends number, Stack extends StackType>
= Stack extends PushStack<infer Top, infer Tail> ? (
Top extends OpNode<OperatorType, OpNodeType, NodeType> ? PushStack<OpNode<Top[0], Value, Top[2]>, Tail>
: Top extends OpNode<OperatorType, number, OpNodeType> ? PushStack<OpNode<Top[0], Top[1], Value>, Tail>
: never
)
: never;
type Test = Eval<[['+'['+'.1.2], [The '-'.4.3]]] >// => Test = 4
Copy the code
Online experience: TypeScript Playground
Implement the Lisp interpreter
Now that we’ve replenished the basics we’re going to implement the interpreter, and we’re going to start implementing the interpretation.
Basic addition, subtraction, multiplication and division
TypeScript types do not operate directly on values, even the simplest 1 + 1. This problem plagued gymnasts for a long time until we discovered that arrays could fetch values from [‘length’]. Since we could not manipulate values directly, But we do this by manipulating arrays instead of operands and evaluating them, and we came up with the idea and one of our brilliant colleagues implemented it, adding, subtracting, multiplying and dividing as follows:
// Basic operations
export type NArray<T, N extends number> = N extends N ? (number extends N ? T[] : _NArray<T, N, []>) : never
type _NArray<T, N extends number, R extends unknown[]> = R['length'] extends N ? R : _NArray<T, N, [T, ...R]>
type NArrayNumber<L extends number> = NArray<number, L>
/ / add
export type Add<M extends number, N extends number> = [...NArrayNumber<M>, ...NArrayNumber<N>]['length']
/ / subtraction
export type Subtract<M extends number, N extends number> =
NArrayNumber<M> extends [...x: NArrayNumber<N>, ...rest: infer R] ? R['length'] : unknown
// It is mainly used for derivation of multiplication and division; Because otherwise it will Subtract the return type for number | unknown error
type _Subtract<M extends number, N extends number> =
NArrayNumber<M> extends [...x: NArrayNumber<N>, ...rest: infer R] ? R['length'] : -1
/ / the multiplication
type _Multiply<M extends number, N extends number, res extends unknown[]> =
N extends 0 ? res['length'] : _Multiply<M, _Subtract<N, 1>, [...NArray<number, M>, ... res]>export type Multiply<M extends number, N extends number> = _Multiply<M, N, []>
/ / division
type _DivideBy<M extends number, N extends number, res extends unknown[]> =
M extends 0 ? res["length"] : _Subtract<M, N> extends -1 ? unknown : _DivideBy<_Subtract<M, N>, N, [unknown, ...res]>
export type DividedBy<M extends number, N extends number> = N extends 0 ? unknown : _DivideBy<M, N, []>;
Copy the code
Lexical analysis
Because Lisp S expressions are very easy to parse, the parsing process is very simple. The core code is as follows. The overall logic is to try each type of Token one by one until the Token can be parsed, and proceed to the next step:
type Tokenize<Input extends string, Tokens extends TokenType[] = []>
= Input extends ' ' ? TokenResult<Tokens, ' '>
: ParseSpace<Input> extends TokenResult<infer R, infer Next> ? Tokenize<Next, [...Tokens, ...R]>
: ParseIdentifier<Input> extends TokenResult<infer R, infer Next> ? Tokenize<Next, [...Tokens, ...R]>
: ParseNumber<Input> extends TokenResult<infer R, infer Next> ? Tokenize<Next, [...Tokens, ...R]>
: ParsePair<Input> extends TokenResult<infer R, infer Next> ? Tokenize<Next, [...Tokens, ...R]>
: ParseSymbol<Input> extends TokenResult<infer R, infer Next> ? Tokenize<Next, [...Tokens, ...R]>
: TokenError<'Unexpted char'>
Copy the code
The ParseXXX implementation is to try a string at the beginning, a string at the beginning, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end, a string at the end.
type ParseIdentifier<Input extends string>
= Input extends `${infer Char}${infer Next}` ? (
Char extends AlphaChars ? _ParseIdentifierBody<Next, Char>
: TokenError<'Can not match identifier'>
)
: TokenError<'Can not match identifier'>;type _ParseIdentifierBody<Input extends string, Matched extends string>
= Input extends `${infer Char}${infer Next}` ? (
Char extends AlphaChars | NumberChars ? _ParseIdentifierBody<Next, `${Matched}${Char}`>
: TokenResult<[Matched], Input>
)
: TokenResult<[Matched], Input>
;
Copy the code
After this process, a string is finally converted to a string of tokens array:
Syntax analysis
The core code of syntax analysis is as follows, and the core process is:
- If you encounter
(
Symbols into the stack - If you encounter
)
The symbol goes off the stack, adding the entire array at the top of the stack as an item to the new array at the top of the stack - If other symbols are encountered, they are added directly to the array at the top of the stack
type NodeType = TokenType | NodeType[];
type _SafeTokens<Tokens> = Safe<Tokens, TokenType[], []>;
type _SafeToken<Token> = Safe<Token, TokenType, 0>;
type NodeStackType = NodeType[][];
type NodeStackPush<Stack extends NodeStackType, Top extends NodeType[],> = [...Stack, Top];
type NodeStackPopResult<Stack extends NodeStackType, Top extends NodeType[]> = { stack: Stack, top: Top };
type NodeStackPop<Stack extends NodeStackType>
= Stack extends [...infer Tail, infer Top] ? NodeStackPopResult<Safe<Tail, NodeStackType, []>, Safe<Top, NodeType[], []>>
: ErrorResult<'Can not pop empty stack'>;
type ParseError = ErrorResult<'ParseError'>;
type _Parse<Tokens extends TokenType[], Stack extends NodeStackType = [[]]>
= Tokens extends [infer Token, ...infer Next] ? (
Token extends '(' ? _Parse<_SafeTokens<Next>, NodeStackPush<Stack, []>>
: NodeStackPop<Stack> extends NodeStackPopResult<infer TailStack, infer Top> ? (
Token extends ') ' ? (
NodeStackPop<TailStack> extends NodeStackPopResult<infer _TailStack, infer _Top>
? _Parse<_SafeTokens<Next>, NodeStackPush<_TailStack, [..._Top, Top]>>
: ParseError
)
: _Parse<_SafeTokens<Next>, NodeStackPush<TailStack, [...Top, Safe<Token, TokenType, 0>]>>
)
: ParseError
)
: Stack;
Copy the code
Declare variables and functions
In the process of executing with code, there needs to be an environment to maintain the variable that the code comes from. For example, we need to execute x + 1. This expression cannot be computed directly, but now we need to find out what x is from the environment. Declare the type of the environment and define how to store variables and functions in the environment:
type Func<Args extends string[], Body extends NodeType> = { args: Args, body: Body }
type FuncType = Func<string[], NodeType>
type ValueType = number | FuncType;
type EnvType = { [name: string]: ValueType }
type MergeEnv<Env extends EnvType, Append extends EnvType> = Omit<Env, keyof Append> & Append;
Copy the code
When the code executes to def, we need to put the declarations of variables and functions in the environment:
type _DefStat<Name extends string, Expr extends NodeType> = ['def', Name, Expr];
type _DefFunc<Name extends string, Args extends string[], Body extends NodeType> = ['def', Name, Args, Body];
type _EvalProgram<Stats extends NodeType[], Env extends EnvType, R extends number>
= Stats extends [infer Stat, ...infer NextStats] ? (
NextStats extends NodeType[] ? (
Stat extends _DefFunc<infer Name, infer Args, infer Body> ? (
_EvalProgram<NextStats, MergeEnv<Env, { [name in Name]: Func<Args, Body> }>, 0>
)
: Stat extends _DefStat<infer Name, infer Expr> ? (
_EvalExpr<[EvalStackFrame<'id', [Expr], [], Env>]> extends EvalResult<infer R, infer _> ? (
_EvalProgram<NextStats, MergeEnv<Env, { [name in Name]: R }>, R>
)
: EvalError<'8'>
)
: Stat extends NodeType ? (
_EvalExpr<[EvalStackFrame<'id', [Stat], [], Env>]> extends EvalResult<infer R, infer _>
? _EvalProgram<NextStats, Env, R>
: EvalError<'7'>) :EvalError<'9'>) :EvalError<'10'>
)
: EvalResult<R, Env>
;
Copy the code
The result is the environment in the env field below:
Evaluate the expression
The expression evaluation here is similar to the recursion and tail recursion section, except that it takes the Env argument, which holds the value of the variable in the current environment. The core code is as follows:
type _EvalExpr<Stack extends EvalStackType>
= EvalStackPop<Stack> extends EvalStackPopResult<infer TailStack, infer Frame> ? (
Frame extends EvalStackFrame<infer Callee, infer Left, infer Right, infer Env> ?
[Callee, Right, Left] extends ['if', [infer Test], [infer Consequent, infer Alternate]] ? (
Test extends 0 ? _Return<Safe<Alternate, NodeType, 0>, TailStack>
: _Return<Safe<Consequent, NodeType, 0>, TailStack>
)
: (Left extends [infer E, ...infer Next] ? (
E extends number ? _EvalExpr<EvalStackPush<TailStack, EvalStackFrame<Callee, Safe<Next, NodeType[], []>, [...Right, E], Env>>>
: E extends string ? (
Env[E] extends number ? _EvalExpr<EvalStackPush<TailStack, EvalStackFrame<Callee, Safe<Next, NodeType[], []>, [...Right, SafeNum<Env[E]>], Env>>>
: EvalError<'0'>
)
: E extends [infer NextCallee, ...infer NextLeft] ? (
NextCallee extends string ? (
NextLeft extends NodeType[] ? (
_EvalExpr<
EvalStackPush<
EvalStackPush<TailStack, EvalStackFrame<Callee, Safe<Next, NodeType[], []>, Right, Env>>,
EvalStackFrame<NextCallee, NextLeft, [], Env>
>
>
) : EvalError<'1'>) :EvalError<'2'>) :EvalError<'3'>
)
: (
[Callee, Right] extends ['+', [infer L, infer R]] ? _Return<SafeNum<Add<SafeNum<L>, SafeNum<R>>>, TailStack>
: [Callee, Right] extends [The '-', [infer L, infer R]] ? _Return<SafeNum<Subtract<SafeNum<L>, SafeNum<R>>>, TailStack>
: [Callee, Right] extends ['! = ', [infer L, infer R]] ? _Return<L extends R ? 0 : 1, TailStack>
: [Callee, Right] extends ['= =', [infer L, infer R]] ? _Return<L extends R ? 1 : 0, TailStack>
: [Callee, Right] extends ['id', [infer V]] ? _Return<Safe<V, NodeType, 0>, TailStack>
: (
Env[Callee] extends Func<infer Names, infer Body>
? _EvalExpr<EvalStackPush<TailStack, EvalStackFrame<'id', [Body], [], FuncEnv<Names, Right, Env>>>>
: EvalError<'4'>))) :EvalError<'5'>) :EvalError<'6'>;
type _Return<Expr extends NodeType, Stack extends EvalStackType>
= EvalStackPop<Stack> extends EvalStackPopResult<infer TailStack, EvalStackFrame<infer Callee, infer Left, infer Right, infer Env>>
? (
Expr extends number ? _EvalExpr<EvalStackPush<TailStack, EvalStackFrame<Callee, Left, [...Right, Expr], Env>>>
: _EvalExpr<EvalStackPush<TailStack, EvalStackFrame<Callee, [Expr, ...Left], Right, Env>>>
) : (
Expr extends number ? EvalResult<Expr, {}>
: EvalError<'7'>)Copy the code
It is important to note that if expressions need to evaluate whether the conditional expression is true before they are executed, so if needs to be processed separately.
Also consider how the function is called? In fact, the function call here is very simple, we just need to replace the function call with an expression with a parameter environment, the process is even simpler than if.
Write in the last
After writing this article, I’m no longer ready to play type gymnastics, I’m really up to the ceiling, and the next step is to dig into the TypeScript type system itself and see if there’s anything else I can do.
End scatter flower ~