What is an AST
An Abstract Syntax Tree (AST) is an abstract representation of the syntactic structure of source code. It takes the form of a tree in which each node represents a construct in the source code.
When JavaScript is compiled, lexical analysis and syntax analysis turn the source into an AST, and the code can then be analyzed and transformed through that tree. Tooling libraries that analyze code, such as webpack, Babel, and ESLint, all rely on an AST behind the scenes.
Using the online tool AST Explorer, we can parse console.log("AST") into the following structure.
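As a rough sketch of that structure, the tree a Babel-style parser produces for console.log("AST") can be written out as a plain object. Position fields (start, end, loc) and other metadata are omitted, and the variable names tree and call are only for illustration.

```javascript
// Abbreviated Babel-style AST for: console.log("AST")
// Position info (start, end, loc) and other metadata are left out.
const tree = {
  type: "File",
  program: {
    type: "Program",
    body: [
      {
        type: "ExpressionStatement",
        expression: {
          type: "CallExpression",
          callee: {
            type: "MemberExpression",
            object: { type: "Identifier", name: "console" },
            property: { type: "Identifier", name: "log" },
            computed: false,
          },
          arguments: [{ type: "StringLiteral", value: "AST" }],
        },
      },
    ],
  },
};

// Walk down to the call and print what is called and with which argument.
const call = tree.program.body[0].expression;
console.log(call.type, call.callee.property.name, call.arguments[0].value);
```

Running it prints CallExpression log AST, which matches the three pieces of the original statement: a call, the log property of console, and the string argument.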
How to manipulate the AST
Here we manipulate the AST with Babel, a JS compiler that converts one form of JS code into another.
At a high level, Babel works in three steps: parse => transform => generate.
Let's start by installing Babel's core package
npm install @babel/core
We will also use the following Node.js packages
// Convert the JS source into a syntax tree
const parser = require("@babel/parser");
// Iterate over and process the AST
const traverse = require("@babel/traverse").default;
// Manipulate nodes, such as determining node types, generating new nodes, etc
const t = require("@babel/types");
// Convert the syntax tree to source code
const generator = require("@babel/generator").default;
// Manipulate files
const fs = require("fs");
The obfuscated code
This is obfuscated code from an airline ticket website. It is 1,854 lines long.
Use AST Explorer to view its AST structure
Parse and process the AST
We use Babel to parse this code into an AST
// Read the obfuscated code (file_path and file_name are assumed to be defined elsewhere)
var jscode = fs.readFileSync(file_path + file_name, {
  encoding: "utf-8"
});
// Convert the code into an AST
let ast = parser.parse(jscode);
Examining the obfuscated code further, we can see that it encrypts all constants and extracts them into an array, and reads them back through a decryption function.
Constant array:
Method for accessing constants in the array:
Call:
We need to extract the _0xe014 method and run it in Node. Note that the method checks its JS environment, so calling it directly in Node throws an error and the missing environment has to be supplemented. Once the environment is complete, we can use the _0xe014 method freely.
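The pattern described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the array contents, the base64 encoding, the browser check, and the names constants and decode are assumptions for illustration, since the real _0xe014 and its environment checks are not reproduced in this post.

```javascript
// Hypothetical stand-in for the extracted constant array.
// Base64 only keeps the sketch short; the real code uses its own encryption.
const constants = ["bG9n", "QVNU"]; // "log", "AST"

// Hypothetical stand-in for _0xe014: look up a constant by hex index and decode it.
// The second argument (a key in the real code) is accepted but ignored here.
function decode(index, key) {
  // Assumed environment check: only run where a browser-like `window` exists.
  if (typeof window === "undefined" || typeof window.document === "undefined") {
    throw new Error("not a browser");
  }
  const encoded = constants[parseInt(index, 16)];
  return Buffer.from(encoded, "base64").toString("utf-8");
}

// In plain Node the check fails.
let errorBeforeStub = null;
try {
  decode("0x1", "oybe");
} catch (err) {
  errorBeforeStub = err.message;
}

// Supplement the environment with the minimum the check looks for.
globalThis.window = { document: {} };

const decoded = decode("0x1", "oybe");
console.log(errorBeforeStub, decoded);
```

Which globals actually need to be stubbed depends on what the real method checks, so in practice the stubs are added one at a time until the call stops failing.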
Next, we need to replace every call in the code such as _0xe014('0x1', 'oybe') with the corresponding value from the constant array.
Let's first look at how _0xe014('0x1', 'oybe') is structured in the AST
CallExpression represents a function call expression in the AST
interface CallExpression <: Expression {
  type: "CallExpression";
  callee: Expression | Super | Import;
  arguments: [ Expression | SpreadElement ];
}
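Written out as a plain object, the node for _0xe014('0x1', 'oybe') looks roughly like this (position fields omitted). The helper isDecoderCall is a hypothetical name, shown only to illustrate which fields identify such a call; with Babel the same check would normally go through @babel/types.

```javascript
// Abbreviated Babel-style node for: _0xe014('0x1', 'oybe')
const callNode = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "_0xe014" },
  arguments: [
    { type: "StringLiteral", value: "0x1" },
    { type: "StringLiteral", value: "oybe" },
  ],
};

// Abbreviated node for: console.log("AST"), which must NOT match.
const otherCall = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    object: { type: "Identifier", name: "console" },
    property: { type: "Identifier", name: "log" },
    computed: false,
  },
  arguments: [{ type: "StringLiteral", value: "AST" }],
};

// Hypothetical helper: true only for decoderName('...', '...') with literal arguments.
function isDecoderCall(node, decoderName) {
  return (
    node.type === "CallExpression" &&
    node.callee.type === "Identifier" &&
    node.callee.name === decoderName &&
    node.arguments.length === 2 &&
    node.arguments.every((arg) => arg.type === "StringLiteral")
  );
}

console.log(isDecoderCall(callNode, "_0xe014"), isDecoderCall(otherCall, "_0xe014"));
```

Checking that both arguments are string literals matters because only literal arguments can be evaluated ahead of time; a call whose arguments are variables has no fixed value to substitute.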
arguments is an array whose elements are expression nodes, representing the function's argument list. Let's begin the actual processing:
function traverse_all(ast) {
  // Traverse the AST and call the handler whenever a node of the following type is entered
  traverse(ast, {
    CallExpression: {
      enter: [replaceFunctionToString]
    }
  })
}
function replaceFunctionToString(path) {
  // The node currently being visited
  const node = path.node;
  // Only handle calls whose callee is the identifier _0xe014; otherwise return
  if (!t.isIdentifier(node.callee, { name: "_0xe014" })) return;
  // Take the argument values
  let first_arg = node.arguments[0].value;
  let second_arg = node.arguments[1].value;
  // Call the local _0xe014 function
  let value = _0xe014(first_arg, second_arg);
  // Replace the CallExpression node with a StringLiteral node holding the value
  path.replaceWith(t.stringLiteral(value));
}
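To see the replacement step in isolation, here is a small sketch that does the same thing on a hand-built tree without Babel. The lookup table, fakeDecoder, decoderCall and replaceDecoderCalls are hypothetical stand-ins for the real _0xe014 and for Babel's traverse, and the walker returns a new tree instead of replacing nodes in place.

```javascript
// Hypothetical decoder standing in for the real _0xe014.
const table = { "0x0": "log", "0x1": "AST" };
function fakeDecoder(index, key) {
  return table[index];
}

// Recursively copy a tree, replacing decoder calls with string literals.
function replaceDecoderCalls(node, decoderName, decoder) {
  if (Array.isArray(node)) {
    return node.map((child) => replaceDecoderCalls(child, decoderName, decoder));
  }
  if (node === null || typeof node !== "object") {
    return node; // primitive field such as a name or a value
  }
  const copy = {};
  for (const [field, value] of Object.entries(node)) {
    copy[field] = replaceDecoderCalls(value, decoderName, decoder);
  }
  const isDecoderCall =
    copy.type === "CallExpression" &&
    copy.callee.type === "Identifier" &&
    copy.callee.name === decoderName &&
    copy.arguments.every((arg) => arg.type === "StringLiteral");
  if (!isDecoderCall) return copy;
  const args = copy.arguments.map((arg) => arg.value);
  return { type: "StringLiteral", value: decoder(...args) };
}

// Helper that builds the node for _0xe014(index, key).
const decoderCall = (index, key) => ({
  type: "CallExpression",
  callee: { type: "Identifier", name: "_0xe014" },
  arguments: [
    { type: "StringLiteral", value: index },
    { type: "StringLiteral", value: key },
  ],
});

// Tree for: console[_0xe014('0x0', 'oybe')](_0xe014('0x1', 'oybe'))
const obfuscated = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    object: { type: "Identifier", name: "console" },
    property: decoderCall("0x0", "oybe"),
    computed: true,
  },
  arguments: [decoderCall("0x1", "oybe")],
};

const cleaned = replaceDecoderCalls(obfuscated, "_0xe014", fakeDecoder);
console.log(cleaned.callee.property.value, cleaned.arguments[0].value);
```

Both decoder calls end up as StringLiteral nodes, so the tree now describes console["log"]("AST"), while the original obfuscated tree is left unchanged.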
Finally, we convert the processed AST into code
// Process the AST
traverse_all(ast);
// Convert the AST back to code
let { code } = generator(ast);
// Write the result to a file
fs.writeFile(file_path + 'decoded.js', code, (err) => {
  if (err) throw err;
});
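What the generation step does can be illustrated with a tiny printer for the handful of node types used in this post. This is only a sketch of the idea, not how @babel/generator is implemented; the function name print and the sample tree are hypothetical.

```javascript
// Turn a small subset of Babel-style nodes back into source text.
function print(node) {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "StringLiteral":
      return JSON.stringify(node.value);
    case "MemberExpression":
      return node.computed
        ? `${print(node.object)}[${print(node.property)}]`
        : `${print(node.object)}.${print(node.property)}`;
    case "CallExpression": {
      const args = node.arguments.map((arg) => print(arg)).join(", ");
      return `${print(node.callee)}(${args})`;
    }
    case "ExpressionStatement":
      return `${print(node.expression)};`;
    default:
      throw new Error(`unsupported node type: ${node.type}`);
  }
}

// Tree for the already decoded statement: console["log"]("AST");
const statement = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "console" },
      property: { type: "StringLiteral", value: "log" },
      computed: true,
    },
    arguments: [{ type: "StringLiteral", value: "AST" }],
  },
};

const output = print(statement);
console.log(output);
```

The printed text is console["log"]("AST"); which is the kind of readable line the decoded file contains once the calls to the decryption function are gone.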
All calls such as _0xe014('0x1', 'oybe') have now been replaced with their actual values! Stay tuned for more on deobfuscation in the next post!