· 2015/11/24 12:13

From: resources.infosecinstitute.com/reverse-eng…

0x00 Introduction


In code obfuscation, a virtual machine is used to execute an instruction set different from the one native to the machine running the program. A virtual machine can, for example, run ARM instructions on a 32-bit x86 machine. A virtual machine used for code obfuscation is quite different from an ordinary virtual machine capable of hosting an operating system (such as VMware): it executes only a limited set of instructions for a specific task.

Once you understand how a given obfuscator's virtual machine executes its instruction set, reverse engineering a program protected by that virtual machine is relatively easy if the instruction set is a known one: it only takes a little time to look up the architecture’s opcodes. Unfortunately, most of today’s virtual machine obfuscators use custom instruction sets. In other words, each instruction is assigned a custom (usually random) opcode and a custom format, and the reverse engineer has to work out the meaning of every opcode. That is a lot of work! For example, let’s look at the difference between a 32-bit x86 instruction and the equivalent instruction in the custom instruction set covered in this article:

Obviously, both instructions load the memory byte specified by the second operand into the register given as the first operand. However, the binary encodings of the two instructions differ. The 0x56 opcode of the second instruction is a completely arbitrary number. The second byte of each instruction identifies the registers the opcode needs, with each group of four bits representing a register.

Before we get into the reverse-engineering example, we need to know how virtual machine code obfuscation works behind the scenes. The first thing a virtual machine does after startup is allocate an “address space” within its process’s virtual address space; in other words, it requests the memory it needs for code and data, the stack, and the registers. The virtual machine then loads the opcode file and executes it. Execution is driven by a VM loop: in each iteration the virtual machine’s processor fetches the next opcode and its operands, decodes them, and executes the corresponding instruction. This continues until the VM loop encounters a designated exit opcode.
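The loop described above can be sketched in C roughly as follows. This is only an illustration, not the author's VM: the opcode values `OP_INC` and `OP_EXIT`, the `toy_vm` struct, and the `toy_run` function are all hypothetical names invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration only; the real VM uses its own. */
enum { OP_INC = 0x01, OP_EXIT = 0xFF };

typedef struct {
    uint16_t ip;   /* index of the next byte to fetch */
    uint16_t acc;  /* a single toy register */
} toy_vm;

/* Runs code until the exit opcode; returns 0 on a clean exit, -1 on error. */
int toy_run(toy_vm *vm, const uint8_t *code, size_t len)
{
    for (;;) {
        if (vm->ip >= len)
            return -1;                 /* ran off the end of the code */
        uint8_t op = code[vm->ip++];   /* fetch */
        switch (op) {                  /* decode and execute */
        case OP_INC:
            vm->acc++;
            break;
        case OP_EXIT:
            return 0;                  /* designated exit opcode */
        default:
            return -1;                 /* unknown opcode */
        }
    }
}
```

A real obfuscating VM has many more handlers and operand formats, but the fetch, dispatch, and exit structure stays the same.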

0x01 Example


I spent some time writing a virtual machine with a custom instruction set in C; the full source code is available at the end of this article. As you can guess, a virtual machine on its own can’t do anything, which is why I also wrote a little CrackMe for it. Feel free to add more features to this little guy!

As mentioned in the introduction, the virtual machine uses a custom instruction set, and it loads the opcode file into its “address space” after the initialization phase.

Make sure the opcode file is in the same directory as the virtual machine, then run it. Enter an arbitrary password and see what happens:

Password authentication failed!

Our goal now is to find the correct password for this program. Start by looking at the opcode file (vm_file) and opening it with a hexadecimal editor:

You can see that vm_file contains strings such as “Right pass!”, “'ll pass!”, and “Password: ”. Next, we reverse the virtual machine itself, starting by opening it in IDA.

With the virtual machine open in IDA, we go directly to the VM loop at virtual address 0x00401334. The graph below shows that the function is quite large, but it becomes manageable once we find the right entry point.

Let’s see what the entry function does:

    push    ebp
    push    edi
    push    esi
    push    ebx
    sub     esp, 2Ch
    mov     esi, [esp+3Ch+arg_0]
    mov     ebx, [esp+3Ch+arg_4]
    mov     ax, [ebx+0Ah]
    lea     ebp, [esi+1200h]
loc_40134D: ; This is where the loop starts
    movzx   edx, ax
    mov     cl, [esi+edx]
    lea     edx, [eax+1]
    mov     [ebx+0Ah], dx
    sub     ecx, 10h
    cmp     cl, 0E1h     ; switch 226 cases
    jbe     short loc_40136C

The “mov cl, [esi+edx]” instruction reads one byte into CL; this byte is the opcode. Its location is given by ESI and EDX. The preceding “movzx edx, ax” shows that EDX holds only a WORD (16 bits), while ESI holds a DWORD (32 bits). So ESI points to the VM’s code, while EDX holds the VM’s instruction pointer (the index of the current opcode in the file).

After the byte is read, the incremented instruction pointer (EAX+1, held in DX) is written back to [EBX+0AH], a location in the register space allocated by the virtual machine. We now know that EBX points to the VM’s register space and ESI points to the file data loaded in memory.

Before the comparison, note a compiler optimization: 0x10 is subtracted from each opcode value before the switch table is indexed.

loc_40136C:
    movzx ecx, cl
    jmp     ds:switchTable[ecx*4] ; switch jump

The switch table is quite large, but we can speed things up by computing the jump target dynamically: debug the program on Win32 with OllyDbg or IDA’s debugger.
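To find a handler without stepping through the table by hand, the index computation from the disassembly can be reproduced in a few lines of C. This is a sketch based only on the instructions shown above (`sub ecx, 10h`, `cmp cl, 0E1h`, `jmp ds:switchTable[ecx*4]`); the function and macro names are hypothetical.

```c
#include <stdint.h>

#define OPCODE_BIAS 0x10u  /* sub ecx, 10h */
#define LAST_INDEX  0xE1u  /* cmp cl, 0E1h: 226 table entries */

/* Returns the switch-table index for an opcode byte, or -1 when the
   opcode falls outside the table. */
int switch_index(uint8_t opcode)
{
    /* 8-bit wrap-around, matching the comparison on CL. */
    uint8_t idx = (uint8_t)(opcode - OPCODE_BIAS);
    if (idx > LAST_INDEX)
        return -1;
    return idx;
}
```

For opcode 0x18 this yields index 8, so the handler address is the DWORD stored at switchTable + 8*4.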

0x02 First instruction


The first switch takes us to a small process:

We are now in “case 0x18”: the opcode byte is 0x18, and the compiler’s subtraction of 0x10 turns it into switch-table index 8. If you go back and check vm_file, its first byte is indeed 0x18. The opcode takes operands, so the VM reads one more byte into the DX register. Next, the VM’s instruction pointer at [EBX+0AH] is updated to EAX+2, so that IP (the instruction pointer) points to the next byte. The byte just read is then compared to 3; if it is greater than 3, the VM leaves the loop and raises an exception. No exception is raised in our example, because the operand in the binary is 0x01, so the jump is not taken. And then we get here:

Just to remind you, EBX is a pointer to the virtual machine’s array of registers, so the first instruction initializes [EBX+1*2] (the second register) to 0.

Now, we have enough information to determine that the VM contains four registers, which we can call R0, R1, R2, and R3.

The rest of the code loads two bytes of data from the file (0x250, stored big-endian) into register R1. The VM’s instruction pointer then points to the next instruction, at offset 0x04 of the file. Finally, a JMP returns to the VM loop at loc_40134D to process the next instruction.

So far we have decoded only the first instruction, and it is a simple MOV. It can be written as:

    MOV R1, 250H
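The handler's behaviour can be mirrored in C to decode this instruction directly from the file bytes. The sketch below follows the description above (one opcode byte, one register byte that must not exceed 3, then a big-endian 16-bit immediate); the function name, the `OP_MOV_IMM` constant, and the error convention are assumptions made for illustration, not taken from the VM source.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_MOV_IMM 0x18  /* opcode value seen at the start of vm_file */
#define NUM_REGS   4

/* Decodes "MOV Rn, imm16" at code[*ip]. On success stores the immediate
   in regs[n], advances *ip by 4 and returns 0; returns -1 otherwise. */
int decode_mov_imm(const uint8_t *code, size_t len, uint16_t *ip,
                   uint16_t regs[NUM_REGS])
{
    if ((size_t)*ip + 4 > len || code[*ip] != OP_MOV_IMM)
        return -1;
    uint8_t reg = code[*ip + 1];
    if (reg > 3)
        return -1;  /* the real VM raises an exception here */
    /* Big-endian immediate: high byte first. */
    regs[reg] = (uint16_t)((code[*ip + 2] << 8) | code[*ip + 3]);
    *ip += 4;
    return 0;
}
```

Fed the bytes 18 01 02 50 (reconstructed from the analysis above), it sets R1 to 0x250 and leaves the instruction pointer at offset 4.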

0x03 Second instruction


Let’s look at the next opcode (0xAF):

The first code block is the same as in the MOV handler: it is the typical code for an instruction that takes a register as an operand, in our case R1 (0x01). Next, the handler accesses the register at [EBX+0CH]. This is not one of R0 to R3, because R3 is stored at [EBX+6], and it is not the instruction pointer, which lives at [EBX+0AH]. To figure out what this register is, we go back and check how it is initialized in the main function:

.text:00402703 mov     word ptr [eax+0Ch], 256

Going back to our analysis, the handler reads this register, subtracts one, and compares the result to 0xFFFF. Because the register is initialized to 256, the result can only equal 0xFFFF when the register was 0 before the subtraction. If the result equals 0xFFFF, the VM exits the loop. Since this is the first execution, [EBX+0CH] now equals 255.

The next two instructions read R1 (0x250) into the DX register. Then comes an interesting instruction:

mov [esi+eax*2+1000h], dx

ESI, if you recall, is the base address of the code and data area. ESI+1000H lies 4 KB past that base, so we can assume that it points to a different “section” of the VM’s address space.

We can express this operation in pseudocode:

    WORD section[256];
    [...]
    section[--reg] = R1;

This looks like a stack: the stack pointer is decremented by one and the value of register R1 is stored at the new position. It is safe to assume that opcode 0xAF is the PUSH instruction, so this instruction reads: PUSH R1.

Now we know that [EBX+0CH] is the VM’s stack pointer and that the stack size is 256*sizeof(uint16_t) bytes. If you compare the VM stack pointer with the x86 one, you can see that the VM stack pointer is just an array index, whereas the x86 stack pointer (ESP) holds a memory address.
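Putting together the offsets seen so far, the VM's register block and its PUSH handler might look like the following C sketch. The struct layout is inferred from the disassembly offsets only; the names `vm_regs` and `vm_push` are hypothetical, and the field at offset 8 is never identified in this analysis, so it is left unnamed.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 256

/* Layout inferred from the offsets in the disassembly (an assumption,
   not the author's actual declaration). */
typedef struct {
    uint16_t r[4];     /* R0..R3 at offsets 0, 2, 4, 6 */
    uint16_t unknown;  /* offset 8: not identified in the text */
    uint16_t ip;       /* offset 0x0A: instruction pointer */
    uint16_t sp;       /* offset 0x0C: stack index, initialised to 256 */
} vm_regs;

/* PUSH Rn: pre-decrement the stack index, then store the register.
   Returns -1 when the stack is full or the arguments are invalid. */
int vm_push(vm_regs *ctx, uint16_t stack[STACK_WORDS], unsigned reg)
{
    if (reg > 3 || ctx->sp == 0 || ctx->sp > STACK_WORDS)
        return -1;
    ctx->sp--;
    stack[ctx->sp] = ctx->r[reg];
    return 0;
}
```

Starting from sp = 256 and R1 = 0x250, one call with reg = 1 leaves sp at 255 and stores 0x250 in stack[255], which matches the write to [esi+eax*2+1000h] observed above.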

0x04 Third instruction


Then the third opcode (0xC2):

This opcode reads the WORD at the top of the stack. Before reading, it checks whether the stack is empty and raises a VM exception if it is. Because we have just pushed a value, the stack is not empty. After copying the value at the top of the stack into DX, the handler increments the stack pointer by one. DX now holds 0x250, an offset into the code and data area. The handler then checks that this value does not exceed 0x1000 (the size of the address space) and calls printf with the string at [ESI+DX] as its argument. In our case, offset 0x250 of vm_file holds the string “Password:”, which is printed to the screen.

We can conclude that the 0xC2 instruction expects the string offset to have been pushed onto the stack beforehand; it pops that offset and prints the string with printf.
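A C sketch of this pop-and-print handler, under the same assumptions as the analysis (256-word stack, 0x1000-byte address space), could look like this. The function name and return convention are hypothetical, and the bounds check uses `>=` as a conservative choice since the exact comparison is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define STACK_WORDS 256
#define SPACE_SIZE  0x1000

/* Pops a 16-bit offset and prints the NUL-terminated string stored at
   that offset in mem. Returns the offset, or -1 on an empty stack or an
   out-of-range offset. */
int vm_print_top(const uint8_t mem[SPACE_SIZE],
                 const uint16_t stack[STACK_WORDS], uint16_t *sp)
{
    if (*sp >= STACK_WORDS)
        return -1;                  /* empty stack */
    uint16_t off = stack[(*sp)++];  /* pop: read, then increment */
    if (off >= SPACE_SIZE)
        return -1;
    /* Bounded print so a missing terminator cannot run past mem. */
    printf("%.*s", (int)(SPACE_SIZE - off), (const char *)&mem[off]);
    return off;
}
```

With sp = 255 and stack[255] = 0x250, the call prints whatever string sits at offset 0x250 and leaves sp at 256, that is, an empty stack again.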

As you can see, by reversing these opcodes we have reached the code that prints “Password:”. You may have noticed that the work done by each opcode handler can be summarized as a single instruction. We will not analyze the remaining opcodes one by one, but by now you should be able to reverse engineer a VM-protected application, or even create your own.

0x05 Cracking the Password


Here’s how to quickly find the correct password:

Open vm_file in a hexadecimal editor and extract the 256 random bytes at offsets 0x80 to 0x17F; we will call this array Random. Each byte of the password entered by the user is XORed with Random and compared against a predetermined array stored at offset 0x240 of vm_file.
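Because XOR is its own inverse, the check can be turned around to recover the password. The sketch below is not the author's solution code. It ASSUMES that the VM tests (password[i] ^ Random[i]) == expected[i] with the same index i on both arrays; the description above does not spell out the indexing or the password length, so adjust accordingly. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the VM checks (password[i] ^ random[i]) == expected[i].
   out must have room for n + 1 bytes. */
void recover_password(const uint8_t *random, const uint8_t *expected,
                      size_t n, char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (char)(random[i] ^ expected[i]);  /* undo the XOR */
    out[n] = '\0';
}
```

With vm_file loaded into a buffer `file`, the call would be `recover_password(file + 0x80, file + 0x240, n, out)`, where n is the password length, which has to be determined from the VM code.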

I’ve given a password generator in the references section below. Compile and execute it to get the correct password:

0x06 References


  • VM source code
  • Opcodes file (vm_file)
  • Hex dump of vm_file for those who don’t want to download it
  • Solution code