Original address: juejin.cn/post/684490…

In the computing world, most programmers have heard the terms compiler, interpreter, machine code, and bytecode. Now let’s go a little deeper and ask:

  • What does each of these terms mean?
  • How do compilers and interpreters work?
  • What’s the connection between them?

The purpose of this article is to clarify these questions.

Types of code

Computer code can be classified by its degree of encapsulation, from lowest to highest:

  • Microcode: code that directly controls the CPU. By separating machine instructions from the underlying circuitry, microcode lets machine instructions be designed and modified more freely, regardless of the actual circuit architecture (in a traditional hardwired design, the CPU’s operations are built directly into the circuitry, so changing them requires physical modification).
    • Sometimes regarded as a form of firmware.
    • Usually stored in ROM and not open to modification.
    • Applies only to the specific hardware it was designed for.
    • At the bottom of the software hierarchy.
  • Machine code: code that can be executed directly by the CPU.
    • On CISC computers, it is typically converted into microcode for execution.
    • On RISC computers, it is typically executed directly.
  • Object code: generally machine code, but only one part of a complete program. A linker combines multiple object code files into a complete executable program (for example, resolving references to the standard library in C).
  • Bytecode: usually refers to intermediate code that has already been compiled but must still be executed, or translated into machine code, by an interpreter. Java bytecode is an example.
    • Designed for efficient execution by an interpreter.
    • Cross-platform.
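
As a small illustration of bytecode, the sketch below uses CPython, which is not mentioned in the original article and is chosen here only because it makes bytecode easy to inspect. It compiles an expression with the built-in `compile` function, looks at the resulting bytes, and then lets the interpreter run them. The expression and variable names are made up for the example, and the exact instruction names differ between Python versions.

```python
import dis

# Compile a source expression into a code object (example expression only).
source = "a + b * 2"
code = compile(source, "<example>", "eval")

# The compiled bytecode is a plain byte sequence, not CPU machine code.
raw_bytecode = code.co_code

# Human-readable instruction names; these vary between Python versions.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# The CPython interpreter executes the bytecode with the given variables.
result = eval(code, {"a": 1, "b": 3})

print(raw_bytecode)
print(opnames)
print(result)
```

The same code object can be evaluated again with different values for `a` and `b` without recompiling the source, which is part of why bytecode suits efficient interpretation.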

Levels of programming languages

Programming languages can be divided into two classes based on the degree to which they abstract (encapsulate) machine language:

  • Low-level programming languages: provide no encapsulation, or only very little, and stay close to machine language.
    • Include machine language and assembly language.
    • Code can usually only be executed on a particular platform.
    • High execution efficiency.
    • Poor readability.
    • Low development efficiency.
  • High-level programming languages: provide a high degree of encapsulation over machine language and need a compiler or interpreter to convert them to machine code for execution. Examples include C, C++, Java, and Python.
    • Relatively low execution efficiency.
    • Good readability.
    • High development efficiency.

Because assembly language code must be converted to machine code by an assembler before it can be executed, some people also regard assembly language as a high-level language, although it is usually classified as low-level.

[Figure: What are the different types of programming languages?]

Programs written in high-level languages are either executed directly by an interpreter of some kind, or are converted by compilers (and assemblers and linkers) into machine code, which is then executed by the CPU.

Compilers and interpreters are introduced below.

Compiler

A compiler is a computer program that translates code written in one programming language into code in another programming language. The term usually refers to translating high-level language code into low-level language code.

Its main goal is to translate high-level language code that humans can write, read, and maintain into machine code that computers can run.

There are also several special kinds of compilers:

  • Cross-compiler: produces output code that runs on a platform (CPU or operating system) different from the one the compiler itself runs on.
  • Bootstrap compiler: a compiler written in the language it compiles; its initial core version is written in another language (usually assembly language).
  • Decompiler: translates low-level language code into high-level language code.
  • Source-to-source compiler: translates code from one high-level language into another.

The workflow of the compiler

A compiler’s workflow typically consists of the following steps (executed in sequence):

  • preprocessing
  • lexical analysis
  • parsing
  • semantic analysis (syntax-directed translation)
  • conversion of input programs to an intermediate representation
  • code optimization
  • code generation
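
The steps above can be sketched with a toy compiler for arithmetic expressions. This is a minimal illustration under stated assumptions, not how any production compiler is built: the grammar (integers, names, `+`, `*`, parentheses), the tuple-based tree used as the intermediate representation, and the `PUSH`/`LOAD`/`ADD`/`MUL` instruction set are all invented for the example. Preprocessing and semantic analysis are left out to keep it short.

```python
import re

# Hypothetical token pattern: integers, identifiers, and the symbols + * ( ).
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|([+*()]))")


def tokenize(source):
    """Lexical analysis: split the character stream into (kind, value) tokens."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        number, name, symbol = match.groups()
        if number is not None:
            tokens.append(("NUM", int(number)))
        elif name is not None:
            tokens.append(("NAME", name))
        else:
            tokens.append((symbol, symbol))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parsing: build a tuple-based tree (the intermediate representation)."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        if peek() == "NUM":
            return ("num", advance()[1])
        if peek() == "NAME":
            return ("var", advance()[1])
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError("expected a number, a name, or '('")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


def fold_constants(node):
    """Code optimization: evaluate constant subexpressions at compile time."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node[0], fold_constants(node[1]), fold_constants(node[2])
    if left[0] == "num" and right[0] == "num":
        return ("num", left[1] + right[1] if op == "+" else left[1] * right[1])
    return (op, left, right)


def generate(node):
    """Code generation: emit instructions for an imaginary stack machine."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    if node[0] == "var":
        return [("LOAD", node[1])]
    opcode = "ADD" if node[0] == "+" else "MUL"
    return generate(node[1]) + generate(node[2]) + [(opcode, None)]


def compile_expression(source):
    """Run the stages in sequence, each consuming the previous stage's output."""
    return generate(fold_constants(parse(tokenize(source))))


print(compile_expression("x + 2 * 3"))
```

For the input `x + 2 * 3`, the optimizer folds `2 * 3` into the constant `6` before code generation, so the output contains a single `PUSH` for it rather than two pushes and a multiply.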

[Figure: the compiler workflow. Image: The Thing from Another World]

Interpreter

An interpreter is a computer program that directly executes high-level language code without having to compile the code into machine code.

  • Advantages: there is no need to compile the entire program before running it, and the program can be split into multiple parts that are executed separately.
  • Disadvantages: the interpreter acts as a middleman. Every time a program runs, its code has to be converted again before it executes, so interpreted programs usually run more slowly than compiled ones.

There are generally three strategies for the interpreter to execute code:

  • Parse the high-level language source code and execute it directly (as a shell’s built-in interpreter does).
  • Convert the code to an efficient intermediate code (such as bytecode) and execute it immediately, without writing the intermediate code out.
  • Use a compiler built into the interpreter to compile the high-level language code into stored intermediate code ahead of time, and then execute that precompiled code.
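
The difference between the first two strategies can be sketched in a few lines. This is an illustrative example only: the hand-built tree for `1 + 2 * 3`, the function names, and the `PUSH`/`ADD`/`MUL` instructions are assumptions made for the sketch, and parsing is omitted for brevity.

```python
# A hand-built syntax tree for 1 + 2 * 3 (parsing is omitted for brevity).
TREE = ("+", ("num", 1), ("*", ("num", 2), ("num", 3)))


def eval_tree(node):
    """Strategy 1: walk the tree and perform each operation directly."""
    if node[0] == "num":
        return node[1]
    left, right = eval_tree(node[1]), eval_tree(node[2])
    return left + right if node[0] == "+" else left * right


def to_bytecode(node):
    """Strategy 2, step 1: translate the tree into stack-machine instructions."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    opcode = "ADD" if node[0] == "+" else "MUL"
    return to_bytecode(node[1]) + to_bytecode(node[2]) + [(opcode, None)]


def run_bytecode(instructions):
    """Strategy 2, step 2: execute the instructions on a small stack machine."""
    stack = []
    for opcode, arg in instructions:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


direct_result = eval_tree(TREE)
bytecode = to_bytecode(TREE)
bytecode_result = run_bytecode(bytecode)

print(direct_result, bytecode, bytecode_result)
```

Both paths produce the same value. The third strategy would differ only in that the instruction list is saved and reused on later runs instead of being regenerated each time.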

[Figure: Compiler vs Interpreter: Complete Difference Between Compiler and Interpreter]
