The eternal hot topic: the CPU

“Deng, deng, deng…” (Intel)

If you were to pick the classic technology jingles of the past ten years, Intel’s advertising chime, “deng, deng, deng…”, would surely make the list. The familiar blue “Intel Inside” label is likewise a highly recognizable advertising icon.

The Central Processing Unit (CPU), though well known to the public, has long been synonymous with high-end sophistication. Do you share the impression that early computer magazines devoted whole pages and columns to the detailed parameters of CPU chips launched by Intel or AMD, the pros and cons of MIPS and ARM, or even the gossip and behind-the-scenes stories of each CPU?

As a symbol of high-end technology, the CPU has long been shrouded in a veil of mystery. CPU design departments always attract many applicants with this aura.
Just as every boy has a soldier’s dream, it seems that every chip designer has a CPU dream.

Today, the editor introduces two books: “Teach You to Design CPU by Hand” and “Homemade Programming Language”.

The post also shares some of author Hu Zhenbo’s thoughts on building your own CPU, in the hope of offering some guidance to readers interested in this area.

(Published May 2018)


Through the history of

Chips are the cornerstone of the entire electronic information industry. The global semiconductor market is currently worth $320 billion. 54% of the world’s chips are exported to China, but domestic chips account for only 10% of the market. China’s chip imports consume more than US$200 billion in foreign exchange annually, more than oil and bulk commodities, and account for a significant share of all imported goods.

As the “heart” of a chip system, the CPU can be called the “core of the core”, and domestic industrial strength in this area has been relatively weak. Achieving domestic autonomy in CPUs is very important for China’s development, but the mainstream CPU instruction set architectures (such as x86 and ARM) are monopolized by foreign companies; using them means paying high patent fees and being subject to others’ control. As a special kind of chip, the CPU requires an instruction set architecture that is universal and shares a common ecosystem. It is therefore impractical to invent a closed instruction set within a single country; it must be in line with the world’s mainstream architectures. In this context, the open RISC-V architecture offers a major strategic opportunity for China’s CPU chip industry, and it holds out the prospect of achieving both full domestic autonomy in CPUs and a mainstream architecture.

China is at a critical stage in vigorously developing its chip design industry. Achieving the great rejuvenation of the Chinese nation requires the tireless effort and hard work of the broad community of researchers and engineers, and it needs many pragmatic technical mainstays like the author to shoulder the important task of revitalizing domestic chips. The shortage of CPU talent in China has long been the main factor restricting the industry’s development. The author of “Teach You to Design CPU by Hand”, a senior CPU design expert who has long worked on the front line, has written his experience into a book that is rich in detail and vividly written. It uses the Hummingbird E200 series processor core developed by the author’s company as its example, which makes it well suited to teaching and to hobbyist study, and it does a great deal to popularize CPU design technology.

The emerging RISC-V architecture has set off a global wave of interest and attracted wide attention in China. However, for lack of good introductory books in Chinese, many people have still “only heard of RISC-V without seeing what it looks like”. As one of the first technical experts in China to work with the RISC-V architecture and successfully develop a RISC-V processor, the author open-sourced the processor core he developed in his spare time and wrote a book explaining its implementation in detail, which reflects both his high professional level and his strong commitment to advancing the domestic CPU industry.

ISA, please take the blame: why domestic CPUs have not been successful enough

As we all know, chips are the core field in the development of China’s information industry, and the CPU represents the core technology within chips. In this respect there is a significant gap between China and developed countries. Although years of effort have narrowed the technology gap considerably, domestic CPUs are still rarely seen in the civil and commercial fields. Why have domestic commercial CPUs not been successful enough? Let’s look at China’s homegrown CPU companies and the instruction set camps they have chosen. By going through their past and present one by one, readers should be able to find the answer.

MIPS camp: Loongson and Junzheng


1. Loongson

The Loongson CPU is developed by the Loongson research group of the Institute of Computing Technology, Chinese Academy of Sciences, together with Beijing Shenzhou Loongson IC Design Company, which the Institute has authorized. The Loongson CPU chips are briefly introduced below.

  • Loongson 1 runs at 266 MHz and was first used in 2002, as shown in Figure 1-3.
  • Loongson 2 has a maximum frequency of 1 GHz.
  • The Loongson 3A series is a domestic commercial 4-core processor. The latest Loongson 3A3000 is built on a 28 nm FD-SOI process; it is a 4-core, 64-bit design with a 1.5 GHz clock frequency and a power consumption of only 30 W, making it well suited to notebook platforms.
  • The Loongson 3B series is a domestic commercial 8-core processor with a clock frequency of over 1 GHz, support for accelerated vector computation, and a peak computing capacity of 128 GFLOPS, giving it a high performance-to-power ratio. The Loongson 3B series is mainly used in high-performance computers, high-performance servers, digital signal processing, and similar fields.

2. Junzheng

China’s MIPS camp has another company: Beijing Junzheng. Junzheng and Loongson both belong to the MIPS camp. Unlike Loongson, which focuses on desktop PC processors, Beijing Junzheng is one of the first local IC design companies in China to focus on wearables and the Internet of Things. Embedded chip software is generally customized on demand. As a result, in the smart wearable market a considerable share of products and application software is purpose-built, the software ecosystem chain is relatively short, and application requirements are diverse. No single universal solution can meet everyone’s requirements, so no one manufacturer can monopolize this field, and the smart wearable market is not subject to the dominance that x86 and ARM enjoy in the PC and mobile phone markets.

Smart wearable chips and IoT chips do not have high performance requirements; most application scenarios care more about low power consumption, low cost, and small size, and Junzheng’s products fully meet the performance requirements. x86 processors cannot be applied in this field, and IC design companies in the ARM camp are subject to relatively high licensing fees, so at smaller production volumes they are not price-competitive. Junzheng has more than 10 years of chip design experience and accumulated technology, and its biggest strength is a high performance-to-power ratio. The first batch of domestic smartwatches, including the first-generation Guoke smartwatch and the first- and second-generation Tuman smartwatches, all adopted Junzheng’s solution.
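The pricing argument above (a high upfront license fee hurts most at low production volume) can be made concrete with a little arithmetic. The following is only a minimal sketch: the function name `license_cost_per_chip` is made up for illustration, and the fee is a hypothetical round number, not a figure quoted by any vendor.

```python
def license_cost_per_chip(upfront_fee_usd: float, units_shipped: int) -> float:
    """Spread a one-time license fee evenly over every chip shipped."""
    if units_shipped <= 0:
        raise ValueError("units_shipped must be positive")
    return upfront_fee_usd / units_shipped


# Hypothetical fee chosen only for illustration (an assumption, not a real quote).
HYPOTHETICAL_FEE_USD = 10_000_000

for units in (100_000, 1_000_000, 100_000_000):
    cost = license_cost_per_chip(HYPOTHETICAL_FEE_USD, units)
    print(f"{units:>11,} chips -> ${cost:,.2f} of license fee per chip")
```

Under these assumed numbers, the fee adds $100 per chip at 100,000 units but only $0.10 per chip at 100 million units, which is why a small-volume wearable chip vendor feels the license cost far more than a high-volume phone chip vendor.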

x86 camp: Beida Zhongzhi, Zhaoxin, and Haiguang

1. Beida Zhongzhi

Beijing Beida Zhongzhi Microsystem Technology Co., Ltd., founded in November 2002, is an important backbone enterprise in the national IC design industry. In 2005, AMD reached an agreement with the Chinese government, and the Ministry of Science and Technology designated the Microelectronics Center of Peking University to receive the technology license for the AMD Geode-2 processor. AMD’s processors are of course x86, so China thereby obtained x86 technology. However, Geode is an AMD embedded processor, so the x86 technology AMD licensed to Peking University is an embedded architecture.

2. Zhaoxin

Another domestic company using the x86 architecture, Zhaoxin, may be better known. As is well known, the x86 architecture is the core technology of Intel and AMD, and the US government strictly controls the licensing of their technology. However, besides Intel and AMD, a Taiwan-based company, VIA, has also held an x86 architecture license. Reportedly, as shown in Figure 1-8, the ZX-C processor independently developed by Zhaoxin entered mass production in April 2015. It is a 4-core processor built on a 28 nm process, with a clock frequency of up to 2.0 GHz, and supports encryption with China’s national cryptographic algorithms. In 2017, Zhaoxin announced that its latest generation of ZX-D series 4-core and 8-core general-purpose processors had been successfully developed, and revealed that it would launch a 16 nm ZX-E 8-core CPU in 2018.

3. Haiguang

Besides Shanghai Zhaoxin, there is a newly founded company: Tianjin Haiguang. In 2016, AMD announced an agreement with China’s Tianjin Haiguang Investment Company to license x86 technology to Haiguang for a fee and to set up a joint venture licensed to produce server processors. It is said that, in order to open up China’s high-performance server market, the x86 technology AMD licensed to the Chinese side is likely to be its most cutting-edge. How Haiguang performs remains to be seen.

Power camp: Zhongsheng Hongxin

The Power architecture of “Big Blue” IBM has long been the epitome of high performance. In 2013, IBM joined with NVIDIA and others to form the OpenPOWER alliance, which allows other companies to license the Power architecture. IBM has since promoted the establishment of the China POWER Technology Industry Ecological Alliance and signed licensing agreements with a number of Chinese companies, including Zhongsheng Hongxin. Founded in 2013, Zhongsheng Hongxin believes that it can digest, absorb, and innovate on the technology within a few years.

Alpha camp: Shenwei

The Shenwei processor, or Shenwei CPU, is referred to as the “SW processor” for short.

Building on dual-core Alpha, Shenwei has added a multi-core architecture and distinctive extended instruction sets such as SIMD, mainly targeting high-performance computing and servers. At the 2016 International Supercomputing Conference, the Sunway TaihuLight supercomputer system (shown in Figure 1-9), based on the Sunway 26010 processor, made its debut and took first place with a peak performance of about 1.25 × 10^17 floating-point operations per second (125 PFLOPS), making it the world’s first supercomputer with a speed exceeding 10^17 operations per second.

ARM camp: Feiteng, Huawei HiSilicon, Spreadtrum, and Huaxintong

To better understand this section, it helps to first introduce ARM’s licensing model. In short, ARM’s main licensing model comes in two types.

  1. License “ARM processor IP” to other chip manufacturers (partners) who directly design SoC chips using ARM processor IP.
  2. License the “ARM Architecture” to other chip manufacturers (partners) who develop their own processor cores based on the ARM architecture and then design SoC chips using their own processor cores.

1. Feiteng

Feiteng is an enterprise founded by the high-performance processor research team of China’s National University of Defense Technology, which has built up strong technical capability in the CPU field over many years. In 2016, Tianjin Feiteng announced its latest product, the FT2000, which first appeared at the Hot Chips conference in 2015 under the code name “Mars” and targets high-performance servers and industry business hosts. The FT2000 uses the ARMv8 instruction set but a proprietary core, unlike the ARMv8 Cortex-A53/A57/A72 cores, which are purchased directly from ARM.

The FT2000 also stands out for its performance. It contains as many as 64 FTC661 processor cores, and in published SPEC 2006 tests it scored 672 on integer and 585 on floating point, putting it on par with the Xeon E5-2699 v3. This is the first time a domestic server chip has caught up with Intel in performance. The total aggregate bandwidth of its memory control chips is 204.8 GB/s, exceeding the current E5 v3 and E7 v3 and approaching IBM POWER8 (230 GB/s). Scores comparable to Intel’s Xeon E5-2699 v3 mean that the FT2000 is fully adequate for many commercial applications and could replace some Intel products in the commercial market if the software ecosystem keeps up.

2. Huawei HiSilicon

Huawei HiSilicon is currently one of the most technically capable chip developers in China. Huawei’s Kirin chips are on a par in performance with those of leading chipmakers such as Qualcomm and Samsung. Huawei is also one of the four major domestic server vendors; Huawei, Lenovo, Inspur, and other domestic server companies hold more than 65% of the Chinese server market. Huawei licensed the ARM instruction set architecture several years ago and began developing its own processor cores, focusing on the server market.

At the “Twelfth Five-Year Plan” Scientific and Technological Innovation Achievement Exhibition, Huawei exhibited its first ARM platform server, “Taishan”, equipped with its self-developed 64-bit ARM processor, the “Hi1612”, which is built on TSMC’s 16 nm process, has up to 16 cores, and is compatible with the ARMv8-A instruction set. Given Huawei’s strong R&D and market operation capabilities, it is expected to perform well.

3. Spreadtrum

Besides Huawei, Spreadtrum is another domestic leader in mobile phone chips. In 2016, Spreadtrum shipped 67 million sets of chips. In June 2017, Spreadtrum announced the successful development of its own ARM architecture processor. Spreadtrum claimed to have fit a six-core design into the same area as the 4-core (Cortex-A7) SC9850 chip, with power consumption and performance that can be tailored to its own needs. This makes Spreadtrum the second mobile phone chip vendor, after Qualcomm, to own the key technology of an in-house ARM CPU, not counting Apple and Samsung, whose chips are mainly for their own use.

4. Huaxintong

In 2016, Qualcomm set up a chip company in China, Huaxintong Semiconductor, as a joint venture with the government of Guizhou province, to design and develop dedicated server chips specifically for the Chinese market. Huaxintong is a licensee of the ARMv8-A architecture. With China now the world’s second-largest data center market, the license is said to help Huaxintong Semiconductor bring advanced server chip technology to the rapidly expanding Chinese market and help Chinese enterprises offer ARM-based servers domestically, thereby promoting large-scale deployment of efficient server solutions.

ISA, the scapegoat

The sections above presented the roll of honor of domestic CPU design. However, as mentioned earlier, domestic CPUs are still rarely seen in the civil and commercial fields. It can be said that the main reason domestic processors have not been successful enough in these fields is the ISA, which undoubtedly has to take the blame.

The discussion above shows how important the instruction set architecture (ISA) is to a CPU; for a CPU, the absolute level of hardware technology is not what matters most.

At present, the mainstream commercial instruction set architectures have settled into clear patterns of dominance in different fields.

  • The x86 architecture dominates the desktop PC and server world.
  • The ARM architecture dominates the mobile handheld space and is making inroads into the desktop PC and server space.
  • ARM dominates the embedded space.

Therefore, the author has always believed that only commercial companies attached to the x86 or ARM camp can truly achieve full commercialization. This is presumably why most of the entries on the roll of honor of domestic CPU design in recent years belong to the x86 or ARM camps.

However, domestic autonomy is vital to China’s national economy and people’s livelihood, and the pursuit of safe, controllable domestic autonomy is a strategic direction China must adhere to. From this point of view, choosing the x86 or ARM architecture also has its limitations, which are discussed below.

1. x86 architecture

· Intel and AMD are themselves chip companies rather than intellectual property (IP) companies, so the x86 architecture is their lifeline. If chips produced by other licensed companies using the x86 architecture posed a real threat to Intel and AMD, the two could simply wield the patent stick and stop licensing.

· Licensing fees for the x86 architecture are extremely high and far beyond the reach of ordinary companies or organizations.

2. ARM architecture

· The situation with the ARM architecture is much more optimistic. Although the ARM architecture belongs to ARM and is protected by patents, ARM’s business model is based on the principle of openness and win-win cooperation. ARM is the leader of the ARM ecosystem and the maker of its core rules, and it earns its revenue through architecture licensing and IP core licensing. A large number of upstream and downstream software and hardware enterprises in the ecosystem follow the unified standards and specifications set by ARM, meet the needs of numerous customers, and earn their own revenue.

· The domestic ARM-based CPU ecosystem has a good foundation. Huawei HiSilicon, Spreadtrum, Huaxintong, Feiteng, and many other companies have accumulated years of experience in developing ARM chips, and China’s chip design technology for mobile terminals is already in step with the international mainstream; foreign giants such as Qualcomm, Samsung, and Google also belong to the ARM ecosystem camp. From a global perspective, therefore, chip companies at home and abroad can compete fairly in an open, win-win environment. For these reasons, the achievements of the ARM-based companies on the domestic CPU roll of honor are all the more noteworthy.

· Even so, the ARM architecture ultimately belongs to ARM. On the one hand, licensees have to pay ARM extremely high licensing fees (tens of millions of dollars at a time); on the other hand, since its acquisition by SoftBank, ARM now belongs to a Japanese company. From the point of view of absolute autonomy and control, therefore, being subject to others’ constraints is unavoidable.

As the saying goes, the very thing that brings success can also bring failure. Having read this far, the reader may ask: is there no ISA with the following characteristics?

(1) It is open and shared and does not belong to any commercial company, so there is no danger of being controlled by others, it can be independently controlled, and there is no need to pay high licensing fees to a commercial company.

(2) It takes openness and win-win cooperation as its basic principle, and has a single non-profit organization as its leader and maker of core rules. Any company or individual can use the architecture free of charge, forever.

  • A large number of upstream and downstream hardware and software enterprises in the ecosystem follow the unified standards and specifications set by the organization, meet the needs of numerous customers, and earn their own revenue.
  • Globally, domestic and foreign chip companies can compete fairly in this open, win-win environment.

Many people, like the author, spent a long time hoping that such an ISA would appear. Some in the industry even hoped that the state would take the lead in designating a national standard ISA, so as to unify the competing ISA factions among domestic CPUs. However, a national standard ISA confined to one country is bound to be incompatible with the rest of the world and cannot succeed amid today’s globalization. So everyone assumed that such an ISA was impossible, and the author, a veteran of CPU design, could only quote a poem to express his feelings at the time: “I know that in death all things are empty, yet I grieve not to see the nine provinces united. On the day the imperial army recovers the Central Plains in the north, do not forget to tell your father at the family sacrifice.”

In 2016, however, a newcomer named RISC-V suddenly came into the spotlight. It fully meets the two conditions above: it is a free and open architecture that belongs to all mankind, without any patent shackles. Many internationally known companies have joined it and will compete fairly in an open, win-win environment. The author has a vague feeling that if this ISA does take off, it might be a real opportunity for the rise of domestic CPUs. We mentioned above the suggestion of developing a national standard instruction set architecture. Once RISC-V appeared, China’s neighbor India quickly adopted RISC-V as its national standard instruction set and recommended that its universities and research institutions adopt the RISC-V architecture. India has also made plans and dedicated funds to the development of several different families of RISC-V processors.

As the old verse goes, “just when mountains and rivers seem to block every road, another village appears amid shady willows and bright flowers”; that is how the nascent RISC-V architecture feels.

Life is hard enough, why expose it: the helplessness of CPU practitioners

Ordinary workers in any industry want that industry to flourish, with many commercial companies taking part and plenty of jobs on offer. If you work in an industry that is either dying out or has been monopolized by a few oligarchs into stagnant water, there is little demand for jobs, and the average worker is likely to end up “searching and seeking, cold and lonely, miserable and sad”, or like the singer in the old poem, “with few horses at her deserted gate, married off in old age to a merchant”.

Processor design is a classic example. Processor design is an open discipline, the required technologies are mature, and many engineers and practitioners have mastered them and developed processors. But:

· Because processor architectures have long been dominated by commercial giants such as Intel (x86 architecture) and ARM (ARM architecture), the oligarchic exclusivity effect arising from the software ecosystem has become a barrier that ordinary companies and individuals cannot overcome.

· Because of this exclusivity effect, many processor architectures have died out and domestic commercial CPUs have not been successful enough. CPU design work has become the preserve of a few commercial companies, like “swallows before the halls of noble families”, something ordinary people can only admire from afar and never touch. For a long time, no related industry or commercial companies with sufficient influence took shape.

To sum up, the author, a senior CPU design engineer who once worked at a world-class company, faced the dilemma of having nowhere to go when changing jobs, and lamented that many colleagues were forced to change careers. As the sayings go, “highbrow songs find few singers” and “the greatest music has the faintest notes”; CPU design practitioners are rather helpless. At this point, colleagues who were forced to change careers may be in tears: “Life is already so hard, why expose it?”

The good news is that in recent years the domestic CPU industry has finally begun to change. Thanks to China’s huge market and industrial support, CPU design companies such as Zhaoxin, Feiteng, Huawei, Spreadtrum, Haiguang, and Huaxintong, described in the sections above, have emerged in China. The birth of the RISC-V architecture, which “Teach You to Design CPU by Hand” introduces, will generate still more market demand.

Sunrise in the east, rain in the west; you say it is not sunny, yet it is: RISC-V takes the stage

The RISC-V architecture was developed in 2010 by Krste Asanovic, Andrew Waterman, and Yunsup Lee at Berkeley, with the support of David Patterson, a leading figure in computer architecture. The Berkeley developers invented a new instruction set architecture rather than adopt the mature x86 or ARM architectures because those architectures had grown extremely complex and cumbersome over the years and came with costly patent and architecture licensing issues. Moreover, the RTL code of ARM processors may not be modified, and the source code of x86 processors is simply unavailable. Other open architectures (such as SPARC and OpenRISC) each have problems of their own (more on that in Chapter 2). Although computer architecture and instruction set design had been maturing for decades, a research institution like Berkeley could not find a suitable instruction set architecture to use. The Berkeley professors and developers therefore decided to invent a new, simple, open, and free instruction set architecture, and RISC-V was born.

For information on the birth of RISC-V, please check out the article “Berkeley Hopes to Take RISC-V Open Source Architecture mainstream” on the Web.

RISC-V (pronounced “risk-five”) is a new instruction set architecture. The “V” has two meanings. First, it marks the fifth generation of instruction set architectures designed at Berkeley, counting from RISC I. Second, it stands for Variation and Vectors.

Over several years of development, Berkeley built a complete software toolchain and several open source processor implementations for the RISC-V architecture, which drew more and more attention. In 2016, the RISC-V Foundation was formally established and began operating. The RISC-V Foundation is a non-profit organization responsible for maintaining the standard RISC-V instruction set manuals and architecture documentation and for promoting the development of the RISC-V architecture.

The goals of the RISC-V architecture are as follows.

  • To be a fully open instruction set that any academic institution or commercial organization can use freely.
  • To become a stable standard instruction set that is truly suited to hardware implementation.

The RISC-V Foundation is responsible for maintaining the standard RISC-V architecture documentation, along with the compilers and other software toolchains that CPUs require. Any organization or individual can download them free of charge at any time from the RISC-V Foundation website (no registration required).

The launch of RISC-V and the establishment of the Foundation have been greatly welcomed by academia and industry. Linley Group, a leading technology industry analyst, named RISC-V as the “Best technology of 2016,” as shown in Figure 1-12.

The emergence of the open and free RISC-V architecture is good news not only for universities and research institutions. It also offers an alternative for early-stage startups short of funds, for products that are extremely cost-sensitive, and for fields with little dependence on the existing software ecosystem. It has won the support of major technology companies as well: Silicon Valley giants including Google, HP, Oracle, and Western Digital are founding members of the RISC-V Foundation, as shown in Figure 1-13. Many chip companies are already using RISC-V (Samsung, NVIDIA, and others) or are planning to use RISC-V to develop processors for their own products.

The RISC-V Foundation organizes two public workshops each year to promote communication and development within the RISC-V camp. Any organization or individual can download the slides and documents presented at each workshop from the RISC-V Foundation website. The sixth RISC-V Workshop was held at Shanghai Jiao Tong University in China in May 2017, as shown in Figure 1-14, and attracted a large number of Chinese companies and enthusiasts.

Simplicity is beauty: the design philosophy of the RISC-V architecture

Before getting into the details of RISC-V as an instruction set architecture, let’s understand its design philosophy. A design “philosophy” is the strategy a design advocates. For example, the familiar design philosophy of Japanese cars is economy and fuel efficiency, while that of American cars is a bold, imposing presence. What is the design philosophy of the RISC-V architecture? It is that “the greatest truths are the simplest”.

One of the design philosophies the author advocates is that simplicity is beauty, and simplicity means reliability.

Countless practical cases have proved that “simplicity means reliability”, and conversely that the more complex a machine is, the more error-prone it becomes. One of the best examples is the famous AK-47 assault rifle, which became the most widely used individual weapon in the world thanks to its simple and reliable design philosophy.

In martial arts, beginners tend to get bogged down in the pursuit of elaborate, complicated techniques and put blind faith in fancy moves. The best fighters, however, end up using simple, direct moves.

The greatest truths are the simplest. In actual IC design work, the author has seen simple designs achieve safety and reliability, and has also seen complex designs fail to converge stably for a long time. Simple designs tend to be reliable, as has been borne out time and again across most projects. IC design is a very particular kind of work: its final output is a chip, and a chip’s design and manufacturing cycle is very long. A chip cannot be upgraded and patched as easily as software, and each revision takes several months to reach delivery. Moreover, chips are expensive to manufacture, costing anywhere from hundreds of thousands to millions of dollars. These characteristics make trial and error in IC design very costly, so it is important to reduce errors effectively. Modern chip designs keep growing larger and more complex. Designers are not required to avoid complex techniques blindly, but they should save the best steel for the blade’s edge: use the most complex design only in the most critical places, and choose a simple implementation wherever there is a choice.

The author was amazed when he first read the RISC-V architecture documentation, because it repeatedly and explicitly emphasizes its design philosophy of “simplicity” and strives, through the definition of the architecture itself, to make hardware implementation simple enough.

This philosophy that simplicity is beauty shows up in several ways, which are discussed in the subsequent sections.

Free of illness, light of body: the length of the architecture documents

Readers familiar with ARM’s architecture documentation should be aware of its length. After decades of development, x86 and ARM architectures now contain thousands of pages of architectural documents that can be printed half a desk high.

The x86 and ARM architectures were presumably not this large when they were born. The documents grew to thousands of pages and many versions largely because the architectures evolved alongside modern processor technology as it developed and matured. As commercial architectures they must remain backward compatible, so many outdated definitions have to be kept, and new parts often have to be defined awkwardly to stay compatible with existing ones. Over time the documents became, as the saying goes, like an old woman’s foot-binding cloth: impossibly long, with no way back.

Can a modern, mature architecture choose to start over and define a clean architecture from scratch? Almost certainly not. Intel gave up backward compatibility when it launched the Itanium architecture, and Itanium ended in failure. One important reason was that users would not accept it without backward compatibility. If we bought a computer or phone with a new processor and none of our existing software ran on it, that would clearly be unacceptable.

The RISC-V architecture has the advantage of being a latecomer. Computer architecture has become a mature technology after years of development, and the problems exposed along the way have been studied thoroughly, so the new RISC-V architecture can avoid them. It also carries no historical burden of backward compatibility. In that sense it is free of ailments and light of body.

The current RISC-V architecture documentation is divided into an instruction set document and a privileged architecture document. The instruction set document is just over 100 pages long, and the privileged architecture document is only about 100 pages. An engineer familiar with computer architecture can read through both in a day or two. Although the RISC-V documentation continues to grow, it remains extremely short compared with the architecture documentation for x86 and ARM.

Interested readers can go to the RISC-V Foundation’s website and download the documentation for free without registration, as shown in Figure 1-1.

Architecture documentation on the RISC-V Foundation website

Flexible and extendable: the modular instruction set


The RISC-V architecture differs from other mature commercial architectures in that it is modular. As a result, it is not only short and compact; its parts can also be combined in a modular manner, in an attempt to serve a wide variety of applications with one unified architecture.

This kind of modularity is absent from the x86 and ARM architectures. Take ARM as an example: its architecture is divided into three series, A, R and M, which target the application, real-time and embedded fields respectively and are incompatible with one another. The modular RISC-V architecture, by contrast, lets users flexibly combine different modules to suit different application scenarios; it can be said to “suit young and old alike”. For example, for embedded scenarios that demand small area and low power consumption, users can choose the RV32IC instruction set combination and use only Machine Mode. For high-performance scenarios running an application operating system, users can choose an instruction set such as RV32IMFDC and use both Machine Mode and User Mode.
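The naming scheme in the examples above can be read mechanically: “RV”, a register width, then one letter per module. The following is a minimal sketch of that reading, not an official tool. The `parse_isa` function and the `EXTENSIONS` table are hypothetical names introduced for illustration, the descriptions are informal paraphrases, and the sketch ignores shorthand such as “G” and multi-letter extension names.

```python
# Hypothetical helper for illustration only; not part of any RISC-V toolchain.
# Letter meanings follow the standard single-letter RISC-V extension names.
EXTENSIONS = {
    "I": "base integer instructions",
    "E": "reduced base integer instructions (16 registers)",
    "M": "integer multiplication and division",
    "A": "atomic memory operations",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
    "C": "compressed 16-bit instructions",
}


def parse_isa(name):
    """Split a name such as 'RV32IMFDC' into (register width, module letters)."""
    upper = name.upper()
    if not upper.startswith("RV"):
        raise ValueError(f"not a RISC-V ISA name: {name!r}")
    rest = upper[2:]
    digits = ""
    while rest and rest[0].isdigit():
        digits += rest[0]
        rest = rest[1:]
    if digits not in ("32", "64"):
        raise ValueError(f"unsupported register width in {name!r}")
    if not rest or rest[0] not in ("I", "E"):
        raise ValueError("a base integer set (I or E) must come first")
    unknown = [letter for letter in rest if letter not in EXTENSIONS]
    if unknown:
        raise ValueError(f"letters not handled by this sketch: {unknown}")
    return int(digits), list(rest)


# The two configurations mentioned in the text.
for isa in ("RV32IC", "RV32IMFDC"):
    width, letters = parse_isa(isa)
    print(f"{isa}: {width}-bit")
    for letter in letters:
        print(f"  {letter}: {EXTENSIONS[letter]}")
```

The point of the sketch is that a configuration is simply a list of modules: the embedded example selects two of them, the high-performance example five, and both share the same base integer set.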

Concentrated essence: the number of instructions

The short documentation and the modular philosophy keep the RISC-V instruction count very small. The base RISC-V instruction set has only a little over 40 instructions, and adding the other modular extensions still brings the total to only a few dozen. Figure 2-2 is a RISC-V instruction set card. See Appendix A for more information about the RISC-V instruction set.

RISC-V instruction set card
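The size of the base set can be checked by simply listing it. The sketch below assumes the instruction listing of the ratified RV32I base integer set (version 2.1), which contains 40 instructions. Earlier versions of the specification also counted FENCE.I and the six CSR instructions in the base, giving 47, which may be the count the text has in mind. The group labels and the `count_instructions` helper are informal names chosen for illustration.

```python
# RV32I base integer instructions, grouped informally by function.
# Assumption: mnemonics follow the ratified RV32I v2.1 listing.
RV32I = {
    "upper immediate": ["LUI", "AUIPC"],
    "jump": ["JAL", "JALR"],
    "branch": ["BEQ", "BNE", "BLT", "BGE", "BLTU", "BGEU"],
    "load": ["LB", "LH", "LW", "LBU", "LHU"],
    "store": ["SB", "SH", "SW"],
    "register-immediate": [
        "ADDI", "SLTI", "SLTIU", "XORI", "ORI", "ANDI",
        "SLLI", "SRLI", "SRAI",
    ],
    "register-register": [
        "ADD", "SUB", "SLL", "SLT", "SLTU",
        "XOR", "SRL", "SRA", "OR", "AND",
    ],
    "fence and system": ["FENCE", "ECALL", "EBREAK"],
}

# Moved out of the base in later specification versions (Zifencei, Zicsr).
FORMERLY_IN_BASE = [
    "FENCE.I", "CSRRW", "CSRRS", "CSRRC", "CSRRWI", "CSRRSI", "CSRRCI",
]


def count_instructions(groups):
    """Return the total number of mnemonics across all groups."""
    return sum(len(mnemonics) for mnemonics in groups.values())


for group, mnemonics in RV32I.items():
    print(f"{group:20s} {len(mnemonics):2d}  {' '.join(mnemonics)}")
print("base total:", count_instructions(RV32I))
print("with former base members:", count_instructions(RV32I) + len(FORMERLY_IN_BASE))
```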

What’s the book about?

“Teach you how to Design CPU” introduces the RISC-V architecture systematically and comprehensively in easy-to-understand language, and uses the open source Hummingbird E200 series processor core to explain CPU design techniques in an accessible way, with plenty of illustrations. It reflects the author’s deep professional skill and his ability to convey specialist knowledge in plain terms. Notably, the author adds a great deal of background reading and personal notes to his introduction to the RISC-V architecture, which makes otherwise dry technical material approachable and valuable. This is a carefully crafted book built on the author’s many years of experience. It is well worth reading and should do much to promote the spread of the RISC-V architecture in China. As one of the few introductions to RISC-V in Chinese, it has the potential to become a classic in the field.

The contents of this book

Part 1: Overview of CPU and RISC-V

Chapter 1 The Three Lives of the CPU

1.1 Watch Him Raise His Tower, Watch Him Feast His Guests, Watch His Tower Collapse: the many faces of the CPU

1.1.1 ISA: the soul of the CPU

1.1.2 CISC and RISC

1.1.3 32-bit and 64-bit architectures

1.1.4 The many faces of ISAs

1.1.5 CPU fields

1.2 ISA, Please Take the Blame: why domestic CPUs have not been successful enough

1.2.1 MIPS series: Loongson and Junzheng

1.2.2 x86 series: Beidazhong Zhi, Zhaoxin and Haiguang

1.2.3 Power series: Sinochem

1.2.4 Alpha: Shenwei

1.2.5 ARM series: Feiteng, Huawei HiSilicon, Spreadtrum and Huaxitong

1.2.6 ISA, the scapegoat

1.3 Life Is Hard Enough, Why Expose It: the helplessness of CPU practitioners

How Lonely It Is to Be Invincible: ARM rules the world

1.4.1 Enjoying Alone and Enjoying Together: ARM’s profit model

1.4.2 Small Body, Great Power: the ubiquitous Cortex-M series

1.4.3 King of Mobile: the huge success of the Cortex-A series in handheld devices

1.4.4 The Advancing Giant: ARM’s ambition to enter the PC and server fields

1.5 Sunrise in the East, Rain in the West: RISC-V takes the stage

1.6 So That Is the Kind of “Chip” You Are: ARM’s free plan

1.7 You can design your own processor

Chapter 2 The Greatest Truths Are the Simplest: the soul of the RISC-V architecture

2.1 Simplicity Is Beauty: the design philosophy of the RISC-V architecture

2.1.1 Free of ailments, light of body: the length of the architecture documents

2.1.2 Flexible and extendable: the modular instruction set

2.1.3 Concentrated essence: the number of instructions

2.2 Introduction to the RISC-V instruction set architecture

2.2.1 Modular instruction subsets

2.2.2 Configurable general-purpose register file

2.2.3 Regular instruction encoding

2.2.4 Concise memory access instructions

2.2.5 Efficient branch and jump instructions

2.2.6 Concise subroutine calls

2.2.7 No condition-code execution

2.2.8 No branch delay slot

2.2.9 No zero-overhead hardware loops

2.2.10 Concise arithmetic instructions

2.2.11 Elegant compressed instruction subset

2.2.12 Privileged modes

2.2.13 CSR registers

2.2.14 Interrupts and exceptions

2.2.15 Vector instruction subset

2.2.16 Custom instruction extensions

2.2.17 Summary and comparison

2.3 RISC-V software toolchain

2.4 How RISC-V differs from other open architectures

2.4.1 Hero of the Common People: OpenRISC

2.4.2 SPARC

2.4.3 Top Student from an Elite School: RISC-V

Chapter 3 RISC-V, Commercial and Open Source

3.1 Review of commercial and open source versions

3.1.1 Rocket Core

3.1.2 BOOM Core (open source)

3.1.3 Freedom SoC

3.1.4 LowRISC SoC (open source)

3.1.5 PULPino Core and SoC

3.1.6 PicoRV32 Core (open source)

3.1.7 SCR1 Core (open source)

3.1.8 ORCA Core (open source)

3.1.9 Andes Core (commercial IP)

3.1.10 Microsemi Core (commercial IP)

3.1.11 Codasip Core (commercial IP)

3.1.12 Hummingbird E200 Core & SoC (open source)

3.2 Summary

Chapter 4 China’s First Open Source RISC-V: the Hummingbird E200 series ultra-low-power core and SoC

4.1 The distinctive Hummingbird E200 processor

4.2 Introduction to the Hummingbird E200: small as a hummingbird, yet fully equipped

4.3 Hummingbird E200 model series

4.4 Hummingbird E200 performance figures

4.5 The companion SoC for the Hummingbird E200

4.6 Hummingbird E200 configuration options

Part 2: Designing a CPU in Verilog, step by step

Chapter 5 See the Forest First, Then the Trees: Hummingbird E200 design overview and top-level introduction

5.1 Overview of processor hardware design

5.1.1 Architecture and microarchitecture

5.1.2 CPU, processor, core, and processor core

5.1.3 Characteristics of processor design and verification

5.2 Design philosophy of the Hummingbird E200 processor core

5.3 Introduction to the RTL coding style of the Hummingbird E200 processor core

5.3.1 Using standard DFF modules to instantiate registers

5.3.2 Prefer assign over if-else and case

5.3.3 Other points to note

5.3.4 Summary

5.4 Module hierarchy of the Hummingbird E200

5.5 Hummingbird E200 processor core source code

5.6 Hummingbird E200 processor core configuration options

5.7 RISC-V instruction subsets supported by the Hummingbird E200 processor core

5.8 Hummingbird E200 processor pipeline

5.9 Top-level interfaces of the Hummingbird E200 processor core

5.10 Summary

Chapter 6 A Pipeline Is Not a Running Account: introduction to the Hummingbird E200 pipeline

6.1 Overview of processor pipelining

6.1.1 Starting from the classic five-stage pipeline

6.1.2 Can we do without a pipeline? The relationship between pipelines and state machines

6.1.3 Water Chestnuts in Deep Water, Rice in Shallow, Lotus in Between: the depth of the pipeline

6.1.4 Growing upward: deeper and deeper pipelines

Growing downward: shallower and shallower pipelines

6.1.6

6.2 Out-of-order execution in the processor pipeline

6.3 Back-pressure in the processor pipeline

6.4 Conflicts in the processor pipeline

6.4.1 Resource conflicts in the pipeline

6.4.2 Data conflicts in the pipeline

6.5 Pipeline of the Hummingbird E200 processor

6.5.1 Overall structure of the pipeline

6.5.2 Conflicts in the pipeline

6.6 Summary

Chapter 7 Is the Beginning Always the Hardest? It all starts with instruction fetch

7.1 Overview of instruction fetch

7.1.1 Characteristics of instruction fetch

7.1.2 How to fetch instructions quickly

7.1.3 How to handle unaligned instructions

7.1.4 How to handle branch instructions

7.2 How RISC-V architecture features simplify instruction fetch

7.2.1 Regular instruction encoding format

7.2.2 Instruction length indicator placed in the low bits

7.2.3 Simple branch and jump instructions

7.2.4 No branch delay slot instructions

7.2.5 A clear basis for static branch prediction

7.2.6 A clear basis for the RAS

7.3 Instruction fetch implementation of the Hummingbird E200 processor

7.3.1 Overall design of the IFU

7.3.2 Mini-Decode

7.3.3 Simple-BPU branch prediction

7.3.4 PC generation

7.3.5 Access to the ITCM and BIU

7.3.6 ITCM

7.3.7 BIU

7.4 Summary

Chapter 8 Execution Is the Key: execute

8.1 Overview

8.1.1 Instruction decoding

8.1.2 Instruction execution

8.1.3 Pipeline conflicts

8.1.4 Instruction commit

8.1.5 Order of instruction issue, dispatch, execution and write-back

8.1.6 Branch resolution

8.1.7 Summary

8.2 How RISC-V architecture features simplify execution

8.2.1 Regular instruction encoding format

8.2.2 Elegant 16-bit instructions

8.2.3 Streamlined number of instructions

8.2.4 Integer instructions are all two-operand

8.3 Execute implementation of the Hummingbird E200 processor

8.3.1 List of executed instructions

8.3.2 Overall design of the EXU

8.3.3 Decoding

8.3.4 Integer general-purpose register file

8.3.5 CSR registers

8.3.6 Instruction issue and dispatch

8.3.7 Pipeline conflicts, long instructions, and the OITF

8.3.8 ALU

8.3.9 High-performance multiplication and division

8.3.10 Floating-point unit

8.3.11 Commit

8.3.12 Write-back

8.3.13 Coprocessor extension

8.3.14 Summary

Chapter 9 Many Begin Well, Few Finish Well: commit

9.1 Commit, cancel and flush in processors

9.1.1 Introduction to commit, cancel and flush in processors

9.1.2 Common implementation strategies for processor commit

9.2 How RISC-V architecture features simplify commit

9.3 Commit hardware implementation of the Hummingbird E200 processor

9.3.1 Handling of branch prediction instructions

9.3.2 Handling of interrupts and exceptions

9.3.3 Commit of multi-cycle instructions

9.3.4 Summary

Chapter 10 Let the Bullets Fly for a While: write-back

10.1 Processor write-back

10.1.1 Processor write-back

10.1.2 Common write-back strategies

10.2 Write-back hardware implementation of the Hummingbird E200 processor

10.2.1 Final write-back arbitration

10.2.2 OITF and long-instruction write-back arbitration

10.2.3 Summary

Chapter 11 Harvard or BYD: memory architecture

11.1 Overview of memory architecture

Who says a processor must have a cache

11.1.2 The processor must have memory

11.1.3 ITCM and DTCM

11.2 How RISC-V architecture features simplify memory access instructions

11.2.1 Only little-endian format is supported

11.2.2 No address auto-increment/decrement mode

11.2.3 No load-multiple or store-multiple instructions

11.3 Memory-related instructions of the RISC-V architecture

11.3.1 Load and Store instructions

11.3.2 Fence instructions

11.3.3 “A” extension instructions

11.4 Memory subsystem hardware implementation of the Hummingbird E200 processor

11.4.1 Overall design of the memory subsystem

11.4.2 AGU

11.4.3 LSU

11.4.4 ITCM and DTCM

11.4.5 Handling of “A” extension instructions

11.4.6 Handling of fence and fence.i instructions

11.4.7 BIU

11.4.8 ECC

11.4.9 Summary

Chapter 12 The Black Box’s Window: the bus interface unit (BIU)

12.1 Overview of on-chip bus protocols

12.1.1 AXI

12.1.2 AHB

12.1.3 APB

12.1.4 TileLink

12.1.5 Summary and comparison

12.2 The custom bus protocol ICB

12.2.1 ICB bus protocol overview

12.2.2 ICB bus protocol signals

12.2.3 ICB bus protocol timing

12.3 Hardware implementation of the ICB bus

12.3.1 One master, multiple slaves

12.3.2 Multiple masters, one slave

12.3.3 Multiple masters, multiple slaves

12.4 Hummingbird E200 processor core BIU

12.4.1 Introduction to the BIU

12.4.2 BIU microarchitecture

12.4.3 BIU source code analysis

12.5 Hummingbird E200 processor SoC bus

12.5.1 Introduction to the SoC bus

12.5.2 SoC bus microarchitecture

12.5.3 SoC bus source code analysis

12.6 Summary

Chapter 13 Stories That Have to Be Told: interrupts and exceptions

13.1 Interrupts and exceptions

13.1.1 Interrupt overview

13.1.2 Exception overview

13.1.3 Exceptions in the broad sense

13.2 RISC-V architecture exception handling mechanism

13.2.1 Entering an exception

13.2.2 Exiting an exception

13.2.3 Exception service routine

13.3 RISC-V architecture interrupts

13.3.1 Interrupt types

13.3.2 Interrupt masking

13.3.3 Interrupt pending

13.3.4 Interrupt priority and arbitration

13.3.5 Interrupt nesting

13.3.6 Summary and comparison

13.4 CSR registers related to RISC-V architecture exceptions

13.5 Hardware implementation of Hummingbird E200 exception handling

13.5.1 Implementation points for exceptions and interrupts in the Hummingbird E200 processor

13.5.2 Exception types of the Hummingbird E200 processor

13.5.3 mepc handling in the Hummingbird E200 processor

13.5.4 Interrupt interface of the Hummingbird E200 processor

CLINT microarchitecture and source code analysis of the Hummingbird E200 processor

PLIC microarchitecture and source code analysis of the Hummingbird E200 processor

13.5.7 Handling of interrupts and exceptions in the Hummingbird E200 processor commit module

13.5.8 Summary

Chapter 14 Least Visible, Yet Hardest: the debug mechanism

14.1 Debug mechanism overview

14.1.1 Interactive debugging overview

14.1.2 Trace debugging overview

14.2 Debug mechanism of the RISC-V architecture

14.2.1 Implementation of debugger software

14.2.2 Debug mode

14.2.3 Debug instructions

14.2.4 Debug mechanism CSRs

14.2.5 Debug interrupt

14.3 Hardware implementation of the Hummingbird E200 debug mechanism

14.3.1 Hummingbird E200 interactive debugging overview

14.3.2 DTM module

14.3.3 Hardware debug module

14.3.4 Debug interrupt

14.3.5 Implementation of the debug CSR registers

14.3.6 Implementation of debug mechanism instructions

14.3.7 Summary

Chapter 15 Swift as a Hare, Still as a Maiden: low-power techniques

15.1 Overview of processor low-power techniques

15.1.1 Low power at the software level

15.1.2 Low power at the system level

15.1.3 Low power at the processor level

15.1.4 Low power at the unit level

15.1.5 Low power at the register level

15.1.6 Low power at the latch level

15.1.7 Low power at the SRAM level

15.1.8 Low power at the combinational logic level

15.1.9 Low power at the process level

15.2 Low-power mechanism of the RISC-V architecture

WFI instruction

15.3 Hardware implementation of the Hummingbird E200 low-power mechanism

15.3.1 Hummingbird E200 low power at the system level

15.3.2 Hummingbird E200 low power at the processor level

15.3.3 Hummingbird E200 low power at the unit level

15.3.4 Hummingbird E200 low power at the register level

15.3.5 Hummingbird E200 low power at the latch level

15.3.6 Hummingbird E200 low power at the SRAM level

15.5.7 Hummingbird E200 low power at the combinational logic level

15.5.8 Hummingbird E200 low power at the process level

15.4 Summary

Chapter 16 To Do a Good Job, First Sharpen Your Tools: the RISC-V extensible coprocessor

16.1 Domain-specific architecture (DSA)

16.2 Extensibility of the RISC-V architecture

16.2.1 RISC-V reserved instruction encoding space

16.2.2 RISC-V predefined custom instructions

16.3 Hummingbird E200 coprocessor interface EAI

16.3.1 Encoding of EAI instructions

16.3.2 EAI interface signals

16.3.3 EAI pipeline interface

EAI memory interface

16.3.5 EAI interface timing

16.4 Hummingbird E200 coprocessor example

16.4.1 Example coprocessor requirements

16.4.2 Example coprocessor instructions

16.4.3 Example coprocessor implementation

16.4.4 Example coprocessor performance

16.4.5 Example coprocessor code

Part 3: Running Verilog simulation and running software on an FPGA SoC prototype

Chapter 17 Smoke Test First: running Verilog simulation tests

17.1 E200 open source project code hierarchy

17.2 E200 open source project test cases

17.2.1 riscv-tests test cases

17.2.2 Compiling the ISA self-test cases

17.3 E200 open source project test platform (TestBench)

17.4 Running test cases on the Verilog TestBench

Chapter 18 Put on the Shell and Hit the Road: implementing the SoC and FPGA prototype

18.1 Freedom E310 SoC overview

18.2 HBird-E200-SoC overview

18.2.1 Composition of the HBird-E200-SoC

18.2.2 HBird-E200-SoC code structure

18.3 HBird-E200-SoC FPGA prototype platform

18.3.1 FPGA development board

18.3.2 Generating the MCS file and programming the FPGA

18.3.3 JTAG debugger

18.3.4 Summary of the DIY FPGA prototype platform

18.4 Dedicated FPGA development board for the Hummingbird E200

Chapter 19 The Finishing Touch: running and debugging software examples

19.1 Introduction to the Freedom-E-SDK platform

Introduction to the SIRV-E-SDK platform

19.2.1 Introduction to the SIRV-E-SDK

19.2.2 SIRV-E-SDK code structure

19.3 Running the sample program with the SIRV-E-SDK

19.4 Debugging the sample program with GDB and OpenOCD

19.5 Windows graphical IDE development tools

Chapter 20 Mule or Horse? Take It Out for a Run: running benchmark programs

20.1 Introduction to benchmark programs

20.2 Dhrystone overview

20.3 Running the Dhrystone benchmark

20.4 CoreMark overview

20.5 Running the CoreMark benchmark

Summary and comparison

Appendix A Introduction to the RISC-V architecture instruction set

Appendix B Introduction to the RISC-V architecture CSR registers

Appendix C Introduction to the RISC-V architecture PLIC

Appendix D Background on memory models

Appendix E Background on atomic memory operation instructions

Appendix F List of RISC-V instruction encodings

Appendix G List of RISC-V pseudoinstructions

Related books

RISC-V Processor Design

By Hu Zhenbo

(Published May 2018)


This is an introductory book on general-purpose CPU design. In plain language it systematically introduces CPUs and the RISC-V architecture, aiming to demystify CPU design for readers and open the door to computer architecture.


The book is divided into four parts. Part 1 is an overview of CPUs and RISC-V that helps beginners quickly build an understanding of both. Part 2 explains how to design a CPU in Verilog, so that readers can grasp the essentials of processor core design. Part 3 mainly introduces the SoC and software platform of the Hummingbird E203, enabling readers to run the Hummingbird E203 RISC-V processor on an FPGA prototype platform. Part 4 is the appendices, which introduce the RISC-V instruction set architecture, supplemented by background explanations and notes added by the author to aid understanding.

Operating System True Restore

By Zheng Gang


Easy to read and easy to learn from: in humorous language, the author explains operating system principles in depth, and after working through the book readers can build their own operating system. Operating systems are not as esoteric as they seem, and this book offers an authoritative explanation. The work took 19 months, more than 600,000 words and more than 6,000 lines of code to implement a complete operating system.

The book uses humor to explain the operating system as clearly as possible, so that readers absorb difficult material through easy reading. Readers come away not only understanding operating systems but able to build one themselves. It is a rare, good book.

“Homemade Programming Language — Based on C”

By Zheng Gang

(Published May 2018)


“Purely handmade”: no third-party libraries or tools are required, so readers come to understand the principle and implementation of every detail. The language implemented is an object-oriented scripting language, and the book covers the implementation of its virtual machine, giving readers a look at how scripting languages work internally.

Today’s interaction

How do you feel about designing your own CPU? Leave a comment and share this activity to your WeChat Moments before 17:00 on April 26.

The editor will randomly select 5 readers to receive a free 100-yuan Asynchronous Community voucher for e-reading editions (the comment with the most likes automatically wins one).



