Recently I have received many questions from back-end developers: Why aren't Go web frameworks as fast as Java's? Why are so many projects originally written in Java being rewritten in Go as open source? Is Java doomed by the rise of containers? Is Java, a back-end evergreen for more than 20 years, really going downhill? Orange invited engineers from the Tao technical department to answer these questions, and everyone is welcome to join the discussion.
Q: Why aren't Go web frameworks as fast as Java's?
Wind chess: This is a contest among masters (Huashan Lunjian), so before saying anything, let's simply run a performance analysis of each framework.
Frameworks target different application scenarios and therefore emphasize different optimizations. The detailed analysis follows.
Overview of the HTTP server
Start with how a simple web server processes a request:
The net layer reads the data packet, the HTTP decoder parses the protocol, and the router finds the matching handler callback. The handler runs the business logic and sets the response status code and body, the HTTP encoder encodes the response, and finally the net layer writes the data out.
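The stages described above can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions: the `request`, `response`, `decode`, `route`, `encode` and `serve` names are hypothetical, the decoder reads only the request line, and the net layer is replaced by plain byte slices. It is not how any of the benchmarked frameworks is implemented.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// request and response are deliberately tiny, hypothetical types.
type request struct{ method, path string }

type response struct {
	status int
	body   []byte
}

type handler func(req request) response

// routes is a fixed routing table: path -> handler callback.
var routes = map[string]handler{
	"/plaintext": func(request) response {
		return response{status: 200, body: []byte("Hello, World!")}
	},
}

var statusText = map[int]string{200: "OK", 400: "Bad Request", 404: "Not Found"}

// decode is the "HTTP decoder": it parses only the request line
// ("GET /path HTTP/1.1") and ignores headers and body.
func decode(raw []byte) (request, bool) {
	line, _, found := bytes.Cut(raw, []byte("\r\n"))
	if !found {
		return request{}, false
	}
	parts := bytes.Split(line, []byte(" "))
	if len(parts) != 3 {
		return request{}, false
	}
	return request{method: string(parts[0]), path: string(parts[1])}, true
}

// route finds the handler for the request path, falling back to a 404 handler.
func route(req request) handler {
	if h, ok := routes[req.path]; ok {
		return h
	}
	return func(request) response {
		return response{status: 404, body: []byte("not found")}
	}
}

// encode is the "HTTP encoder": status line, Content-Length header, body.
func encode(resp response) []byte {
	out := []byte("HTTP/1.1 " + strconv.Itoa(resp.status) + " " + statusText[resp.status] + "\r\n")
	out = append(out, "Content-Length: "+strconv.Itoa(len(resp.body))+"\r\n\r\n"...)
	return append(out, resp.body...)
}

// serve chains the stages: decode -> route -> handler -> encode.
// In a real server the input comes from a socket read and the output
// goes to a socket write.
func serve(raw []byte) []byte {
	req, ok := decode(raw)
	if !ok {
		return encode(response{status: 400, body: []byte("bad request")})
	}
	return encode(route(req)(req))
}

func main() {
	fmt.Printf("%q\n", serve([]byte("GET /plaintext HTTP/1.1\r\nHost: example\r\n\r\n")))
}
```

Each function corresponds to one module of the pipeline, which is where the frameworks compared below spend their own CPU time.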
The layers below net are controlled by the kernel. Many optimization strategies exist there, but since the comparison here is mainly about the web frameworks themselves, optimization below the net layer is not considered for now.
Looking at the source code of the benchmark implementations provided by TechEmpower, nearly all frameworks are built on epoll, so the performance gap between them comes mainly from the modules described above.
A brief description of the benchmarks
TechEmpower's performance rankings cover seven tests: JSON Serialization, Single Query, Multiple Queries, Cached Queries, Fortunes, Data Updates and Plaintext.
JSON Serialization encodes a fixed JSON structure and returns it ({"message": "Hello, World!"}). Single Query is a single DB query, and Multiple Queries is several DB queries. Cached Queries fetches several objects from an in-memory database and returns them as JSON. Fortunes returns a page after rendering it. Data Updates writes to the DB, and Plaintext is the simplest test: it returns a fixed string.
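To make the two lightest tests concrete, here is a minimal sketch of what a JSON Serialization handler and a Plaintext handler do. It assumes Go's standard `net/http` package rather than any of the benchmarked frameworks, and the names `jsonHandler`, `plaintextHandler` and `call` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// message is the fixed structure encoded by the JSON Serialization test.
type message struct {
	Message string `json:"message"`
}

// jsonHandler serializes the fixed structure on every request.
func jsonHandler(w http.ResponseWriter, r *http.Request) {
	body, err := json.Marshal(message{Message: "Hello, World!"})
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(body)
}

// plaintextHandler returns a fixed string with no encoding work at all.
func plaintextHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain")
	w.Write([]byte("Hello, World!"))
}

// call runs a handler in-process (no socket involved) and returns the
// response content type and body.
func call(h http.HandlerFunc, path string) (contentType, body string) {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	ct, body := call(jsonHandler, "/json")
	fmt.Println(ct, body)
	ct, body = call(plaintextHandler, "/plaintext")
	fmt.Println(ct, body)
}
```

Because the handlers do so little, almost all measured cost in these two tests comes from the framework and the network layer rather than from business logic.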
Here, JSON encoding, DB operations, page rendering and returning a fixed string are the business logic. As the business logic becomes heavier (more time-consuming), it gradually becomes the bottleneck. The DB tests, for example, mainly measure the performance of the DB library and of the DB itself, and the cost of the framework's own basic functions matters less and less. (In Round 19 on physical machines, Plaintext QPS is at the 7 million level, while Data Updates is at the 10 thousand level.) Therefore we mainly analyze the JSON Serialization and Plaintext rankings, which reflect the HTTP performance of the framework itself relatively well.
In Round 19 JSON Serialization, firenio-http-lite is the highest-ranked Java framework and fasthttp-easyjson-prefork (QPS: 1,336,333) is the highest-ranked Go framework. According to this data, Java has the higher performance.
Apart from read and write, JSON serialization (the equivalent of business logic) accounts for 4.5%, and fasthttp itself (HTTP decoder, HTTP encoder, router) accounts for 15%. Looking only at JSON Serialization, Java appears to perform better than Go.
The Plaintext test is actually run in HTTP pipeline mode. In Round 19, Java and Go have almost the same QPS. In a test run after Round 19, gnet came second among all languages, although the QPS of the first two frameworks were actually very different.
In fact, the main bottleneck at this point is the net layer. Go's official net library includes goroutine-related logic, whereas a framework such as gnet that operates on epoll directly has less overhead in this respect. Java NIO also operates on epoll directly.
AppendFormat, which formats the current time, takes up 30% of the CPU.
You can pre-format the time as shown below, which significantly reduces the cost of getting the current time in exchange for one-second accuracy.
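```go
import (
	"sync/atomic"
	"time"
)

// timetick caches the pre-formatted HTTP date as a []byte.
var timetick atomic.Value

// NowTimeFormat returns the cached date, refreshed once per second.
func NowTimeFormat() []byte {
	return timetick.Load().([]byte)
}

// tickloop re-formats the current time once per second.
func tickloop() {
	timetick.Store(nowFormat())
	for range time.Tick(time.Second) {
		timetick.Store(nowFormat())
	}
}

// nowFormat formats the current time. The layout hard-codes "GMT",
// so the time is converted to UTC first.
func nowFormat() []byte {
	return []byte(time.Now().UTC().Format("Mon, 02 Jan 2006 15:04:05 GMT"))
}

func init() {
	timetick.Store(nowFormat()) // make sure Load never sees an empty value
	go tickloop()
}
```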
After this optimization, the next bottleneck is memory allocation in the runtime, because some parts of the benchmark code still do not reuse memory.
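The source does not show which parts of the benchmark code allocate, so the following is only a generic sketch of memory reuse in Go, assuming `sync.Pool` from the standard library. The names `bufPool`, `writeGreeting` and `greet` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool hands out reusable buffers so that a fresh buffer does not have
// to be allocated for every request.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// writeGreeting builds the response in a pooled buffer, writes it to w,
// and returns the buffer to the pool for the next caller.
func writeGreeting(w io.Writer, name string) error {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from a previous use
	defer bufPool.Put(buf)

	buf.WriteString("Hello, ")
	buf.WriteString(name)
	buf.WriteString("!")
	_, err := w.Write(buf.Bytes())
	return err
}

// greet is a small helper that captures the output as a string.
func greet(name string) string {
	var sb strings.Builder
	writeGreeting(&sb, name)
	return sb.String()
}

func main() {
	fmt.Println(greet("gnet"))
}
```

The buffer's bytes are written out before the buffer goes back to the pool. Keeping a reference to `buf.Bytes()` after `Put` would be exactly the kind of mistake that makes memory reuse risky.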
In fact, gnet's own overhead is already very small, and the C++ framework ulib is likewise benchmarked with very simple HTTP encoding and decoding.
Analysis
For the frameworks in this test, the main influencing factors are as follows:
1. Simple HTTP built directly on epoll: there is no full HTTP decoder or router (for example, gnet and ulib simply concatenate bytes and use a fixed-route handler callback).
2. Zero copy and memory reuse: bytes are handled internally without copying. Go's official HTTP library does not use zero copy, in order to reduce the error rate for developers, who might inadvertently reference data that has already been returned to the buffer pool and cause concurrency issues that go unnoticed. Memory reuse, on the other hand, is something most frameworks already do to a greater or lesser extent.
3. Prefork: note that some Go frameworks use prefork processes (fasthttp-prefork, for example), forking multiple child processes that share the same listen FD. Each process runs concurrently on a single core (1 P), which avoids lock contention and goroutine scheduling costs inside the Go runtime (although some leftover overhead from concurrency-related and goroutine scheduling code remains).
4. Performance differences in the languages themselves.
For the first point, simplifying the codecs and routing improves performance but often reduces the framework's ease of use. Ordinary services never see QPS this high, so ease of use and scalability should also be considered when selecting a framework, along with the complexity of integrating the frameworks used by the company's existing middleware and SDKs.
For the second point, a network proxy that involves no business-side development can often use true end-to-end zero copy. A framework offered for business development, however, has to consider the probability of business-side errors, and sacrificing some performance is usually worth it.
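The risk described in the second point can be reproduced in a few lines. This is an illustrative sketch only: the `conn` type and its `read` method are hypothetical and merely simulate a connection that reuses a single read buffer, as a zero-copy framework might.

```go
package main

import "fmt"

// conn simulates a connection whose read buffer is reused for every request.
type conn struct {
	buf []byte
}

// read copies the next "packet" into the reused buffer and returns a view
// into that buffer. No new memory is allocated for the returned slice.
func (c *conn) read(packet string) []byte {
	n := copy(c.buf, packet)
	return c.buf[:n]
}

// demo keeps two versions of the first request: a zero-copy view and a
// defensive copy. It then reads a second request into the same buffer.
func demo() (aliased, copied string) {
	c := &conn{buf: make([]byte, 64)}

	first := c.read("GET /a")
	kept := first                         // shares memory with c.buf
	safe := append([]byte(nil), first...) // owns its own memory

	c.read("GET /b") // the next request overwrites the shared buffer

	return string(kept), string(safe)
}

func main() {
	aliased, copied := demo()
	fmt.Println("zero-copy view:", aliased) // now shows the second request
	fmt.Println("defensive copy:", copied)  // still shows the first request
}
```

Business code that holds on to `kept` sees its data silently replaced, which is the class of bug that a copying framework trades some performance to prevent.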
For the third point, prefork: Java frameworks such as Netty operate on threads directly, which allows more customized performance tuning, whereas Go's goroutines are a general-purpose coroutine mechanism designed to make concurrent programs easier to write. At this level, performance is inevitably somewhat behind a very well-optimized thread-based Java framework. On the other hand, operating on threads directly means controlling the number of threads, which can be a tuning headache (especially for beginners), while goroutines let code stay elegant and concise without any concern for pool size, which improves engineering quality. In addition, prefork exists because Go cannot operate on threads directly. fasthttp provides prefork to further improve performance by achieving with multiple processes what Java does with multiple threads.
For the fourth point, the Java language itself is more mature. With the JVM's JIT capability, hot code performs not much differently from code in a compiled language such as Go. Moreover, Go's compiler is not yet particularly mature (escape analysis, for example, still has issues), and Go's memory model and GC are also less mature than Java's. It is also worth noting that Go frameworks have not reached the same level of maturity as Java's, but all of this will mature over time.
In summary, the value of benchmark data for these frameworks is that it shows the performance ceiling and helps determine how much room, and what ROI, remains for further optimization. The choice of framework depends on the usage scenario, performance, ease of use, scalability, stability and the company's internal ecosystem. Language and performance are only some of the factors.
Frameworks target different application scenarios, which leads to different optimization priorities. Spring Web, for example, sacrifices performance for ease of use, scalability and stability, and it also has a large community and user base. In the Service Mesh sidecar scenario, Go's built-in support for concurrent programming, small memory footprint, fast startup and compiled nature make it more suitable than Java.
(Note: I actually built the frameworks from the code and Dockerfile above and ran the same benchmark script. On a dedicated 4-core Alibaba Cloud machine, the JSON Serialization performance of Go's fasthttp-easyjson-prefork was more than 30% higher than that of Java's wizzardo-http and firenio-http-lite, with lower latency as well, which may be kernel related.)
Q: Why are so many projects originally written in Java being rewritten in Go as open source?
Empty mon: At its core, Java versus Go is a question of ecosystem.
An ecosystem goes through several stages: inception, growth, prosperity, stagnation and extinction. Java is at least still in the prosperity stage, while Go is still in the growth stage. The stages differ hugely in the number and quality of developers, the richness of open source capabilities and the level of engineering support. In addition, each company has its own small internal ecosystem at its own stage, which also affects technology selection.
Go's current popularity owes much to the cloud native wave: with K8s and its operators, the Go language carries a halo of its own. Various middleware capabilities are sinking into the infrastructure and integrating with K8s, which drives a wave of Go implementations of basic middleware. These basic middleware capabilities are relatively limited, however, such as RPC, config and message queues. What such middleware, and cloud native K8s itself, should offer the business layer above is language neutrality, so that business teams can choose based on the company's small ecosystem and the technology ecosystem of each language. Forcing business teams to develop in Go as well would be unreasonable.
To sum up, integrating basic middleware capabilities with K8s does call for Go, but the other capabilities of the wider open source ecosystem do not necessarily need it. Business development should choose the most suitable language according to the company's ecosystem and the technology ecosystem, and should not follow trends blindly and end up in an awkward position regarding people, open source capabilities and engineering support. Whether Go can be widely used for business development remains to be seen.
Q: Is Java doomed by the rise of containers?
Xuanli: In recent years, container-centered cloud native technology has greatly improved the scalability and collaboration of server-side deployment, and the importance of the choice of development language has been weakened to some extent. But the Java language itself remains vibrant.
After all, R&D output efficiency is also a key consideration. Java's complete and large developer ecosystem provides a richer set of class libraries and frameworks than most languages, and together with Java's powerful IDE tooling, it often achieves more with less effort.
There are also languages derived from the Java ecosystem (such as Scala) that are becoming more flexible and useful.
In big data, meanwhile, Java continues to shine, with well-known examples such as ES, Kafka, Spark and Hadoop.
When we evaluate and predict the viability of a technology, we tend not to look at the technology alone but at the whole ecosystem behind it. A technology with strong vitality is usually backed by a mature ecosystem. As mentioned above, Java has a complete and huge ecosystem in many fields, so we believe Java's vitality is still strong.
That said, for well-known reasons, Java objectively has certain limitations in use. In container scenarios, for example, the memory settings of Java processes need to be configured carefully.
In general, Java is still very much alive and well in the cloud native scenario.