Original address: ferrous-systems.com/blog/rust-a…

Original author: [email protected]

Published: October 28, 2020

In this article, we will compare an apple (IntelliJ Rust) and an orange (rust-analyzer) to draw general and comprehensive conclusions. Specifically, I would like to present a case study in support of the following claim:

For complex applications, Rust is just as productive as Kotlin.

To me, this is an unusual claim to make. I always thought it was the other way around, and I am not so sure even now. I came to Rust from C++. I think it is an excellent low-level language, and I have always been puzzled by how people write high-level things in Rust. Obviously, choosing Rust means taking a productivity hit, and if you can afford a GC, it makes more sense to use Kotlin, C#, or Go. That objection tops my personal list of criticisms of Rust.

What shifted my position was my experience as the lead developer of both rust-analyzer and IntelliJ Rust. Let me introduce the two projects.

IntelliJ Rust is a plug-in for the IntelliJ platform that provides Rust support. In effect, it is a Rust compiler front end, written in Kotlin, that takes advantage of the platform's language-support machinery: lossless syntax trees, a parser generator, and persistence and indexing infrastructure. However, because programming languages vary widely, much of the logic for analyzing Rust is implemented in the plug-in itself. Presentation features such as the completion list come from the platform, but most of the language semantics are written by hand. IntelliJ Rust also includes a bit of Swing GUI code.

Rust-analyzer is an implementation of the Language Server Protocol for Rust. It is a Rust compiler front end written from scratch with IDE support in mind. It makes heavy use of the salsa library for incremental computation. Besides the compiler proper, rust-analyzer includes the code that manages the long-running, multithreaded language server process itself.

The two projects have roughly the same scope: an IDE-oriented compiler front end for Rust. The two biggest differences are:

  • IntelliJ Rust is a plug-in, so it can reuse code and design patterns from the surrounding platform.

  • Rust-analyzer is a second system: it was designed from scratch, drawing on the experience of IntelliJ Rust.

The internal architectures of the two projects are also quite different. In terms of the three IDE architectures, IntelliJ Rust is map-reduce based, while rust-analyzer is query-based.

Writing a compiler suitable for an IDE is a high-level task. There is no need to talk directly to the operating system. There are some fancy data structures and a bit of concurrency here and there, but those are high-level as well: the point is not implementing crazy lock-free schemes, but keeping the application state sane in a multithreaded world. The bulk of a compiler is symbolic manipulation, for which Lisp is arguably the best fit. There is nothing inherently disadvantageous about picking a garbage-collected language (such as OCaml) for a task like this.

At the same time, the task is quite complex and unusual: when implementing a feature, the ratio of "your code" to "framework code" is much higher than in a typical CRUD back end.

Now that the projects are introduced, let's look at two roughly comparable points in their histories:

  • Github.com/intellij-ru…

  • Github.com/rust-analyz…

At those points, both projects were about two years old, both had 1 to 1.5 full-time developers, and both had vibrant, thriving open-source contributor communities. The Kotlin code base was about 52,000 lines; the Rust one was about 66,000.

Both also offered roughly the same set of features at the time. To be honest, I am still not quite convinced 🙂. Rust-analyzer started from scratch, without a decade's worth of Java classes to build on, so the productivity gap between Kotlin and Rust should have been huge. But it is hard to argue with reality. So instead, let me reflect on my experience building both and try to explain Rust's surprising productivity.

The learning curve

Kotlin's learning curve is easy to describe: it is practically zero. I started working on IntelliJ Rust with no prior Kotlin experience and never felt the need to learn Kotlin specifically.

By the time I moved to rust-analyzer, I was already experienced with Rust. I would say that Rust definitely needs to be learned deliberately; it is hard to just pick up along the way. Ownership and control of aliasing are genuinely new concepts (even coming from C++), and it pays to approach learning them holistically. After the initial hump, things generally go smoothly.

By the way, this is the perfect place to plug our Rust courses and custom trainings 🙂. The next public Rust training takes place in December of this year.

Modularity

I think this is the most important factor. Both projects are modest in scope and in the amount of source code. I believe the only way to ship something big is to split it into independent pieces and work on those pieces separately.

I also find that most of the languages I am familiar with are bad at modularity. More broadly, I chuckle at the FP vs. OOP debate, because "why is nobody doing modules right?" seems like the more pressing question.

Rust is one of the few languages with a first-class concept of a library. Rust code is organized on two levels:

  • Interdependent modules forming a tree within a crate,

  • And a directed acyclic graph of crates.

Circular dependencies are allowed between modules, but not between crates. Crates are the unit of reuse and of privacy: only a crate's public API matters, and it is pretty clear what a crate's public API is. In addition, crates are anonymous, so there are no name conflicts and no dependency hell when several versions of the same crate are mixed into the crate graph.

This makes it very easy to make two pieces of code independent of each other (which is the essence of modularity): just put them in separate crates. During code review, only changes to Cargo.toml files need to be watched carefully.
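
As a concrete illustration of what "the public API of a crate" means (a hypothetical crate, not taken from rust-analyzer): it is simply whatever the crate's root module marks pub; everything else stays invisible to dependent crates.

// lib.rs of a hypothetical internal `syntax` crate.
// Only the `pub` items are visible to crates that depend on it;
// the helper below stays private no matter how the module tree grows.

pub struct SyntaxTree {
    text: String,
}

impl SyntaxTree {
    pub fn text(&self) -> &str {
        &self.text
    }
}

pub fn parse(text: &str) -> SyntaxTree {
    SyntaxTree { text: normalize(text) }
}

// Not part of the crate's public API.
fn normalize(text: &str) -> String {
    text.replace("\r\n", "\n")
}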

At the time of the comparison, rust-analyzer was split into 23 internal crates, plus several general-purpose crates. IntelliJ Rust, by contrast, was a single Kotlin module in which everything can depend on everything else. While the internal organization of IntelliJ Rust is very clean, it is not reflected in the file-system layout or the build system, and therefore requires constant maintenance effort.

Build system

Managing a project's build takes real time, and it multiplies the cost of everything else.

Rust’s build system, Cargo, is very good. It’s not perfect, but after Java’s Gradle, it’s a breath of fresh air.

Cargo's trick is that it does not try to be a general-purpose build system. It can only build Rust projects, and it has a rigid idea of how a project is structured. You cannot opt out of its core assumptions. Configuration is a static, non-extensible TOML file.

Gradle, by contrast, allows free-form project structure, configured in a Turing-complete language. I think I have spent more time learning Gradle than learning Rust: judging by wc -w, the Rust book weighs in at 182,817 words, while Gradle's user guide comes to 280,506.

Also, Cargo was faster than Gradle in most cases.

Of course, the biggest drawback is that custom build logic cannot be expressed in Cargo. Both projects need a fair amount of logic beyond pure compilation to deliver the final result to users. In rust-analyzer, this is handled by hand-written Rust scripts, which work perfectly well at this scale.
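
For a sense of what "build logic as a hand-written Rust script" can look like, here is a minimal sketch under invented names (the tasks package, the dist subcommand, and the my-server binary are placeholders, not rust-analyzer's actual automation): a tiny binary that shells out to Cargo and packages the result.

// Hypothetical task runner, invoked as: cargo run -p tasks -- dist
// Builds the release binary and copies it into a dist/ directory.
use std::{env, fs, process::Command};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    match env::args().nth(1).as_deref() {
        Some("dist") => {
            // Delegate the actual compilation to Cargo.
            let status = Command::new("cargo").args(["build", "--release"]).status()?;
            if !status.success() {
                return Err("cargo build failed".into());
            }
            fs::create_dir_all("dist")?;
            // `my-server` is a placeholder binary name.
            fs::copy("target/release/my-server", "dist/my-server")?;
            Ok(())
        }
        _ => Err("usage: tasks dist".into()),
    }
}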

Ecosystem

Language-level support for libraries, together with a first-class build system and package manager, lets the ecosystem flourish. Some parts of rust-analyzer are published to crates.io and reused by other projects.

In addition, the low-level nature of Rust often allows for "perfect" library interfaces: interfaces that fully reflect the underlying problem without imposing intermediate language-level abstractions.

Basic convenience

I feel that Rust is significantly more productive when it comes to the basic nuts and bolts of a language: structs, enums, functions, and so on. This is not unique to Rust; any language in the ML family has these things. However, Rust is the first industrial language to pack these capabilities into a nice bundle, unconstrained by backward compatibility. Let me list the specific features that I think make it faster to write maintainable code in Rust.

Focus on data, not behavior. In other words, Rust is not an OOP language. The core idea of OOP is dynamic dispatch: which code runs for a given call is determined at run time (late binding). This is a powerful pattern that enables flexible, extensible systems. The problem is, extensibility is expensive! It is best applied only in a few designated places; designing for extensibility by default is not cost-effective. Rust puts static dispatch front and center: just reading the code tells you what is going on, because it does not depend on the run-time type of an object.
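
To make the contrast concrete, here is a small sketch of my own (not code from either project): the enum version is resolved statically by a match, while the trait-object version postpones the decision to run time.

// Static dispatch: the set of cases is closed, and each call site can see
// exactly which code runs for each variant.
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Square { side } => side * side,
    }
}

// Dynamic dispatch: open for extension, but which `describe` runs is only
// known at run time.
trait Describe {
    fn describe(&self) -> String;
}

impl Describe for Shape {
    fn describe(&self) -> String {
        format!("a shape with area {:.2}", area(self))
    }
}

fn print_all(items: &[Box<dyn Describe>]) {
    for item in items {
        println!("{}", item.describe());
    }
}

fn main() {
    let shapes = [Shape::Circle { radius: 1.0 }, Shape::Square { side: 2.0 }];
    for shape in &shapes {
        println!("{}", area(shape));
    }
    let described: Vec<Box<dyn Describe>> = vec![Box::new(Shape::Square { side: 3.0 })];
    print_all(&described);
}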

One little thing I like about Rust's syntax is how it places fields and methods in separate blocks:

struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    fn full_name(&self) -> String { ... }
}

Being able to see all the fields at a glance makes it much easier to understand the code. Fields convey much more information than methods.

Sum types. Rust's humble enums are full algebraic data types, which means you can directly express the idea of "either this or that":

enum Either<A, B> { A(A), B(B) }

This is very useful in the small, in everyday programming, and occasionally useful in the large as well. For example, two of the core concepts of an IDE are references and definitions. A definition, like let foo = 92;, introduces a name for an entity, which can then be used on the following lines. A reference, like foo + 90, refers back to some definition. Ctrl-clicking a reference takes you to its definition.

The natural way to model this in Kotlin is to add interface Definition and interface Reference. The problem is, some things are both:

struct S { field: i32 }

fn process(s: S) {
    match s {
        S { field } => println!("{}", field + 2)
    }
}

Here, the second field is simultaneously a reference to the field: i32 definition and the definition of a local variable named field! Similarly, in

let field = 92;
let s = S { field };

field conceptually holds two references: one to the local variable and one to the field definition.

In IntelliJ Rust, cases like these are usually handled with ad-hoc special-casing. In rust-analyzer, they are handled by an enum that explicitly lists all the special cases.

Rust-analyzer has a lot of enums, so there is also a lot of code that boringly matches N variants and does roughly the same thing for each. This code is more verbose than IntelliJ Rust's special-casing, but it is easier to understand and maintain: you do not need to keep a wider context in your head to know which special cases are possible.
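
A hedged sketch of the shape of such code (invented names, not rust-analyzer's actual types): an enum lists every way an identifier can be classified, and each consumer must say explicitly what it does for each variant, including the "both a reference and a definition" one.

// Hypothetical classification of an identifier in the editor.
enum IdentClass {
    Definition(String),
    Reference(String),
    // Field shorthand: `S { field }` refers to a field *and* names a local.
    FieldShorthand { local: String, field: String },
}

fn goto_targets(class: &IdentClass) -> Vec<&str> {
    match class {
        IdentClass::Definition(name) => vec![name.as_str()],
        IdentClass::Reference(target) => vec![target.as_str()],
        IdentClass::FieldShorthand { local, field } => vec![local.as_str(), field.as_str()],
    }
}

fn main() {
    let class = IdentClass::FieldShorthand {
        local: "field (local)".to_string(),
        field: "S::field".to_string(),
    };
    for target in goto_targets(&class) {
        println!("goto: {target}");
    }
}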

Error handling. When it comes to null safety, Kotlin and Rust are mostly equivalent in practice. There are some finer distinctions between union types and sum types, but in my experience they do not matter in real code. Syntactically, Kotlin's ?. and ?: often feel more convenient.

However, when it comes to error handling (Result<T, E> rather than Option<T>), Rust wins. Having the error path annotated at the call site is very valuable. Encoding errors in a function's return type works well with higher-order functions and makes the code more robust. I am afraid of calling external processes from Kotlin or Python, because that is exactly where exceptions are common, and I invariably forget to handle at least one case.
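
A minimal sketch of what this looks like in Rust (my own example, not code from either project): the failure modes of spawning a process are right there in the return type, and the ? operator makes propagating them explicit at the call site.

use std::process::Command;

// Both "the process could not be spawned" and "it exited with a failure
// status" are visible in the signature; the caller cannot silently forget them.
fn git_head() -> Result<String, Box<dyn std::error::Error>> {
    let output = Command::new("git").args(["rev-parse", "HEAD"]).output()?;
    if !output.status.success() {
        return Err(format!("git failed: {}", output.status).into());
    }
    Ok(String::from_utf8(output.stdout)?.trim().to_string())
}

fn main() {
    match git_head() {
        Ok(rev) => println!("HEAD is {rev}"),
        Err(err) => eprintln!("could not determine HEAD: {err}"),
    }
}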

Fighting the borrow checker

While Rust's types and expressions generally let you state exactly what you want, there are still cases where the borrow checker gets in the way. For example, here we cannot return an iterator that borrows from a temporary: utils.rs.
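
The shape of the problem is roughly the following (a simplified sketch, not the actual utils.rs code): the commented-out version would hand out an iterator borrowing from a value that is about to be dropped, so the borrow checker rejects it; collecting into an owned Vec is one of the workarounds.

// Rejected by the borrow checker: the returned iterator would borrow from
// `text`, which is dropped when the function returns.
//
// fn words(text: String) -> impl Iterator<Item = &str> {
//     text.split_whitespace()
// }

// One workaround: give up on laziness and return owned data.
fn words(text: String) -> Vec<String> {
    text.split_whitespace().map(str::to_string).collect()
}

fn main() {
    for word in words("hello borrow checker".to_string()) {
        println!("{word}");
    }
}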

Problems like this come up a lot when learning Rust, mainly because applying the traditional "soup of pointers" design to Rust simply does not work. Design-level borrow checker errors tend to fade with experience: it is almost always possible to structure software as a tree of components, and that is almost always a good design. The residual borrow checker restrictions are annoying, but not significant in the grand scheme of things.

Concurrency

IntelliJ Rust and rust-analyzer take a similar approach to concurrency: a global read/write lock protects the base application state, and a big thread-safe cache holds derived data.
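
As a rough sketch of that setup (invented types, not either project's actual code): the base inputs live behind an RwLock, and derived results go into a thread-safe cache keyed by input.

use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

type FileId = u32;

// Base state: many readers, occasional writer (e.g. when a file changes).
struct State {
    files: HashMap<FileId, String>,
}

// Base state plus derived data, computed on demand and shared between threads.
struct Analysis {
    state: RwLock<State>,
    line_counts: Mutex<HashMap<FileId, usize>>,
}

impl Analysis {
    fn line_count(&self, file: FileId) -> usize {
        if let Some(&count) = self.line_counts.lock().unwrap().get(&file) {
            return count; // cache hit
        }
        // Cache miss: read the base state and compute the derived value.
        let count = self
            .state
            .read()
            .unwrap()
            .files
            .get(&file)
            .map_or(0, |text| text.lines().count());
        self.line_counts.lock().unwrap().insert(file, count);
        count
    }
}

fn main() {
    let analysis = Arc::new(Analysis {
        state: RwLock::new(State {
            files: HashMap::from([(1, "fn main() {}\n".to_string())]),
        }),
        line_counts: Mutex::new(HashMap::new()),
    });
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let analysis = Arc::clone(&analysis);
            thread::spawn(move || analysis.line_count(1))
        })
        .collect();
    for handle in handles {
        println!("lines: {}", handle.join().unwrap());
    }
}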

This setup is hard to manage in Kotlin. More than once I asked myself "should this field be volatile?" without any clear way to find the answer. To figure out whether something needs to be thread-safe in Kotlin, you have to read the documentation and trace all the ways it is used.

In contrast, "is this type thread-safe?" is a property reflected directly in Rust's type system (via the Send and Sync traits). The compiler automatically derives thread safety and checks that non-thread-safe types are not accidentally shared.

A matching pair of bugs in IntelliJ Rust and rust-analyzer makes a good example. To recap, both use a cache shared between threads. In both projects I designed a clever optimization that, unbeknownst to me, involved putting thread-unsafe data into this shared cache. In IntelliJ Rust, it took a long time just to notice that something was wrong, and even more investigation to track down the root cause. In rust-analyzer, the only time I wasted was the time spent implementing the optimization itself: after I fixed what I thought was the last compilation error, the compiler calmly pointed out that putting a struct which, a couple of fields deep, contains a non-thread-safe type into a structure shared across threads might not be such a good idea.
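
Here is the flavor of that compile-time check, reduced to a toy example (my own sketch, not the actual optimization): sneaking a non-thread-safe Rc into data shared across threads is a compile error, while the Arc version is accepted.

use std::sync::Arc;
use std::thread;

struct Cache {
    // std::rc::Rc is neither Send nor Sync; burying it inside a struct
    // does not hide it from the compiler.
    // data: std::rc::Rc<Vec<u32>>, // with this field, thread::spawn below fails to compile
    data: Arc<Vec<u32>>, // thread-safe shared ownership compiles fine
}

fn main() {
    let cache = Arc::new(Cache { data: Arc::new(vec![1, 2, 3]) });
    let handle = {
        let cache = Arc::clone(&cache);
        // With the Rc field, rustc reports that the cache cannot be sent
        // between threads safely.
        thread::spawn(move || cache.data.iter().sum::<u32>())
    };
    println!("sum: {}", handle.join().unwrap());
}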

Performance

My general experience with IntelliJ Rust was "no matter what I do, it is not as fast as I would like it to be". My experience with rust-analyzer is the opposite: "whatever I do, it is fast enough".

As an anecdote, early on I implemented a fixed-point-iteration name resolution algorithm in rust-analyzer. This kind of algorithm is hostile to an IDE: done naively, it requires a substantial amount of recomputation on every keystroke. When I built rust-analyzer with this change, I finally saw noticeable lag in completion. "That's it," I thought, "I should stop getting away with naive algorithms and start applying some optimizations." Well, it turned out that I had been testing a debug build of rust-analyzer; rebuilding in release mode made the problem go away.

By the way: The fact that debug builds tend to be ridiculously slow is a big problem for Rust.

Having good baseline performance certainly helps productivity: optimizing code for speed often makes it harder to refactor. The longer you can postpone low-level performance tuning (in favor of architecture-level performance work), the less work you end up doing overall.

Predictability of performance

More important, Rust's performance is predictable: if you run a program N times, you get roughly the same result each time. This is in stark contrast to the JVM, where you need a lot of warm-up even to stabilize a microbenchmark. I never managed to get a repeatable macro-benchmark running for IntelliJ Rust.

More generally, without a runtime, a program's behavior varies much less. This makes chasing regressions far more efficient.

Safety

To be clear, there is no difference in memory safety here: neither project has segfaults or heap corruption. Likewise, null pointer dereferences are not an issue in either.

These are the most important advantages of Rust over other systems programming languages, but they are not relevant for applications like these.

Conclusion

I think the unifying theme behind many of these points is "programming in the large". Modularity, build process, and predictability only start to matter as the amount of code, its age, and the number of contributors grow. I like Titus Winters' phrase: "software engineering is programming integrated over time." Rust excels here; it is a scalable language.

What I appreciate even more is that Rust might be a reasonable candidate for a nearly universal language. To quote another famous line (John Carmack's): "the right tool for the job is often the tool you are already using." Switching contexts and gluing different technologies together takes real effort. With Rust, you often do not need to: it scales down naturally to bare-metal programming, and, as discussed in this article, it is also great for application-level code. Rust even works for scripting to some extent! Rust-analyzer's build automation would, in theory, be a better fit for Bash or Python, but in practice Rust works fine and is happily cross-platform.

Finally, I want to reiterate that this case study covers only two projects, which are similar but not twins. Context matters too: it is somewhat unusual for application-level programming not to rely on third-party libraries for the core functionality. So while I think this experience and analysis point qualitatively in the right direction, your quantitative results may be quite different.

