I said I would not write articles this year, but this editor is much better than Zhihu’s, and the formula support is great too! I don’t want to write on Zhihu anymore; I’m going to write about quantum computation here. Anyone interested?! How sweet! Someone give the programmer behind this editor a bonus! All right, back to business.
This is a question that many people ask.
I never thought I could explain it clearly myself. After all, LLVM is not exclusive to Julia; Python and everything else can use it too, so what is so good about Julia? Most of the content in this article is compiled from opinions I heard and read after a recent discussion about 1.0 with some people (like Hong Hong), plus some Googling.
I was also corrected on one point: the claim that Julia is only suitable for scientific computing. After hearing these opinions, I think that if Julia’s ecosystem can be built up well and technically capable developers can be attracted to try it at this stage, a lot of good things may come out of it.
Let’s start with Julia’s first paper
Let’s go back to Julia’s first paper:
Arxiv.org/pdf/1209.51…
At the beginning of the paper, you can see that what Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman actually set out to solve was the general two-language problem. The two-language problem is usually a compromise between ease of use and performance: programmers use an easy-to-use dynamic language to describe the high-level, complex logic of an algorithm, and switch to C or Fortran where performance is critical. This approach works well for some applications, but it has drawbacks. For example, when you write parallel code this way, the engineering complexity grows considerably. And writing vectorized code is unnatural for many problems, and may create intermediate variables that an explicit for loop would have avoided.
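To make the vectorization point concrete, here is a minimal sketch (mine, not from the paper; the function names are made up). In a non-fusing vectorized language, each vectorized step may allocate an intermediate array; in Julia the explicit loop compiles to fast native code, so you are free to just write it:

```julia
# Vectorized style: in a non-fusing vectorized language, a .* x and
# the subsequent .+ b would each allocate an intermediate array.
# (Julia's dot syntax actually fuses into a single loop, so this
# only illustrates the general problem.)
vectorized(a, x, b) = a .* x .+ b

# Explicit loop: no intermediate arrays, and in Julia this is just
# as fast as the vectorized form.
function devectorized(a, x, b)
    y = similar(x)
    for i in eachindex(x)
        y[i] = a[i] * x[i] + b[i]
    end
    return y
end

a, x, b = rand(1000), rand(1000), rand(1000)
vectorized(a, x, b) ≈ devectorized(a, x, b)   # true
```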
Because of the type conversions and memory management that have to be handled at the boundary between the two languages, writing code in both languages can be more complex than writing in either one alone. And if the interface between the two layers of code is not designed well, it can make optimization much harder.
Another route is to speed up an existing dynamic language, such as Python; projects like PyPy have in fact been very successful at this [1]. These projects all try to optimize a language that already exists, which is valuable because existing code benefits directly. But none of them really solves the two-language problem: language design decisions made under the assumption of an interpreter undermine the ability to generate efficient code. As Henry Baker observed of Common Lisp:
…the polymorphic type complexity of the Common LISP library functions is mostly gratuitous, and both the efficiency of compiled code and the efficiency of the programmer could be increased by rationalizing this complexity. [2][3][4]
From the very beginning, Julia’s design considered how to use modern techniques to execute a dynamic language efficiently. As a result, Julia offers the interactivity and dynamism of Python, Lisp, and Ruby, while reaching the performance of statically compiled languages.
Julia’s performance is mainly derived from three features:
- Sufficient type information obtained naturally through multiple dispatch
- Aggressive code specialization on run-time types (compare templates in C++)
- JIT compilation with LLVM
In fact, we can already see that Julia’s speed does not simply come from LLVM, but from the design of the language itself.
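As a small illustration (my own example, not from the paper) of how the first two features combine: a single generic definition is compiled separately for each concrete argument type it is called with, and you can inspect the generated code directly:

```julia
using InteractiveUtils   # provides @code_llvm in scripts (preloaded in the REPL)

f(x) = x + x   # one generic, unannotated definition

f(1)      # triggers a specialization for Int64
f(1.0)    # triggers a separate specialization for Float64

@code_llvm f(1)     # shows a plain integer add
@code_llvm f(1.0)   # shows a plain floating-point add
```

Each call site ends up running statically typed machine code, which is exactly what the three features above work together to achieve.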
In earlier attempts to optimize dynamic languages, researchers had already observed that programs may not actually be as dynamic as their programmers think [5]:
We found that dynamic features are pervasive throughout the benchmarks and the libraries they include, but that most uses of these features are highly constrained…
In this respect, existing language designs may not have found a good balance point. There is a lot of code that could actually be statically typed and executed more efficiently, but the languages were not designed to allow this. Julia’s bet is that the following kinds of “dynamic” are the genuinely useful ones (see the sketch after this list):
- The ability to run code at load time and compile time, which makes build systems and configuration easier
- Treating a generic arbitrary type (Any type) as the only true static type makes it possible to ignore static types when necessary
- Don’t reject well-formed code
- Program behavior determined only by run-time types (i.e., no static overloading)
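A small sketch (assumptions mine, nothing beyond Base) of the first two points: code can run at load time, and an unannotated argument is simply of type Any:

```julia
# Runs when the file is loaded: a table computed once, at load time.
const TABLE = [i^2 for i in 1:10]

# No annotation means g(x::Any); the compiler can still specialize
# for whatever concrete type x has at each call site.
g(x) = 2x

g(3), g(3.0), TABLE[4]   # (6, 6.0, 16)
```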
Julia rejects some features that hinder optimization (such as CLOS [6]) and has the following limitations:
- The type itself is immutable
- The type of a value is immutable during its lifetime
- The environment of a local variable is not reified
- Program code is immutable (note: new code can still be generated and then executed, which is presumably what macros and eval reflect)
- Not all bindings are mutable (constants are allowed)
These limitations allow the compiler to see all uses of a local variable and to analyze local variables using only local information.
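To see what this local analysis buys you, here is a hypothetical check (my own, using the standard @code_warntype tool): because a value’s type cannot change during its lifetime, every variable in a well-written function can be inferred to a concrete type:

```julia
using InteractiveUtils   # provides @code_warntype

function sumto(n::Int)
    s = 0            # s starts as Int and stays Int for its lifetime
    for i in 1:n
        s += i
    end
    return s         # inferred return type: Int64
end

@code_warntype sumto(10)   # every variable shows a concrete type, no Any
```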
I won’t translate the whole paper here. Stefan summarized it on the mailing list: Julia’s performance mainly comes from the following points:
- an expressive parametric type system, allowing optional type annotation
- multiple dispatch using type annotations to select implementations
- dataflow type inference, allowing most expressions to be concretely typed
- careful design of the language and standard library to allow analysis
- aggressive code specialization on run-time types
- just-in-time compilation (using LLVM).
You can see how important the parametric type system and multiple dispatch are as features of the language itself (they even directly shape how Julia code is written).
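A brief sketch (my own example) of these two features working together: parametric types with optional annotations, and methods selected on the types of all arguments:

```julia
# A parametric struct: T is part of the type, so Point{Int64} and
# Point{Float64} are distinct concrete types.
struct Point{T<:Real}
    x::T
    y::T
end

norm2(p::Point) = p.x^2 + p.y^2          # one generic definition

describe(a::Int, b::Int)       = "two ints"
describe(a::Number, b::Number) = "two numbers"

norm2(Point(1, 2))      # 5    — specialized for Point{Int64}
norm2(Point(1.0, 2.0))  # 5.0  — specialized for Point{Float64}
describe(1, 2)          # "two ints"    — dispatch looks at both arguments
describe(1, 2.0)        # "two numbers"
```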
Meanwhile, Stefan commented:
LLVM is great, but it’s not magic. Using it does not automatically make a language implementation fast. What LLVM provides is freedom from doing your own native code generation, as well as a number of standard low-level optimizations. You still have to generate good LLVM code. The LLVM virtual machine is a typed register machine with an infinite number of write-once registers; it’s easier to work with than actual machine code, but not that far removed (which is the whole point).
In fact, this shows how shallow it is to claim that Julia is fast merely because it does codegen through LLVM. And the claim that Julia is nothing new, just a mix of C++, R, and Python, is equally untenable.
To sum up, Julia is the result of restricting some of the dynamism of earlier dynamic languages in search of a better balance point. It is wrong to say that it simply inherits Python’s ease of use, wrong to say that it inherits R, and it does not inherit C++ either. What Julia demonstrates is that we may be able to sacrifice some of the less important dynamic features in exchange for astonishing speed.
Whether such a balance is optimal is a matter of opinion.
Some attempts
So there have in fact been some attempts to challenge C/Fortran:
A BLAS implemented in pure Julia:
Discourse.julialang.org/t/we-can-wr…
An HDF5 implementation in pure Julia:
Github.com/simonster/J…
A JSON implementation in pure Julia (which, according to Hong Hong, could have been done even better):
Github.com/quinnj/JSON…
From this point of view, my earlier understanding was actually not correct: besides more consistent multidimensional arrays (which matter a great deal to physicists; otherwise there would not be so many people still writing Fortran), Julia may have a much wider range of applications. It is not just for scientific computing and machine learning, but for everything that used to require two languages to solve.
Conversely, for problems that a single language could already solve well, such a heavyweight tool may not be the handiest choice. I think this is probably an objective enough description of Julia, and it should also give you a sense of what it is and is not suited for.
And finally, personally, I don’t think Julia is a good fit for complete beginners right now, nor for people who want to learn it to get a job. It is better suited to those who have struggled with the two-language problem in the past.
[1]: C. F. Bolz, A. Cuni, M. Fijalkowski, and A. Rigo. Tracing the meta-level: PyPy’s tracing JIT compiler. In Proceedings of the 4th Workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems, ICOOOLPS ’09, pages 18–25, New York, NY, USA, 2009. ACM.
[2]: H. G. Baker. The Nimble type inferencer for Common Lisp-84. Technical report, Nimble Computer Corporation, 1990.
[3]: R. A. Brooks and R. P. Gabriel. A critique of Common Lisp. In Proceedings of the 1984 ACM Symposium on LISP and Functional Programming, LFP ’84, pages 1–8, New York, NY, USA, 1984. ACM.
[4]: F. Morandat, B. Hill, L. Osvald, and J. Vitek. Evaluating the design of the R language. In J. Noble, editor, ECOOP 2012 Object-Oriented Programming, volume 7313 of Lecture Notes in Computer Science, pages 104–131. Springer Berlin/Heidelberg, 2012.
[5]: M. Furr, J.-h. D. An, and J. S. Foster. Profile-guided static typing for dynamic scripting languages. SIGPLAN Not., 44(10):283–300, Oct. 2009.
[6]: H. G. Baker. CLOStrophobia: its etiology and treatment. SIGPLAN OOPS Mess., 2(4):4–15, Oct. 1991.