2020 is over, and many excellent Python libraries emerged during the year.
These are not the long-established names such as NumPy, TensorFlow, and pandas, which have been maintained and iterated on for years and are familiar to most Python developers.
The 10 Python libraries introduced in this article are fresh from 2020, and all of them are well maintained.
Without further ado, let's get started!
1. Typer
You probably don’t write CLI applications very often, but when you do, you’re likely to run into a lot of obstacles.
Following the runaway success of FastAPI, tiangolo has brought the same principles to Typer[1]: a new library that lets you write command-line interfaces using the type hints available in Python 3.6+.
This design really makes Typer stand out. Besides keeping your code properly documented, the type hints give you a CLI with input validation for very little extra effort.
And by using type hints, you can get auto-complete in your Python editor (such as VSCode), which will improve your productivity.
To extend its functionality, Typer is built on top of Click, another well-known CLI library. This means that it can leverage all of Click's benefits, community, and plug-ins while letting you start simple with less boilerplate code.
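To make the type-hint idea concrete, here is a minimal sketch of a Typer application. The command name `greet` and its parameters are made up for illustration; the point is that a plain annotated function becomes a validated CLI.

```python
import typer

app = typer.Typer()


@app.command()
def greet(name: str, count: int = 1, formal: bool = False) -> None:
    """Greet NAME one or more times."""
    # 'name' has no default, so Typer treats it as a required argument.
    # 'count' and 'formal' have defaults, so they become --count and --formal.
    greeting = "Good day" if formal else "Hello"
    for _ in range(count):
        typer.echo(f"{greeting}, {name}!")
```

To run it as a script, call `app()` under an `if __name__ == "__main__":` guard at the bottom of the file. Passing something like `--count abc` should then be rejected with a usage error before your function is ever called.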
2. Rich
Continuing with the CLI theme: terminal output tends to be monochrome, which makes it hard to scan and read.
Do you want to add color and style to your terminal output? Print complex tables? Show a nice progress bar? Render Markdown or emojis?
Rich[2] can meet your requirements.
Take a look at the sample screenshots to see what it does.
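As a rough illustration of those features, the sketch below prints styled text, a table, and a progress bar. The table contents and the helper name `build_library_table` are invented for this example.

```python
from rich.console import Console
from rich.progress import track
from rich.table import Table

console = Console()


def build_library_table() -> Table:
    """Build a small two-column table (contents are illustrative)."""
    table = Table(title="Libraries in this article")
    table.add_column("Library", style="cyan")
    table.add_column("Purpose")
    table.add_row("Typer", "Command-line interfaces")
    table.add_row("Rich", "Styled terminal output")
    return table


# Square-bracket markup sets the style; :sparkles: is an emoji shortcode.
console.print("[bold green]Done![/bold green] :sparkles:")
console.print(build_library_table())

# track() wraps any iterable and draws a progress bar while it is consumed.
for _ in track(range(3), description="Working"):
    pass
```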
3. Dear PyGui
Although, as we’ve seen, terminal applications can be nice to look at, sometimes that’s not enough and you need a real GUI.
To this end, Dear PyGui[3], a Python port of the popular Dear ImGui C++ project, was born.
Dear PyGui takes advantage of the so-called immediate mode paradigm popularized by video games.
This means that dynamic GUIs are drawn frame by frame, without persisting any data. This makes this tool fundamentally different from other Python GUI frameworks.
It has high performance and uses a computer’s GPU to facilitate the construction of highly dynamic interfaces that are often needed in engineering, simulation, gaming, or data science applications.
4. PrettyErrors
This is one of those Python libraries that makes you think, “How come no one thought of this before?”
PrettyErrors[4] does only one thing, and it does it well.
In terminals that support color output, it turns the jumble of error messages into something more suitable for our human eyes to parse.
No more struggling to scan the entire screen to locate the error message. Now you can find it at a glance.
5. Diagrams
We programmers like to solve problems and code.
But sometimes, as part of much-needed project documentation, we need to explain complex architectural designs to our colleagues.
Traditionally, we have turned to GUI tools, where we can work on diagrams and visualizations and then put them in presentations and documents.
But that’s not the only way.
Diagrams[5] lets you draw cloud system architecture directly in Python code, without using any design tools.
With just a few lines of code, you can create a brilliant architectural diagram.
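Here is a small sketch of what those few lines can look like. The three-tier layout, the node labels, and the helper name `render_architecture` are illustrative assumptions. Note that Diagrams relies on Graphviz being installed on the machine to do the actual rendering.

```python
from diagrams import Diagram
from diagrams.aws.compute import EC2
from diagrams.aws.database import RDS
from diagrams.aws.network import ELB


def render_architecture(filename: str = "web_service") -> str:
    """Render a three-tier layout to '<filename>.png' and return that path."""
    # show=False stops Diagrams from opening the image in a viewer.
    with Diagram("Web Service", filename=filename, show=False):
        # '>>' draws a directed edge from the left node to the right node.
        ELB("load balancer") >> EC2("web server") >> RDS("database")
    return f"{filename}.png"
```

Calling `render_architecture()` should write the image next to your script, so the diagram can live in version control alongside the code it describes.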
6. Hydra and OmegaConf
When doing research and experimentation on machine learning projects, there are always countless settings to try.
In some applications, configuration management becomes very complex. Having a structured way to deal with this complexity can greatly improve development efficiency.
Hydra[6] is a tool that lets you build configurations in a composable manner and override parts of them from the command line or from configuration files.
(Example project layout: YAML configuration files such as base.yaml, alongside a train_model.py script.)
Another tool, OmegaConf[7], provides a consistent API that serves as the foundation of a hierarchical configuration system, supporting different sources such as YAML, configuration files, objects, and CLI arguments.
7. PyTorch Lightning
Every tool that can increase the productivity of a data science team is invaluable.
There is no reason for someone working on a data science project to reinvent the wheel every time: mulling over how best to organize the project's code, relying on “PyTorch boilerplate” that is hard to maintain, or trading away control for a higher level of abstraction.
Lightning[8] helps improve productivity by decoupling science from engineering. It’s kind of like TensorFlow’s Keras, in the sense that it makes your code cleaner.
However, it doesn’t take control away from you. It’s still PyTorch, and you can use all the usual APIs.
This library helps teams apply good software engineering practices, organizing code around components with clear responsibilities, to build high-quality code that can easily scale to train on multiple GPUs, TPUs, and CPUs.
The library can help junior members of the data science team produce better results, while more experienced members will enjoy it because of the increased overall productivity without giving up control.
8. Hummingbird
Not all machine learning is deep learning. Many times your model consists of more traditional algorithms implemented in scikit-learn (such as random forests), or you use gradient boosting methods such as the popular LightGBM and XGBoost.
However, a lot is happening in deep learning. Frameworks like PyTorch are advancing at a breakneck pace, and hardware is being optimized to run tensor computations faster and with lower power consumption. Wouldn’t it be great if we could harness this work to run our traditional methods faster and more efficiently?
That’s where Hummingbird[9] comes in.
This new library from Microsoft compiles your trained traditional ML models into tensor computations.
This is good news, because it frees you from the need to redesign the model.
As of now, Hummingbird supports conversion to PyTorch, TorchScript, ONNX, and TVM, and covers a variety of ML models and vectorizers.
9. HiPlot
Almost every data scientist has dealt with high-dimensional data at some point in his or her career.
Unfortunately, the human brain isn’t wired to process this kind of data intuitively, so we have to resort to other techniques.
In 2020, Facebook released HiPlot[10], a library that helps discover correlations and patterns in high-dimensional data, using parallel plots and other graphical ways to represent information. The concept is explained in their post, but it is basically a nice, convenient way to visualize and filter high-dimensional data.
HiPlot is interactive and extensible, and you can use it from a standard Jupyter notebook or through its own server.
10. Scalene
As the Python library ecosystem grows more complex, we find ourselves writing more and more code that relies on C extensions and multithreading.
This becomes a problem when measuring performance, because CPython’s built-in profilers don’t handle multithreaded and native code correctly.
That’s when Scalene[11] comes to the rescue.
Scalene is a CPU and memory profiler for Python scripts that handles multithreaded code correctly and distinguishes between time spent running Python code and time spent in native code.
You don’t need to modify your code. You just run your script from the command line with Scalene, and it generates a text or HTML report showing CPU and memory usage for each line of code.
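To see the Python versus native distinction for yourself, you could profile a small workload like the sketch below. The function names and the file name are hypothetical: saved as, say, workload.py, the script would be run with `scalene workload.py`.

```python
def sum_in_python(n: int) -> int:
    """Add the numbers 0..n-1 one by one in the Python interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_in_native(n: int) -> int:
    """Compute the same sum with the built-in sum(), which loops in C."""
    return sum(range(n))


N = 200_000
# Both functions return the same value; only where the time is spent differs.
print(sum_in_python(N) == sum_in_native(N))
```

In the resulting report, the loop body of `sum_in_python` should show up mostly as Python time, while the single line in `sum_in_native` should lean toward native time.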
Conclusion
With good tools, developers can get twice the result with half the effort.
This is especially true for a language like Python, which relies heavily on third-party packages.
Thanks to these excellent tools, the Python ecosystem is in good shape.
Improve your development efficiency with these 10 excellent Python libraries.