This article contains 4,706 words and takes about 9 minutes to read.

This article introduces Streamlit, an application framework for machine learning engineers.

Build a semantic search engine with real-time neural network inference in 300 lines of Python.

In my experience, every non-trivial machine learning project eventually ends up held together by internal tools that are full of bugs and hard to maintain. Often cobbled together from Jupyter notebooks and Flask apps, these tools are difficult to deploy, require reasoning about client-server architecture, and do not integrate well with machine learning constructs such as TensorFlow GPU sessions.

Autonomous robotics projects at Carnegie Mellon, Berkeley, Google X, and Zoox have all exhibited this pattern. These tools usually start out as small Jupyter notebooks: sensor calibration tools, simulation comparison apps, lidar alignment apps, scene replay tools, and more.

As a tool grows in importance, project managers step in. Requirements grow, these one-off projects sprout into sprawling collections of scripts, and maintenance becomes a nightmare.

The ad-hoc application build flow of a machine learning engineer.

When tools become critical enough, a tools team is brought in. Its members write fluent Vue and React and deck their laptops with stickers about declarative frameworks. Their design process looks like this:

The tools team's application build flow.

It's a great process. But these tools need new features, like weekly releases, and the tools team has ten other projects. "We'll get back to your tool in two months," they say.

So you go back to building your own tools: deploying Flask apps, writing HTML, CSS, and JavaScript, and trying to version-control everything from notebooks to stylesheets. That's when my old Google X friend Thiago Teixeira and I started thinking: what if building tools were as easy as writing a Python script?

We want machine learning engineers to be able to create great apps without needing a tools team. These internal tools should be a natural byproduct of the machine learning workflow. Writing such a tool should feel like training a neural network or performing an ad-hoc Jupyter analysis! At the same time, we want to preserve the flexibility of a powerful application framework. We want to create beautiful, performant tools that engineers can show off. Basically, the idea is this:

Streamlit application build flow

With the help of a wonderful beta community that included engineers from Uber, Twitter, Stitch Fix, and Dropbox, we spent a year building Streamlit, a completely free and open-source application framework for machine learning engineers. With each prototype, Streamlit's core principles became simpler and purer. They are:

#1: Embrace Python scripting. Streamlit apps are really just scripts that run top to bottom, with no hidden state. You can factor your code with function calls. If you know how to write Python scripts, you can write Streamlit apps. For example, this writes to the screen:

import streamlit as st
st.write('Hello, world!')

Nice to meet you

#2: Treat widgets as variables. There are no callback functions in Streamlit! Each interaction is just a top-down rerun of the script. This approach results in really clean code:

import streamlit as st

x = st.slider('x')
st.write(x, 'squared is', x * x)

An interactive Streamlit application in three lines of code.

#3: Reuse data and computation. What if you download lots of data or perform an expensive computation? The key is to safely reuse that information across reruns. Streamlit introduces a caching primitive that acts like a persistent, by-default-immutable data store, letting Streamlit apps safely and easily reuse information. For example, this code downloads data from the Udacity self-driving-car project (https://github.com/udacity/self-driving-car) only once, resulting in a simple, fast app:

import streamlit as st
import pandas as pd

# Reuse this data across runs!
read_and_cache_csv = st.cache(pd.read_csv)

BUCKET = "https://streamlit-self-driving.s3-us-west-2.amazonaws.com/"
data = read_and_cache_csv(BUCKET + "labels.csv.gz", nrows=1000)
desired_label = st.selectbox('Filter to:', ['car', 'truck'])
st.write(data[data.label == desired_label])

Use st.cache to persist data across Streamlit reruns. To run this code, follow these instructions: https://gist.github.com/treuille/c633dc8bc86efaa98eb8abe76478aa81

The output of the st.cache example above.
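Conceptually, st.cache behaves like memoization keyed on a function's inputs: the first call with a given set of arguments computes and stores the result, and later calls with the same arguments return the stored copy. Here is a toy sketch of that behavior in plain Python (an illustration only, not Streamlit's actual implementation, which also hashes the function's code and guards against mutation):

```python
import functools

def cache(func):
    """Toy model of st.cache: remember results by argument values."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:      # first call: compute and store
            store[args] = func(*args)
        return store[args]         # later calls: reuse the stored result
    return wrapper

calls = []

@cache
def expensive_load(n):
    calls.append(n)                # count how often we actually compute
    return list(range(n))

first = expensive_load(5)
second = expensive_load(5)         # a "script rerun" hits the cache
```

After both calls, the expensive body has run only once, which is exactly why wrapping pd.read_csv with st.cache makes the app above fast.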

In summary, here’s how Streamlit works:

1. Run the entire script from scratch for each user interaction.

2. Streamlit assigns each variable its up-to-date value given the current widget states.

3. Caching allows Streamlit to skip redundant data extraction and computation steps.

In pictures, it looks like this:

A user event triggers Streamlit to rerun the script from scratch. Only the cache persists across runs.
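The three steps above can be simulated in a few lines of plain Python (a toy model for intuition only; Streamlit's real machinery is more involved):

```python
cache_store = {}                  # persists across "reruns", like st.cache

def cached_square(x):
    # recompute only when no cached entry exists for this input
    if x not in cache_store:
        cache_store[x] = x * x
    return cache_store[x]

def run_script(widgets):
    # the whole script reruns top-down on every user interaction;
    # the widget is just a variable holding its latest value
    x = widgets["x"]
    return f"{x} squared is {cached_square(x)}"

# three user interactions: slider moved to 2, then 3, then back to 2
outputs = [run_script({"x": v}) for v in (2, 3, 2)]
```

Note that the third interaction produces its output without recomputing anything: the script still runs top to bottom, but the cache already holds the answer.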

If it sounds good, try it now! Just run:

$ pip install --upgrade streamlit
$ streamlit hello

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501

Network URL: http://10.0.1.29:8501

A browser tab pointing to your local Streamlit app should pop up automatically. If it doesn't, click the link.

To see more fractal animations like this, run streamlit hello from the command line.

Still playing with the fractals? They can be mesmerizing.

These ideas are simple, but that doesn't stop you from creating very rich and useful apps with Streamlit. During my time at Zoox and Google X, I watched self-driving-car projects rapidly balloon to gigabytes upon gigabytes of visual data that needed to be searched and analyzed, including running models on images to compare performance. Every self-driving-car project I know of eventually ends up with an entire team dedicated to tooling.

Building such a tool in Streamlit is easy. This Streamlit demo performs semantic search across the entire Udacity autonomous-vehicle photo dataset, visualizes human-annotated ground-truth labels, and runs a complete neural network (YOLO) in real time inside the app [1].

This 300-line Streamlit demo combines semantic visual search with interactive neural network inference.

The entire app is a completely self-contained 300-line Python script, most of which is machine learning code. In fact, there are only 23 Streamlit calls in the whole app. Try it yourself now!

$ pip install --upgrade streamlit opencv-python
$ streamlit run https://raw.githubusercontent.com/streamlit/demo-self-driving/master/app.py

While working with the machine learning team on their project, we realized that these simple ideas have many obvious advantages:

Streamlit apps are pure Python files, so you can use your favorite editor and debugger with them.

My favorite Streamlit application layouts are VSCode (left) and Chrome (right).

Pure Python scripts work seamlessly with Git and other source control software, including commits, pull requests, issues, and comments. Because Streamlit's underlying language is pure Python, you get all of these collaboration tools for free.

Because Streamlit applications are just Python scripts, version control can be done easily with Git.

Streamlit provides an immediate-mode live coding environment. Just click Always Rerun when Streamlit detects a change to the source file.

Click Always Rerun to enable live coding.

Caching simplifies setting up computation pipelines. Amazingly, chaining cached functions automatically creates an efficient computation pipeline! Here is code adapted from the Udacity demo:

import streamlit as st
import pandas as pd

@st.cache
def load_metadata():
    DATA_URL = "https://streamlit-self-driving.s3-us-west-2.amazonaws.com/labels.csv.gz"
    return pd.read_csv(DATA_URL, nrows=1000)

@st.cache
def create_summary(metadata, summary_type):
    one_hot_encoded = pd.get_dummies(metadata[["frame", "label"]],
                                     columns=["label"])
    return getattr(one_hot_encoded.groupby(["frame"]), summary_type)()

# Piping one st.cache function into another forms a computation DAG.
summary_type = st.selectbox("Type of summary:", ["sum", "any"])
metadata = load_metadata()
summary = create_summary(metadata, summary_type)
st.write('## Metadata', metadata, '## Summary', summary)

A simple computation pipeline in Streamlit. To run this code, follow these instructions: https://gist.github.com/treuille/ac7755eb37c63a78fac7dfef89f3517e.

In short, the pipeline is load_metadata → create_summary. Every time the script is rerun, Streamlit only recomputes whichever subset of the pipeline is required to get the right answer.
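To see why only part of the pipeline reruns, consider a stripped-down model of two chained cached functions: hypothetical stand-ins for load_metadata and create_summary, with the caching hand-rolled for illustration rather than using the real @st.cache.

```python
cache_store = {}
load_calls = []
summary_calls = []

def cached(func):
    # hand-rolled memoization standing in for @st.cache
    def wrapper(*args):
        key = (func.__name__, args)
        if key not in cache_store:
            cache_store[key] = func(*args)
        return cache_store[key]
    return wrapper

@cached
def load_metadata():
    load_calls.append(1)                     # expensive download: happens once
    return (("frame1", "car"), ("frame2", "truck"), ("frame2", "car"))

@cached
def create_summary(metadata, summary_type):
    summary_calls.append(summary_type)       # cheap step, cached per input
    labels = [label for _, label in metadata]
    if summary_type == "sum":
        return {lbl: labels.count(lbl) for lbl in set(labels)}
    return {lbl: True for lbl in set(labels)}

# the user toggles the selectbox: "sum", then "any", then back to "sum"
for summary_type in ("sum", "any", "sum"):
    summary = create_summary(load_metadata(), summary_type)
```

Across three "reruns", the download happens once, the summary is computed once per distinct selectbox value, and the third interaction is served entirely from cache: the chained caches form a computation DAG for free.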

To keep apps performant, Streamlit only recomputes whatever is needed to update the UI.

Streamlit was built for GPUs. Streamlit allows direct access to machine-level primitives such as TensorFlow and PyTorch, and complements these libraries. For example, this demo uses Streamlit's cache to store NVIDIA's celebrity-face GAN [2], enabling nearly instantaneous inference as the user updates the sliders.

This Streamlit app demonstrates NVIDIA's celebrity-face GAN [2] using Shaobo Guan's TL-GAN [3].

Streamlit is a free, open-source library rather than a proprietary web app. You can serve Streamlit apps on-premises without contacting us, and you can even run Streamlit locally on a laptop with no Internet connection! Existing projects can also adopt Streamlit incrementally.

There are several ways to use Streamlit. (Icons provided by fullvector/Freepik.)

This only scratches the surface of what Streamlit can do. What's most exciting is how easily these primitives compose into complex apps that still read like scripts.

Streamlit component block diagram. More features coming soon!
