• Flask Video Streaming Revisited
  • Miguel Grinberg
  • The Nuggets Translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: zhmhhu
  • Reviewer:

About three years ago, I wrote a post on this blog called Video Streaming with Flask, in which I presented a very practical streaming server that uses a Flask generator view function to stream a Motion JPEG feed to web browsers. My intention in that article was to demonstrate simple, practical streaming responses, a little-known feature of Flask.

The article was popular, not so much because it taught readers how to implement streaming responses, but because so many people want to implement streaming video servers. Unfortunately, my focus when I wrote that article was not on building a robust video server, so I often get questions and requests for advice from readers who try to use the video server for real applications and quickly discover its limitations.

Recap: Video streaming with Flask

I recommend that you read the original article to familiarize yourself with the project. In a nutshell, it is a Flask server that uses a streaming response to provide a stream of video frames captured from a camera in Motion JPEG format. This format is very simple and not the most efficient, but it has the advantage that all browsers support it natively, with no client-side scripting required. For that reason it is a fairly common format used by security cameras. To demonstrate the server, I wrote a camera driver for the Raspberry Pi with its camera module. For those who don't have a Raspberry Pi with a camera, I also wrote an emulated camera driver that streams a sequence of JPEG images stored on disk.
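
For reference, the heart of that server is a streaming response built from a generator view function. The route from the original article looked roughly like this (a condensed sketch; see the original post for the complete version):

from flask import Flask, Response
from camera import Camera

app = Flask(__name__)

def gen(camera):
    """Yield one multipart chunk per video frame."""
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')

@app.route('/video_feed')
def video_feed():
    # multipart/x-mixed-replace tells the browser to keep replacing the
    # image with each new part, which is what produces the video effect
    return Response(gen(Camera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')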

Run the camera only when there are viewers

One thing people did not like about the original streaming server is that the background thread that captures video frames from the Raspberry Pi camera starts when the first client connects to the stream, but then it never stops. A more efficient way to handle this background thread is to have it run only while there are viewers, so that the camera can be turned off when nobody is connected.

I have now implemented this improvement. The idea is that every time a client requests a video frame, the time of that request is recorded. The camera thread checks this timestamp, and exits if it finds that it is more than ten seconds old. With this change, when the server runs for ten seconds without any clients, it shuts its camera off and stops all background activity. As soon as a client connects again, the thread is restarted.

Here’s a brief description of the improvements:

class Camera(object):
    #...
    last_access = 0  # time of last client access to the camera

    #...

    def get_frame(self):
        Camera.last_access = time.time()
        #...

    @classmethod
    def _thread(cls):
        with picamera.PiCamera() as camera:
            #...
            for foo in camera.capture_continuous(stream, 'jpeg', use_video_port=True):
                #...
                # if there hasn't been any client requesting frames in
                # the last 10 seconds, stop the thread
                if time.time() - cls.last_access > 10:
                    break
        cls.thread = None

Simplified camera class

A common problem that many people have mentioned to me is that it is difficult to add support for other cameras. The Camera class I implemented for the Raspberry Pi is fairly complex, because it uses a background capture thread to communicate with the camera hardware.

To make this easier, I decided to move all the generic functionality for the background processing of frames into a base class, leaving only the task of getting frames from the camera to be implemented in subclasses. The new BaseCamera class in module base_camera.py implements this base class. Here is what this generic thread looks like:

class BaseCamera(object):
    thread = None  # Background thread that reads frames from the camera
    frame = None  # current frame is stored here by the background thread
    last_access = 0  # time of last client access to the camera
    #...

    @staticmethod
    def frames():
        """Generator that returns frames from the camera."""
        raise RuntimeError('Must be implemented by subclasses.')

    @classmethod
    def _thread(cls):
        """Camera background thread."""
        print('Starting camera thread.')
        frames_iterator = cls.frames()
        for frame in frames_iterator:
            BaseCamera.frame = frame

            # if there hasn't been any client requesting frames in
            # the last 10 seconds, stop the thread
            if time.time() - BaseCamera.last_access > 10:
                frames_iterator.close()
                print('Stopping camera thread due to inactivity.')
                break
        BaseCamera.thread = None

In this new version, the Raspberry Pi camera thread has been made generic through the use of yet another generator. The thread expects the frames() static method to be a generator, implemented in subclasses that are specific to each camera. Each item returned by the generator must be a video frame in JPEG format.

Here is how an emulated camera that returns still images can be adapted to this base class:

import time
from base_camera import BaseCamera

class Camera(BaseCamera):
    """An emulated camera implementation that streams a repeated sequence of
    files 1.jpg, 2.jpg and 3.jpg at a rate of one frame per second."""
    imgs = [open(f + '.jpg', 'rb').read() for f in ['1', '2', '3']]

    @staticmethod
    def frames():
        while True:
            time.sleep(1)
            yield Camera.imgs[int(time.time()) % 3]

Notice how in this version the frames() generator achieves a rate of one frame per second by simply sleeping between frames.

With this redesign, the Raspberry Pi camera subclass also became much simpler:

import io
import time
import picamera
from base_camera import BaseCamera

class Camera(BaseCamera):
    @staticmethod
    def frames():
        with picamera.PiCamera() as camera:
            # let camera warm up
            time.sleep(2)

            stream = io.BytesIO()
            for foo in camera.capture_continuous(stream, 'jpeg', use_video_port=True):
                # return current frame
                stream.seek(0)
                yield stream.read()

                # reset stream for next frame
                stream.seek(0)
                stream.truncate()

OpenCV camera driver

Many users complained that they do not have a Raspberry Pi with a camera module, so they could not try this server with anything other than the emulated camera. Now that adding camera drivers is much easier, I wanted to also have an OpenCV-based camera that supports most USB webcams and laptop cameras. Here is a simple camera driver for it:

import cv2
from base_camera import BaseCamera

class Camera(BaseCamera):
    @staticmethod
    def frames():
        camera = cv2.VideoCapture(0)
        if not camera.isOpened():
            raise RuntimeError('Could not start camera.')

        while True:
            # Read the current frame
            _, img = camera.read()

            # Encode as a JPEG image and return it
            yield cv2.imencode('.jpg', img)[1].tobytes()

With this class, the first camera reported by your system will be used. If you are using a laptop, that is likely your built-in camera. To use this driver, you need to install the OpenCV bindings for Python:

$ pip install opencv-python

Camera selection

The project now supports three different camera drivers: emulated, Raspberry Pi, and OpenCV. To make it easier to select which driver to use without having to edit the code, the Flask server looks for a CAMERA environment variable to know which class to import. This variable can be set to pi or opencv, and when it is not set, the emulated camera is used by default.

The way this is implemented is fairly generic. Whatever the value of the CAMERA environment variable is, the server expects the driver to live in a module named camera_$CAMERA.py. The server imports this module and then looks for a Camera class in it. The logic is actually quite simple:

from importlib import import_module
import os

# import camera driver
if os.environ.get('CAMERA'):
    Camera = import_module('camera_' + os.environ['CAMERA']).Camera
else:
    from camera import Camera

For example, to start an OpenCV session from bash, you can do the following:

$ CAMERA=opencv python app.py

Using the Windows command prompt, you can do the following:

> set CAMERA=opencv
> python app.py

Performance optimization

Another observation several readers made is that the server consumes a lot of CPU. The reason is that there is no synchronization between the background thread capturing frames and the generators feeding those frames to clients. Both run as fast as they can, regardless of the speed of the other side.

In general it makes sense for the background thread to run as fast as possible, because you want the frame rate to be as high as possible for every client. But you definitely do not want the generators delivering frames to clients to run faster than the camera producing them, because that means sending duplicate frames to the clients. While these duplicates do not cause any problems, they increase CPU and network load without any benefit.

So there needs to be a mechanism by which the generators only deliver original frames to the clients. If the delivery loop inside a generator is faster than the frame rate of the camera thread, then the generator should wait until a new frame is available, pacing itself to match the camera rate. On the other hand, if the delivery loop runs slower than the camera thread, then it should never fall behind when processing frames, and should instead skip frames so that it always delivers the most recent one. Sounds complicated, right?

The solution I wanted here is to have the camera thread signal the generators when a new frame is available. The generators can then block while they wait for the signal before delivering the next frame. Looking through the synchronization primitives, I found that threading.Event is the one that matches this behavior. So basically, each generator should have its own event object, and the camera thread should signal all the active event objects to notify all running generators when a new frame is available. Each generator delivers the frame, resets its event object, and then goes back to wait on it for the next frame.

To avoid having to add event handling logic in the generators, I decided to implement a custom event class that uses the caller's thread id to automatically create and manage a separate event for each client thread. This is, I have to admit, somewhat complex, but the idea came from how Flask's context-local variables are implemented. The new event class is called CameraEvent, and it has wait(), set() and clear() methods. With the support of this class, the rate control mechanism can be added to BaseCamera:

class CameraEvent(object):
    #...

class BaseCamera(object):
    #...
    event = CameraEvent()

    #...

    def get_frame(self):
        """Returns the current frame of the camera."""
        BaseCamera.last_access = time.time()

        # wait for a signal from the camera thread
        BaseCamera.event.wait()
        BaseCamera.event.clear()

        return BaseCamera.frame

    @classmethod
    def _thread(cls):
        #...
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients

            #...

The magic in the CameraEvent class is what enables multiple clients to each wait individually for a new frame. The wait() method uses the current thread id to allocate a separate event object for each client and wait on it. The clear() method resets the event associated with the caller's thread id, so that each generator thread can run at its own speed. The set() method, invoked by the camera thread, signals the event objects assigned to all the clients, and also removes any events that have not been serviced, since that means the clients associated with those events have disconnected and are gone. You can see the implementation of the CameraEvent class in the GitHub repository.
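
If you are curious how such a class can be built before looking at the repository, here is a minimal sketch of a CameraEvent implementation along the lines just described (the 5-second staleness threshold and other details are my own assumptions; the repository version may differ):

import time
import threading

class CameraEvent(object):
    """An Event-like class that signals all active clients when a new
    frame is available (illustrative sketch, not the repository code)."""
    def __init__(self):
        self.events = {}  # maps a client thread id to [Event, last-set time]

    def wait(self):
        """Called from each client's thread to wait for the next frame."""
        ident = threading.get_ident()
        if ident not in self.events:
            # new client: allocate an event and a timestamp for it
            self.events[ident] = [threading.Event(), time.time()]
        return self.events[ident][0].wait()

    def set(self):
        """Called by the camera thread when a new frame is available."""
        now = time.time()
        remove = None
        for ident, event in self.events.items():
            if not event[0].is_set():
                # this client was waiting: signal it and note the time
                event[0].set()
                event[1] = now
            elif now - event[1] > 5:
                # the event has stayed set for over 5 seconds, so the
                # client most likely went away; mark it for removal
                remove = ident
        if remove:
            del self.events[remove]

    def clear(self):
        """Called from each client's thread after a frame was processed."""
        self.events[threading.get_ident()][0].clear()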

To give you an idea of the magnitude of this performance improvement, consider that the emulated camera driver consumed about 96% CPU before this change, because it was constantly sending duplicate frames at a rate much higher than one frame per second. After these changes, the same stream consumes about 3% CPU. In both cases there was a single client watching the stream. The OpenCV driver went from about 45% CPU down to 12% for one client, with each additional client adding about 3% more.

Deploying the Web Server

Finally, I think that if you are going to use this server for real, you should run it on a more robust web server than the development one that comes with Flask. A very good choice is Gunicorn:

$ pip install gunicorn

With Gunicorn, you can run the server as follows (remember to first set the CAMERA environment variable to the camera driver of your choice):

$ gunicorn --threads 5 --workers 1 --bind 0.0.0.0:5000 app:app

The --threads 5 option tells Gunicorn to handle up to five concurrent requests. With that value, up to five clients can watch the video stream at the same time. The --workers 1 option limits the server to a single process. This is required because only one process can connect to the camera to capture frames.

You can increase the number of threads, but if you find that you need a large number of them, it is probably more efficient to use an asynchronous framework instead of threads. Gunicorn can be configured to use the two frameworks that are compatible with Flask: gevent and eventlet. To make the video streaming server work with these frameworks, there is one small addition to the camera background thread:

class BaseCamera(object):
    #...
   @classmethod
    def _thread(cls):
        #...
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients
            time.sleep(0)
            #...

The only change here is the addition of a time.sleep(0) call in the camera capture loop. This is required with both eventlet and gevent, because they use cooperative multitasking. Those frameworks achieve concurrency by having tasks release the CPU whenever they call a function that does network I/O, or explicitly. Since there is no I/O here, the call to the sleep function is what releases the CPU.

Now you can run Gunicorn with the gevent or eventlet workers as follows:

$ CAMERA=opencv gunicorn --worker-class gevent --workers 1 --bind 0.0.0.0:5000 app:app

Here the --worker-class gevent option configures Gunicorn to use the gevent framework (which you must install with pip install gevent). You can use --worker-class eventlet instead if you prefer. As discussed above, --workers 1 limits the server to a single process. The eventlet and gevent workers in Gunicorn allocate a thousand concurrent clients by default, so that is likely more than a server of this kind can realistically support anyway.

Conclusion

All the changes described above are included in the GitHub repository. I hope these improvements make for a better experience.

Before I close, I want to provide quick answers to other questions about this server:

  • How do I run the server at a fixed frame rate? Configure your camera to deliver frames at that rate, and then sleep long enough during each iteration of the streaming loop to also run at that rate. A sketch of this idea appears after this list.

  • How do I increase the frame rate? The server as described here delivers frames as fast as it can. If you need a higher frame rate, you can try configuring your camera for a smaller frame size.

  • How do I add sound? That is really hard. The Motion JPEG format does not support audio. You would need to stream the audio separately, and then add an audio player to the HTML page. Even if you manage all that, synchronization between audio and video will not be very accurate.

  • How do I save the stream to disk on the server? Just save the sequence of JPEG files in the camera thread. For this you may want to remove the mechanism that ends the background thread when there are no viewers.

  • How do I add playback controls to the video player? Motion JPEG does not allow for user interaction, but if you are set on having playback controls, it can be done with a little ingenuity. If the server saves all the JPEG frames, then a pause can be implemented by having the server deliver the same frame over and over. When the user resumes playback, the server will have to deliver "old" frames loaded from disk, since the user would now be in a DVR mode instead of watching the stream live. This could be a very interesting project!
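
As an illustration for the fixed frame rate question above, here is a hypothetical driver based on the emulated camera that paces itself to a target rate by sleeping away the remainder of each frame's time slot (the FPS value and the pacing arithmetic are my own example, not code from the repository):

import time
from base_camera import BaseCamera

class Camera(BaseCamera):
    """Hypothetical emulated camera capped at a fixed frame rate."""
    FPS = 10  # example target rate, in frames per second

    @staticmethod
    def frames():
        period = 1.0 / Camera.FPS
        imgs = [open(f + '.jpg', 'rb').read() for f in ['1', '2', '3']]
        while True:
            start = time.time()
            yield imgs[int(start) % 3]
            # sleep for whatever remains of this frame's time slot
            time.sleep(max(0, period - (time.time() - start)))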

That’s all for this article. If you have any other questions, let us know!

If you find any mistakes in this translation or other areas that could be improved, you are welcome to submit revisions and PRs to the Nuggets Translation Project, for which you can earn corresponding reward points. The permanent link at the top of this article is the MarkDown link to it on GitHub.


The Nuggets Translation Project is a community that translates quality technical articles from around the Internet and shares them on Nuggets. Its content covers Android, iOS, front end, back end, blockchain, products, design, artificial intelligence, and other fields. If you want to see more high-quality translations, please follow the Nuggets Translation Project on its official Weibo account and Zhihu column.