How do you build a face recognition unlock feature for a real-life game scene under a variety of constraints? The Yunjia community invited Gao Shulei, a technical product manager at Tencent, to share how he used about 200 lines of code to build this compact face recognition unlock application step by step, from system architecture and hardware selection to system construction. We hope it sparks discussion.
I. Case Overview
1. Background
I helped a friend build a face recognition unlock feature for his real-life game business. It has been running stably for several months and the experience has been good. While staying at home, I took the time to write up how the application was implemented.
The requirement itself is simple to describe, but because there are many constraints, the architecture and component selection needed some care.
2. Deployment effect
Since the game is still online, I won’t post a video of it here. The deployment effect is shown in the following figure:
- After the player discovers the space and enters it, the screen displays a live image of the player in the current scene.
- When the player moves closer to look, the current frame is captured for face recognition, and the watermark subtitle “Authentication” appears on the live image.
- When face authentication fails, the watermark subtitle changes to “Authentication failed”. The subtitle disappears after 2 seconds and the screen returns to its initial state. The player continues to look for clues and can authenticate again.
- When face authentication succeeds, the watermark subtitle changes to “Authentication succeeded”, the door of the safe box pops open, and the game moves on to the next stage.
II. Product requirements
1. Requirement description
As originally stated, the core logic is not complicated.
- Face recognition: authenticate the player through face recognition.
- Lock management: if authentication passes, the door opens; if it fails, the door stays locked.
- Feedback: live video feedback is required, with clear prompts, to improve the player experience.
2. Constraints
This is a business, so practicality and cost matter a great deal, and above all the system must not disturb the game or break the player experience.
- Low cost: low construction cost and low maintenance cost.
- Easy maintenance: low demands on the technical level of maintenance personnel. When hardware or software fails, any assistant should be able to recover the system quickly.
- High reliability: high recognition accuracy, strong fault tolerance, and a low failure rate in continuous operation.
- Limited space: apart from the display screen, electromagnetic lock and safe, the remaining components must fit within 20cm*15cm*15cm.
- Insufficient lighting: the real scene is a small space lit only from the top, with no side light, so the camera needs a long exposure time.
- Universal power supply: only 5V and 12V DC ports are provided.
- Parallel processing: authentication runs in parallel with the feedback display. During authentication, the feedback system must not be interrupted or blocked; otherwise players would notice obvious interruptions and stuttering.
- Weak network environment: because of room partitions and network sharing, network speed is limited and there are sudden delays.
3. Functional design
There are many possible architectures (a comparison of the alternatives is provided at the end of this article). The following describes the solution that finally went online.
(1) Process design
For the flow and effects, see the “Deployment effect” section above.
(2) Configurable content
A. Tencent Cloud key pair
Modify the configuration file to switch between Tencent Cloud accounts (test account/official account).
B. Staff library ID
Modify the configuration file to specify a different staff library (test/formal).
C. Watermark prompt
Replace the corresponding image to change the watermark. Images are used instead of text configuration because an image needs no font support and no display size setting, and it is easy to embed patterns. Because what you see is what you get, the demands on maintenance personnel are low.
D. Shutdown option
You can configure whether the system shuts down automatically after the task is complete. This prepares the game environment for reset and reduces the reset workload.
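The configurable items above can be read from a single JSON file. The following is a minimal sketch, not the author's code: the key names sid, skey, facegroupid and su2HALT appear elsewhere in this article, while the boolean type of su2HALT, the default value and the validation rule are assumptions made for illustration.

```python
import json

# Key names come from the article; the default value and the
# "required keys" rule are assumptions for this sketch.
DEFAULTS = {"su2HALT": False}
REQUIRED = ("sid", "skey", "facegroupid")


def load_config(text):
    """Parse config.json content and check that the required keys are set."""
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    return config


# Example content with placeholder values (not real credentials).
example = '{"sid": "test-id", "skey": "test-key", "facegroupid": "test-group"}'
config = load_config(example)
print(config["facegroupid"], config["su2HALT"])
```

Failing early on a missing key fits the maintenance constraint: an assistant who swaps the test account for the official one gets a clear message instead of a silent authentication failure.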
(3) Operation and maintenance
A. System operation management
When the scenario starts, the system is powered on together with the rest of the scene. After authentication passes, the system shuts down automatically to complete the reset.
B. Troubleshooting
- Hardware and software faults: if the system fails to boot, shows nothing on the display after booting, displays abnormally, or behaves abnormally in some unknown way, replace the Raspberry Pi or the other hardware.
- Network fault: the system runs normally but cannot authenticate. Check the network and the cloud logs to locate and solve the network problem.
- Cloud product abnormality: none has occurred in 4 months of operation, so this can be ignored. If it does happen, contact the cloud after-sales service.
(4) Cost analysis
Hardware cost: 500 ~ 600 yuan.
Spare parts cost: 500 ~ 600 yuan, with spares kept at a 1:1 ratio.
Operation cost: 0 yuan for the cloud service, using the free quota; electricity and network charges are negligible.
III. Technical implementation
1. System architecture
(1) Hardware composition
- Raspberry Pi: terminal main controller
- Camera: video input
- Sensor: ultrasonic ranging
- Display: video output
- Relay: controls the electromagnetic lock
- Electromagnetic lock: controls the door of the safe box
(2) Key features
- Image recognition: use image recognition instead of video streaming to reduce network bandwidth requirements.
- Low recognition requirements: underexposed photos also have a high recognition rate.
- Triggered recognition: players are active in the scene for a long time, so a trigger mode avoids high-frequency authentication and false unlocking, and reduces authentication costs.
- Ranging sensor selection: ultrasonic sensors are mature and cheap (3 yuan), whereas laser sensors are expensive (30 yuan).
- Multi-process: video processing and monitoring/authentication run in two processes to avoid blocking, and inter-process communication provides reliable interaction.
2. System construction
(1) Tencent cloud configuration
A. Register an account
Follow the documentation to obtain the API key
B. Configure face recognition
Visit the console on the official website and choose Create Staff Library > Create Staff > Upload Photo to establish the basis for authentication. The “staff library ID” is the key piece of information: subsequent API calls use it to specify which staff library the face is matched against. Note: since this case identifies only one person, there is no need to match a person ID, so the person ID is not specified.
(2) Raspberry Pi configuration
A. Install the system
Visit www.raspberrypi.org to obtain the image and install it. Note that the desktop version must be installed; otherwise, you need to manage HDMI output separately.
B. Configure the network
On the command line, run “raspi-config” and select “Network Options” to configure the WiFi access point. To fix the IP address, edit the /etc/dhcpcd.conf file and add the following configuration.
# Please refer to your local network plan for details
interface wlan0
static ip_address=192.168.0.xx/24
static routers=192.168.0.1
static domain_name_servers=192.168.0.1 192.168.0.2
C. Install the Tencent Cloud SDK
Refer to the guidance document to install the dependency library that calls Tencent Cloud API.
sudo apt-get install python-pip -y
pip install tencentcloud-sdk-python
D. Install the image processing library
Python 2.7 is installed by default, but the OpenCV library is not. (The download is large and the default source is an overseas site, so it is relatively slow. To switch the Raspberry Pi to a domestic mirror, please search for the method yourself and choose a mirror near you.)
sudo apt-get install libopencv-dev -y
sudo apt-get install python-opencv -y
E. Deploy code
Go to GitHub and copy the src folder to /home/pi/faceid. Edit the configuration in /home/pi/faceid/config.json: you must change the cloud API keys (sid/skey) and the staff library ID (facegroupid) to your own, and adjust the other settings as needed.
F. Configure automatic startup
The program must start automatically from the graphical interface so that the video is output over HDMI to the screen. Edit /home/pi/.config/autostart/faceid.desktop and write the following:
Type=Application
Exec=python /home/pi/faceid/main.py
(3) Hardware wiring
Figure: Raspberry Pi GPIO pinout
A. Camera
- CSI interface
B. Ultrasonic sensor
- TrigPin: BCM-24 / GPIO24
- EchoPin: BCM-23 / GPIO23
- VCC: 5V
- GND: connect to GND
C. Relay
4-pin side, connected to the Raspberry Pi GPIO pins:
- VCC: 5V
- GND/RGND: connect to GND
- CH1: BCM-12 / GPIO12
3-port side, connected to the electromagnetic lock:
- In the initial state, the electromagnetic lock is wired to the normally closed terminal.
- For details about how the relay works, see the relay part of section 3.3.4, “Hardware related”.
(4) Test run
After the above work is completed, power on the system and let it start. Local feedback can be checked on the display screen, and the cloud recognition results can be checked in the system logs.
3. Code logic and related technology
(1) Pseudo code of process
# Monitoring and authentication process - main process
Get application configuration (API ID/key, etc.)
Initialize GPIO pins (ready to control sensor and relay)
Start video management process (auxiliary process)
loop start:
    if ranging does not meet trigger criteria:
        continue
    Communicate with the auxiliary process (capture the current frame, save it to the specified path, and add the "Authentication" watermark)
    Call the cloud API to run face recognition on the frame image
    if recognition succeeds:
        Communicate with the auxiliary process (change watermark to "Authentication succeeded")
        Wait 5 seconds
        Shut down or continue running (as specified by the su2HALT field in config.json)
    else:
        Communicate with the auxiliary process (change watermark to "Authentication failed")
        Wait 2 seconds
        Communicate with the auxiliary process (clear watermark)

# Video management process - auxiliary process
Initialize camera
loop start:
    Take a frame
    Take a message from the inter-process shared queue
    Perform the matching operation (save frame image / apply a watermark / no processing)
    Output frame
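The queue-based interaction between the two processes can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the author's code: the message names are hypothetical, the camera and display steps are left as comments, and it targets Python 3 (where the exception lives in the queue module), although the article installs Python 2.7.

```python
import multiprocessing
import queue  # only needed for the queue.Empty exception

# Hypothetical message names; the article only says that the main process asks
# the video process to save a frame or to change/clear the watermark.
SAVE_FRAME, SET_WATERMARK, CLEAR_WATERMARK = "save", "watermark", "clear"


def handle_messages(msg_queue, state):
    """Drain pending messages without blocking and update the display state."""
    while True:
        try:
            kind, payload = msg_queue.get_nowait()
        except queue.Empty:
            return state
        if kind == SAVE_FRAME:
            state["save_to"] = payload
        elif kind == SET_WATERMARK:
            state["watermark"] = payload
        elif kind == CLEAR_WATERMARK:
            state["watermark"] = None


def video_loop(msg_queue, frame_count):
    """Stand-in for the video process: one iteration per camera frame."""
    state = {"watermark": None, "save_to": None}
    for _ in range(frame_count):
        state = handle_messages(msg_queue, state)
        # A real loop would read a frame here, save it if requested,
        # apply the current watermark, and show the result.
        state["save_to"] = None


def run_demo():
    """Start the video process and send it two messages from the main process."""
    msg_queue = multiprocessing.Queue()
    worker = multiprocessing.Process(target=video_loop, args=(msg_queue, 100))
    worker.start()
    msg_queue.put((SAVE_FRAME, "/tmp/frame.jpg"))
    msg_queue.put((SET_WATERMARK, "authenticating"))
    worker.join()
```

Because the video loop never waits on the queue, a slow cloud call in the main process cannot stall the live image, which is the point of the parallel processing constraint. On platforms that spawn rather than fork, run_demo() should be called under an `if __name__ == "__main__":` guard.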
(2) Video and recognition
A. Live video
As shown in the pseudocode above, real-time video is displayed by processing frames one at a time and outputting them continuously.
B. Trigger identification
When the distance sensor confirms that an object is near and the distance changes by less than 2cm within 0.3 seconds, the system treats the player as ready to be authenticated. It then waits another 0.3 seconds before capturing the image frame. The extra delay is needed because a person who has just stopped still turns and makes small adjustments. With insufficient lighting (see the constraints above), a frame taken immediately would be blurred, so the delay ensures a stable capture.
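The trigger rule can be expressed as a small function over recent distance samples. This is a sketch of the rule as described, not the author's implementation: the 2cm tolerance and 0.3 second window come from the text, while the 50cm "near" threshold and the sample format are assumptions.

```python
def is_steady(samples, now, max_distance_cm=50.0, window_s=0.3, tolerance_cm=2.0):
    """Return True when an object is near and has held still for window_s.

    samples is a list of (timestamp_s, distance_cm) pairs, oldest first.
    max_distance_cm is an assumed threshold; the article does not state one.
    """
    if not samples or now - samples[0][0] < window_s:
        return False  # not enough history to cover the whole window
    recent = [d for t, d in samples if now - t <= window_s]
    if len(recent) < 2:
        return False
    if max(recent) > max_distance_cm:
        return False  # object is not near
    return max(recent) - min(recent) < tolerance_cm


steady = [(0.0, 30.0), (0.1, 30.5), (0.2, 31.0), (0.3, 30.8)]
approaching = [(0.0, 40.0), (0.1, 36.0), (0.2, 33.0), (0.3, 30.0)]
print(is_steady(steady, now=0.3), is_steady(approaching, now=0.3))
```

Once the function returns True, the caller would wait the additional 0.3 seconds (for example with time.sleep(0.3)) before asking the video process to capture the frame.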
C. Face recognition
Please refer to the documentation.
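For orientation, the request and the result check might look like the sketch below. It is an assumption that the author used the face search (SearchFaces) interface of Tencent Cloud's face recognition service; the field names follow that interface as I understand it, and the score threshold of 80 is a hypothetical value, so check both against the current documentation.

```python
import base64
import json


def build_search_request(image_bytes, group_id):
    """Build a JSON request body that searches one staff library for a face."""
    return json.dumps({
        "GroupIds": [group_id],  # staff library ID (facegroupid in config.json)
        "Image": base64.b64encode(image_bytes).decode("ascii"),
        "MaxPersonNum": 1,       # only one person is enrolled in this case
    })


def is_authenticated(response, min_score=80.0):
    """Return True if any candidate in the parsed response reaches min_score.

    min_score is a hypothetical threshold, not a value from the article.
    """
    for result in response.get("Results", []):
        for candidate in result.get("Candidates", []):
            if candidate.get("Score", 0.0) >= min_score:
                return True
    return False
```

With the Tencent Cloud Python SDK, the JSON string would typically be loaded into a request object with from_json_string and the reply converted back with to_json_string before being passed to is_authenticated. Keeping the payload and the decision in plain functions also lets them be tested without network access.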
(3) Image watermarking
A. Watermarking principle
OpenCV provides a variety of image processing functions, such as drawing on an image (adding text to an image) and operations between images (addition, subtraction, multiplication, division and bit operations). Combining them gives effects such as text on a base image, an image on a base image, and mask processing. This case uses mask processing based on bit operations.
B. Watermarked images
To make maintenance and updates easier, this case uses images as the watermark source, which avoids font library constraints and adds flexibility. It is easy to add graphics to the watermark, and the watermark size is defined directly by the image resolution (WYSIWYG). The default watermark image is black characters on a white background.
C. Watermark processing logic
To make the watermark appear to float, the black area of the watermark image is treated as transparent when it is superimposed on the original frame. Because the characters are transparent, their color changes with the underlying video, which makes the effect more noticeable. The source code is as follows:
import cv2

# img1 is the current video frame (base image), img2 is the watermark image
def addpic(img1, img2):
    # Region of interest (ROI): the part of the base image the watermark will cover
    rows, cols = img2.shape[:2]
    roi = img1[:rows, :cols]
    # Convert the watermark to grayscale so it can be thresholded
    img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    # Build the mask: light pixels (above 220) become 255, dark pixels become 0
    ret, mask = cv2.threshold(img2gray, 220, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)
    # Keep the base image only where the watermark is dark (the characters),
    # then add the watermark so its light background fills the rest of the ROI
    img1_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
    dst = cv2.add(img1_bg, img2)
    img1[:rows, :cols] = dst
    return img1
Schematic diagram of the watermark effect (the diagram enlarges the watermark area to highlight the effect; in the actual application the watermark area is small)
(4) Hardware related
A. Ultrasonic ranging
The ultrasonic sensor has 4 pins: VCC, Trig, Echo and GND. A high-level pulse longer than 10μs on the Trig pin makes the sensor emit an ultrasonic wave. After the reflected wave is received, the Echo pin outputs a continuous high level whose duration equals the time from sending the wave to receiving it. That is: ranging result (m) = high-level duration of the Echo pin (s) * 340 (m/s) / 2
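The conversion from echo time to distance is a one-line calculation. The sketch below only illustrates the formula; the function name is hypothetical, and the duration would come from timing the Echo pin on the real hardware.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # value used in the article


def echo_to_distance_m(echo_high_seconds):
    """Convert the Echo pin's high-level duration to a one-way distance in metres."""
    # The wave travels to the object and back, so the round trip is halved.
    return echo_high_seconds * SPEED_OF_SOUND_M_PER_S / 2.0


# A 2 ms echo corresponds to roughly a third of a metre.
print(echo_to_distance_m(0.002))
```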
B. Relay
The 5V relay module has wiring on two sides: one side for power supply and signal (4 pins, compatible with 3.3V signals) and the other for switching the circuit (3 ports). On the switching side the relay acts as a single-pole double-throw switch, and the level (high or low) of the CH1 pin on the power and signal side controls which way the pole is thrown. In this installation, the power supply of the electromagnetic lock is connected to the normally closed terminal of the relay by default. When the signal is given, the relay switches to the normally open terminal, and the electromagnetic lock loses power and unlocks.
C. GPIO
GPIO (general-purpose input/output) provides connections to other hardware in the form of pins. The Raspberry Pi 3B+ has a 40-pin GPIO header (see the reference figure in section 3.2.3, “Hardware wiring”). On Raspbian, the official Raspberry Pi operating system, you can operate the pins from Python with the RPi.GPIO library, which is installed by default.
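Driving the relay from a GPIO pin can be sketched as follows. The code takes the GPIO module as a parameter so that it also runs away from the Raspberry Pi. FakeGPIO is a stand-in written for this example, and the assumption that a high level energizes the relay should be checked against the actual module, since some relay boards trigger on a low level.

```python
RELAY_PIN = 12  # BCM-12, the CH1 pin from the wiring section


class FakeGPIO(object):
    """Stand-in that mimics the few RPi.GPIO names used below."""
    BCM, OUT, HIGH, LOW = "BCM", "OUT", 1, 0

    def __init__(self):
        self.mode = None
        self.levels = {}

    def setmode(self, mode):
        self.mode = mode

    def setup(self, pin, direction, initial=0):
        self.levels[pin] = initial

    def output(self, pin, level):
        self.levels[pin] = level


def init_relay(gpio, pin=RELAY_PIN):
    """Configure the pin as an output and leave the lock powered (locked)."""
    gpio.setmode(gpio.BCM)
    gpio.setup(pin, gpio.OUT, initial=gpio.LOW)


def unlock(gpio, pin=RELAY_PIN):
    """Energize the relay (assumed active-high) so the lock loses power."""
    gpio.output(pin, gpio.HIGH)


gpio = FakeGPIO()
init_relay(gpio)
unlock(gpio)
print(gpio.levels[RELAY_PIN])
```

On the Raspberry Pi the same functions would be called with the real module, for example `import RPi.GPIO as GPIO` followed by `init_relay(GPIO)` and `unlock(GPIO)`.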
IV. Introduction of other schemes
1. Comparison of scheme selection
The core of the design is the face authentication module, which directly affects cost and stability. The scheme above was finally chosen because it balances cost, maintainability and reliability. Several alternative face recognition schemes were also considered:
(1) Local recognition scheme A
Use ESP-EYE and do everything on the chip, relying on ESP-IDF and ESP-WHO, with development in C. Low hardware cost (module cost 189*2), high development and maintenance cost (C development). Problem: configuration updates and troubleshooting are difficult. Suitable for mass deployment scenarios.
(2) Local recognition scheme B
Run face recognition directly on the Raspberry Pi. This is a mature scheme with plenty of open source code. Medium hardware cost, low development cost, high maintenance cost. Problem: the Raspberry Pi is heavily loaded. Even with a frame-skipping algorithm, the frame rate stays below 20fps and the stutter is obvious. Further tuning is limited by my own experience, and it might be difficult to keep the system running stably in the long term.
(3) Local recognition scheme C
Use a BM1880 edge computing development board or another image processing board. These have a good reputation in the community and come with framework support. Problem: high hardware cost (module cost 1000*2), high development and maintenance cost (C development). A compute stick needs an x86_64 machine as the base platform, so the cost saving is limited and the complexity is unchanged. Suitable for scenarios that need to extend capabilities.
(4) Cloud recognition scheme A
Use Tencent Cloud video intelligent analysis products to simplify the terminal architecture: a Raspberry Pi Zero pushes the video stream to the cloud (the implementation will be published later) and receives the recognition results, with support for features such as high-frequency repeated retrieval. Low deployment cost (terminal video module 150 yuan), low operation cost (currently 0.28 yuan/minute; at 20 minutes per session, a single game costs 5.6 yuan). Problem: it depends heavily on network stability, and stream interruptions and similar conditions hurt the experience. Under the network constraints of this case the result would suffer, so it is better suited to scenarios with good network conditions and high-frequency retrieval.
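The per-game cost quoted for the streaming scheme is simple arithmetic, shown here as a small sketch. The price and session length are the figures from the text; the monthly volume of 300 games is a made-up number used only to illustrate scaling.

```python
PRICE_YUAN_PER_MINUTE = 0.28  # price quoted in the article
MINUTES_PER_GAME = 20         # session length used in the article


def streaming_cost_yuan(games):
    """Cloud streaming cost for a number of games, rounded to 0.01 yuan."""
    return round(games * MINUTES_PER_GAME * PRICE_YUAN_PER_MINUTE, 2)


print(streaming_cost_yuan(1))    # one game
print(streaming_cost_yuan(300))  # hypothetical monthly volume
```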