I. Project description

There is not much information on the Internet about calling the ArcSoft SDK from Python, and almost none about the ArcSoft 4.0 SDK specifically. I pieced this together from scattered sources plus my own study and experimentation, and finally got it working.

First, let me explain what we are building. I have an RTSP video stream, and I want to use the ArcSoft SDK to capture every face that appears under the camera, saving both the face crop and the full background frame locally. At the same time, the whole face recognition process should be visible in real time in a local window. Finally, captures are deduplicated by FaceID based on quality score, keeping only the best frame per face, which greatly reduces the number of duplicate face pictures saved.

Time to get to work!

II. Environment preparation

  • Development Environment: PyCharm 2018.3.5 (Professional Edition)
  • Python version: Python 3.7.3
  • OpenCV: a Python library used here for video frame capture, image processing, and display
  • ArcSoft SDK: Windows x64 C++ V4.0

As the list shows, the preparation is minimal, and the Python side is straightforward as long as you are familiar with the language.
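Before going further, a quick sanity check of the environment (a small sketch, assuming opencv-python is installed; note the SDK DLLs are x64, so a 64-bit Python is required):

import platform
import struct
import cv2

print(platform.python_version())   # expect 3.7.x
print(struct.calcsize("P") * 8)    # expect 64, to match the x64 SDK DLLs
print(cv2.__version__)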

III. Interface and type mapping

This section maps the types and interfaces of the ArcSoft SDK 4.0 into Python. For our task, only the following interfaces are needed:

  1. ASFOnlineActivation [online activation]
  2. ASFInitEngine [engine initialization]
  3. ASFDetectFaces [face detection]
  4. ASFProcess [optional]
  5. ASFGetAge [optional]
  6. ASFGetGender [optional]
  7. ASFImageQualityDetect [quality detection]

Next, we map the types, constants, and so on that these interfaces depend on.

3.1 Type Mapping

Type mapping is done first, because interface mapping depends on these types.

File: asf_struct.py

from ctypes import *

# Face bounding rectangle
class MRECT(Structure):
    _fields_ = [
        ('left', c_int32),
        ('top', c_int32),
        ('right', c_int32),
        ('bottom', c_int32)]

# Face data blob
class ASFFaceDataInfo(Structure):
    _fields_ = [
        ('data', c_void_p),
        ('dataSize', c_int32)]

# Single face information
class ASFSingleFaceInfo(Structure):
    _fields_ = [
        ('faceRect', MRECT),
        ('faceOrient', c_int32),
        ('faceDataInfo', ASFFaceDataInfo)]

# Multiple face information
class ASFMultiFaceInfo(Structure):
    _fields_ = [
        ('faceRect', POINTER(MRECT)),
        ('faceOrient', POINTER(c_int32)),
        ('faceNum', c_int32),
        ('faceID', POINTER(c_int32)),
        ('wearGlasses', POINTER(c_float)),
        ('leftEyeClosed', POINTER(c_int32)),
        ('rightEyeClosed', POINTER(c_int32)),
        ('faceShelter', POINTER(c_int32)),
        ('faceDataInfoList', POINTER(ASFFaceDataInfo))]

# Age information
class ASFAgeInfo(Structure):
    _fields_ = [
        ('ageArray', c_void_p),
        ('num', c_int32)]

# Gender information
class ASFGenderInfo(Structure):
    _fields_ = [
        ('genderArray', c_void_p),
        ('num', c_int32)]

Quite simple, really. If you want to use other capabilities later, just add the corresponding types yourself.
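To get a feel for how these structures behave once the engine has filled them in, here is a minimal illustrative sketch (the function name is made up for this example) that walks the pointer fields of an ASFMultiFaceInfo populated by ASFDetectFaces:

from asf_struct import ASFMultiFaceInfo

def dump_faces(faces):
    # faceRect and faceID are C pointers with faceNum valid elements;
    # ctypes lets us index them like Python lists
    for i in range(faces.faceNum):
        r = faces.faceRect[i]
        print("face %d: id=%d rect=(%d, %d, %d, %d)"
              % (i, faces.faceID[i], r.left, r.top, r.right, r.bottom))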

3.2 Constant Definition

These are constants from the ArcSoft SDK that will be used later.

File: asf_common.py

from ctypes import *
from enum import Enum

face_dll = CDLL("libarcsoft_face.dll")
face_engine_dll = CDLL("libarcsoft_face_engine.dll")

# ==================== constant definitions ====================
ASF_DETECT_MODE_VIDEO = 0x00000000  # Video stream detection mode
ASF_DETECT_MODE_IMAGE = 0xFFFFFFFF  # Image detection mode
ASF_NONE = 0x00000000               # No attribute
ASF_FACE_DETECT = 0x00000001        # Face detection; whether it tracks or detects is determined by the detect mode
ASF_FACERECOGNITION = 0x00000004    # Face feature extraction
ASF_AGE = 0x00000008                # Age
ASF_GENDER = 0x00000010             # Gender
ASF_FACE3DANGLE = 0x00000020        # 3D angle
ASF_FACELANDMARK = 0x00000040       # Forehead region detection
ASF_LIVENESS = 0x00000080           # RGB liveness
ASF_IMAGEQUALITY = 0x00000200       # Image quality detection
ASF_IR_LIVENESS = 0x00000400        # IR liveness
ASF_FACESHELTER = 0x00000800        # Face occlusion
ASF_MASKDETECT = 0x00001000         # Mask detection
ASF_UPDATE_FACEDATA = 0x00002000    # Face information update
ASVL_PAF_RGB24_B8G8R8 = 0x201       # Image format

# Face angle detection priority - enumeration
class ArcSoftFaceOrientPriority(Enum):
    ASF_OP_0_ONLY = 0x1,        # Upright (0°) direction only
    ASF_OP_90_ONLY = 0x2,       # 90° counterclockwise from 0°
    ASF_OP_270_ONLY = 0x3,      # 270° counterclockwise from 0°
    ASF_OP_0_HIGHER_EXT = 0x5,  # All angles
# ==========================================================

If you have called DLLs from Python before, none of this needs much explanation.
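One quirk worth noting: the Enum members above end with trailing commas, so each member's .value is a one-element tuple, which is why later code unpacks it with ASF_OP_0_ONLY.value[0]. Capability flags are plain bit flags combined with bitwise OR. A quick illustrative check:

import asf_common

# Capability flags are bit flags; combine them with bitwise OR
mask = asf_common.ASF_FACE_DETECT | asf_common.ASF_IMAGEQUALITY
print(hex(mask))  # 0x201

# The trailing commas in the Enum definition make each value a 1-tuple,
# so the integer has to be unpacked with [0]
orient = asf_common.ArcSoftFaceOrientPriority.ASF_OP_0_ONLY.value
print(orient, orient[0])  # (1,) 1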

3.3 Interface Mapping

With all the types and constants defined, we can now map the interfaces we need into Python.

File: asf_func.py

from asf_struct import *
from ctypes import *
import asf_common

# ==================== API interface mappings ====================
# Online activation API
online_activate = asf_common.face_engine_dll.ASFOnlineActivation
online_activate.restype = c_int32
online_activate.argtypes = (c_char_p, c_char_p, c_char_p)

# Engine initialization API
init_engine = asf_common.face_engine_dll.ASFInitEngine
init_engine.restype = c_int32
init_engine.argtypes = (c_long, c_int32, c_int32, c_int32, POINTER(c_void_p))

# Face detection API
detect_face = asf_common.face_engine_dll.ASFDetectFaces
detect_face.restype = c_int32
detect_face.argtypes = (c_void_p, c_int32, c_int32, c_int32, POINTER(c_ubyte), POINTER(ASFMultiFaceInfo))

# Process API (age/gender attributes)
process = asf_common.face_engine_dll.ASFProcess
process.restype = c_int32
process.argtypes = (c_void_p, c_int32, c_int32, c_int32, POINTER(c_ubyte), POINTER(ASFMultiFaceInfo), c_int32)

# Get age API
get_age = asf_common.face_engine_dll.ASFGetAge
get_age.restype = c_int32
get_age.argtypes = (c_void_p, POINTER(ASFAgeInfo))

# Get gender API
get_gender = asf_common.face_engine_dll.ASFGetGender
get_gender.restype = c_int32
get_gender.argtypes = (c_void_p, POINTER(ASFGenderInfo))

# Face image quality detection API
image_quality_detect = asf_common.face_engine_dll.ASFImageQualityDetect
image_quality_detect.restype = c_int32
image_quality_detect.argtypes = (c_void_p, c_int32, c_int32, c_int32, POINTER(c_ubyte),
                                 POINTER(ASFSingleFaceInfo), c_int32, POINTER(c_float), c_int32)
#=======================================================
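Every call now follows the same ctypes pattern: plain ints and byte strings for value parameters, byref(...) for output parameters. A minimal activation-plus-initialization smoke test (a sketch, assuming valid credentials as byte strings and the DLLs in place):

from ctypes import c_void_p, byref
import asf_common
import asf_func

app_id, sdk_key, active_key = b"...", b"...", b"..."  # placeholders for your credentials

ret = asf_func.online_activate(app_id, sdk_key, active_key)
print("activate:", ret)  # 0 = OK, 90114 = already activated

engine = c_void_p()
ret = asf_func.init_engine(
    asf_common.ASF_DETECT_MODE_VIDEO,
    asf_common.ArcSoftFaceOrientPriority.ASF_OP_0_ONLY.value[0],
    3,  # max number of faces to detect/track
    asf_common.ASF_FACE_DETECT | asf_common.ASF_IMAGEQUALITY,
    byref(engine))
print("init:", ret)  # 0 = OK; engine now holds the handle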

IV. Process design

According to our task requirements, three processing threads are designed as follows:

Frame-grabbing thread: loops over the RTSP stream and pushes frames into a queue. (To protect local disk I/O performance, only in-memory frame data goes into the queue.)

Face detection thread: repeatedly takes a frame from the queue, runs face detection and quality detection on it, draws the boxes and labels for display, and writes the frame information of any face with a high quality score (> 0.5) into the face list container.

Deduplication thread: periodically scans the face list container; for any FaceID whose entries have not been updated for more than 2 seconds, it sorts that FaceID's list by quality score, takes the highest-scoring frame, and writes the big picture (full frame) and the small picture (the face crop, extracted using the face rectangle coordinates) to disk, as sketched below.
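Condensed into a standalone sketch (illustrative names only; the real implementation is gen_zq/storage_best_zq in the next section), the harvesting rule looks like this:

import time

def pick_ready(zp_list, zp_time_list, idle_seconds=2):
    # A FaceId whose newest capture is older than idle_seconds has left the
    # frame; keep only its highest-scoring entry and forget the rest.
    now = time.time()
    best = {}
    for face_id in list(zp_time_list):
        if now - zp_time_list[face_id] > idle_seconds:
            entries = zp_list.pop(face_id, [])
            del zp_time_list[face_id]
            if entries:  # only frames above the quality threshold were ever stored
                best[face_id] = max(entries, key=lambda e: e[0])  # e[0] = quality score
    return best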


The flow chart for each thread should help you understand the design:

V. Main process code implementation

With the design above, the code structure is basically settled, so let's get straight to it. (Don't worry about getting lost; the comments are quite detailed.)

File: asf_main.py

from queue import Queue
import threading
import asf_func
import asf_struct
from ctypes import *
import asf_common
import cv2
import time
import uuid
import os

# Thread lock, used to synchronize the worker threads
lock = threading.Lock()
# Max size of the local video frame queue, i.e. how many frames are buffered locally at most
frames_q_size = 50
# Local video frame queue
frames_q = Queue(maxsize=frames_q_size)
# RTSP stream URL of the remote capture camera
rtsp_url="rtsp://admin:[email protected]:554/h264/ch1/sub/av_stream"
# High-quality captured face frames awaiting processing, keyed by FaceId
zp_list = {}
# Latest capture time per FaceId, used to decide when to harvest the best capture
zp_time_list = {}
# Faces scoring above this threshold count as qualified faces
storage_face_threshold = 0.5
# app_id
app_id = b"xxxxxxxxxxxxxx"
# sdk_key
sdk_key = b"xxxxxxxxxxxxxx"
# Activation code
active_key = b"xxxx-xxxx-xxxx-xxxx"

# Note: the values above must be byte strings

# Directory where captured images are stored
if not os.path.exists('IMAGE'):
    os.makedirs('IMAGE')

# Call the online activation API
ret = asf_func.online_activate(app_id, sdk_key, active_key)
if ret == 0 or ret == 90114:  # 90114 means already activated
    print("Activation succeeded:", ret)
else:
    print("Activation failed:", ret)

# Engine capabilities for video mode
video_mask = asf_common.ASF_FACE_DETECT | asf_common.ASF_IMAGEQUALITY

# Null video-engine pointer that will receive the engine handle created by initialization
video_engine = c_void_p()

# Initialize the video-mode engine
video_ret = asf_func.init_engine(asf_common.ASF_DETECT_MODE_VIDEO, asf_common.ArcSoftFaceOrientPriority.ASF_OP_0_ONLY.value[0], 3, video_mask, byref(video_engine))

if video_ret == 0:
    print("Video-mode engine initialized successfully")
else:
    print("Video-mode engine initialization failed:", video_ret)
    exit()
    exit()

# Generate a uuid string without "-"
def get_uuid():
    return str(uuid.uuid1()).replace("-","")

# Get the best capture for the given FaceId and store it locally
def storage_best_zq(face_id):
    global zp_list
    list = zp_list[face_id]
    if len(list) == 0:
        # Having a faceid does not guarantee entries here: to keep memory use low, only frames that meet the quality threshold are added, so guard against an empty list
        return
    # Sort by quality score (ascending): index 0 is the lowest score, index -1 the highest
    new_list = sorted(list, key=lambda e: e[0])

    #file_uid = get_uuid()
    file_uid = str(int(time.time()))

    # Build the big/small image filenames, format: faceid_fileuid_big/small.jpg, easy to understand
    big_file_path = "IMAGE/%s_"%(new_list[-1][6])+str(file_uid)+"_big.jpg"
    small_file_path = "IMAGE/%s_"%(new_list[-1][6]) + str(file_uid) + "_small.jpg"

    # Write the second element (frame image data) of the best (last) entry to a local file via cv2
    cv2.imwrite(big_file_path, new_list[-1][1])

    # Face rectangle within the big image
    left = new_list[-1][2]
    top = new_list[-1][3]
    right = new_list[-1][4]
    bottom = new_list[-1][5]

    # Crop the face region out of the big image using the rectangle
    small_img = new_list[-1][1][top:bottom,left:right]

    # Write the cropped face image to the local small-image file
    cv2.imwrite(small_file_path, small_img)
    print("Stored captured images locally --- OK")

# Continuously analyze captures, pick the best one per FaceId, and store it locally
def gen_zq():
    global zp_list
    global zp_time_list
    while(True):
        ct = time.time()
        face_id_list = list(zp_time_list.keys())
        for k in face_id_list:
            # If this faceid's latest update is more than 2 seconds old (tune as needed), it is considered ready for harvesting
            if ct - zp_time_list[k]>2:
                storage_best_zq(k)
                del zp_list[k]
                del zp_time_list[k]
                # After the best capture has been written to disk, clear this faceid's data from the dictionaries
        time.sleep(2)
        # Wait 2 seconds before the next pass to keep CPU usage down; tune as needed

# Video frame grabbing thread
# q is the queue that grabbed frame data is pushed into
def grab_frame(rtsp_url, q):
    # Open the RTSP stream with cv2's VideoCapture
    cap = cv2.VideoCapture(rtsp_url)
    num = 0

    while (True):
        #print(cap.isOpened())
        cap_res = cap.read()
        num += 1
        # TODO: num grows without bound; overflow should eventually be handled
        # Store every other frame
        if num % 2 == 0:
            # If the frame queue is full, drop the oldest frame before inserting the new one, so the queue always holds the most recent continuous stretch of the stream
            # This also mitigates backlog when the engine processes frames more slowly than they arrive
            if q.full():
                print("frames queue is full, will remove one.")
                q.get()
            # Clearing one frame out before inserting guarantees the bounded queue never rejects a put, avoiding the lag and errors caused by OpenCV's internal buffer backing up
            q.put(cap_res)
            print(cap.isOpened())

# Frame processing thread (takes each frame from the queue, converts it to an image, runs face detection + quality scoring, and puts high-scoring frames into zp_list)
def detect_face(q):
    global zp_list,zp_time_list,storage_face_threshold
    while(True):
        q_value = q.get()
        # Get one frame of data from the video frame queue
        old_img = q_value[1]
        # Each queue element is a tuple; its second element is the frame image data

        #sp = old_img.shape
        #img = cv2.resize(old_img, (sp[1] // 4 * 4, sp[0]))  # 4-byte alignment
        # The commented lines above 4-byte-align the frame, which some processing modes require; my frame data is already aligned, so this step can be skipped

        img = old_img
        image_bytes = bytes(img)
        image_ubytes = cast(image_bytes, POINTER(c_ubyte))
        # The lines above convert the image data into the type the ArcSoft engine accepts, i.e. a pointer to a byte array (brush up on ctypes if this is unfamiliar)

        detect_faces = asf_struct.ASFMultiFaceInfo()
        # The face detection interface requires a struct object to receive its results, so instantiate one here

        ret = asf_func.detect_face(
            video_engine,
            img.shape[1],
            img.shape[0],
            asf_common.ASVL_PAF_RGB24_B8G8R8,
            image_ubytes,
            byref(detect_faces)
        )
        # Pass the data in the format the face detection interface requires; afterwards detect_faces holds the recognition results

        if ret !=0:
            print("Face detection failed: %s" % (ret))
            continue
            # A non-zero return here does not necessarily mean something is wrong; it is usually 81927, which is normal
        if detect_faces.faceNum > 0:

            # If multiple faces were detected, loop over each one and run quality detection
            for i in range(detect_faces.faceNum):

                # The quality detection interface takes a single-face info struct, so instantiate one here
                single_face_info = asf_struct.ASFSingleFaceInfo()

                # Copy the i-th face out of the detect_faces multi-face result into the single_face_info single-face struct
                single_face_info.faceRect.left = detect_faces.faceRect[i].left
                single_face_info.faceRect.top = detect_faces.faceRect[i].top
                single_face_info.faceRect.right = detect_faces.faceRect[i].right
                single_face_info.faceRect.bottom = detect_faces.faceRect[i].bottom
                single_face_info.faceOrient = detect_faces.faceOrient[i]
                single_face_info.faceDataInfo = detect_faces.faceDataInfoList[i]

                # Variable that receives the quality score; it is passed to the interface by reference below
                confidenceLevel = c_float()

                # Call the quality detection interface to get this face's quality score
                ret = asf_func.image_quality_detect(
                    video_engine,
                    img.shape[1],
                    img.shape[0],
                    asf_common.ASVL_PAF_RGB24_B8G8R8,
                    image_ubytes,
                    byref(single_face_info),
                    0,
                    byref(confidenceLevel),
                    1
                )

                # Prepare the list for this FaceId, initializing it if it does not exist yet
                if detect_faces.faceID[i] not in zp_list:
                    zp_list[detect_faces.faceID[i]] = []

                # If the detected score exceeds the qualifying threshold, store the frame in this FaceId's list; everything else is filtered out, since low-quality frames are of no use even if kept
                if confidenceLevel.value > storage_face_threshold:
                    print("Found a high-quality face, score: %s" % (confidenceLevel.value))
                    zp_list[detect_faces.faceID[i]].append((confidenceLevel.value, old_img,
                                                            detect_faces.faceRect[i].left,
                                                            detect_faces.faceRect[i].top,
                                                            detect_faces.faceRect[i].right,
                                                            detect_faces.faceRect[i].bottom,
                                                            detect_faces.faceID[i]))
                zp_time_list[detect_faces.faceID[i]] = time.time()
                # Update this faceId's latest timestamp; gen_zq uses it to decide which faceids have stopped producing new records and can be harvested

                cv2.rectangle(img, (detect_faces.faceRect[i].left, detect_faces.faceRect[i].top), (detect_faces.faceRect[i].right, detect_faces.faceRect[i].bottom), (0, 0, 255), 1)
                cv2.putText(img, str("qa:%.2f"%(confidenceLevel.value)), (detect_faces.faceRect[i].left, detect_faces.faceRect[i].top-5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
                cv2.putText(img, "id:"+str(detect_faces.faceID[i]),(detect_faces.faceRect[i].left, detect_faces.faceRect[i].top - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5,(0, 0, 255), 1)
                # The lines above draw the bounding box and text labels based on the detection results and confidence score: qa is the quality score, id is the faceid
        cv2.imshow("RealTimeDisplay", img)
        cv2.waitKey(1)

thread_grab_frame = threading.Thread(target=grab_frame, args=(rtsp_url, frames_q))
thread_grab_frame.start()
thread_detect_face = threading.Thread(target=detect_face, args=(frames_q,))
thread_detect_face.start()
thread_gen_zq = threading.Thread(target=gen_zq, args=())
thread_gen_zq.start()
thread_detect_face.join()

The loop code is a bit long and, being time-dependent, not broken out into separate functions, but the logic is not complicated.

There are some minor flaws that I will polish up when I have time.

thread_grab_frame = threading.Thread(target=grab_frame, args=(rtsp_url, frames_q))

thread_detect_face = threading.Thread(target=detect_face, args=(frames_q,))

thread_gen_zq = threading.Thread(target=gen_zq, args=())

You can trace each part of the flow by following the three thread entry functions above.


To keep the video frame queue from growing indefinitely, it has a fixed capacity: when a frame is inserted and the queue is full, the oldest frame is discarded and the latest frame is inserted.
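Isolated from the rest of the code, this drop-oldest pattern is just the following (a small sketch; put_latest is an illustrative name):

from queue import Queue

def put_latest(q, item):
    # Drop the oldest entry when full so the queue always holds the freshest frames
    if q.full():
        q.get()
    q.put(item)

q = Queue(maxsize=3)
for i in range(5):
    put_latest(q, i)
print(list(q.queue))  # [2, 3, 4]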

Here is the overall structure of the code files:

VI. Run tests

A few minor problems and pitfalls came up along the way and have been resolved.

Real-time display effect:

Run log screenshot:

The log above shows quality scores for different frames of the same FaceId; without deduplication, a lot of duplicates would end up on disk.

Storage effect after running:

We captured a lot of face data, so the images have been blurred for privacy.

Operating performance:

This design does essentially all of its processing in memory and only uses the disk for the final write. Let's look at CPU and memory usage while it runs.

The performance footprint is still satisfactory.

My machine has an 8th-generation i5. The video stream uses a fixed bit rate of 16384 Kbps at 25 frames per second, 1920*1080 resolution, and H.264 encoding. Reducing the resolution lowers the CPU usage further.

Finally, I have to say: ArcSoft is seriously impressive!

VII. Project code

Reference: gitee.com/codetracer/…

The DLLs are not uploaded; please place the SDK's DLLs in the project root directory.
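If the DLLs fail to load, you can make the load path explicit instead of relying on the working directory; a small sketch (assuming the DLLs sit next to the scripts):

import os
from ctypes import CDLL

# Load the SDK DLLs by absolute path so loading works regardless of the
# current working directory (assumes the DLLs are next to this script)
root = os.path.dirname(os.path.abspath(__file__))
face_dll = CDLL(os.path.join(root, "libarcsoft_face.dll"))
face_engine_dll = CDLL(os.path.join(root, "libarcsoft_face_engine.dll"))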

To learn more about face recognition products, visit the ArcSoft Visual Open Platform.