Software and Hardware Environment

  • Ubuntu 18.04, 64-bit
  • Anaconda with Python 3.7
  • NVIDIA GTX 1070 Ti
  • OpenCV 4.2.0

Preface

The model used in this article is OpenPose, the open-source library from the CMU Perceptual Computing Lab that combines body, face, and hand keypoint detection; it has been described in an earlier article. Here, OpenCV's DNN module is used to load the hand pose estimation model from the OpenPose project and implement gesture recognition.
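As a minimal sketch of the idea, the snippet below loads the hand model with cv2.dnn.readNetFromCaffe and runs a single forward pass on one image. The file name hand.jpg is a placeholder; the prototxt and caffemodel names are those of the OpenPose hand model used in the full sample later in this article.

import cv2

# Minimal sketch: load the OpenPose hand model via OpenCV's DNN module
# and run one forward pass ("hand.jpg" is a placeholder image path)
net = cv2.dnn.readNetFromCaffe("pose_deploy.prototxt",
                               "pose_iter_102000.caffemodel")
img = cv2.imread("hand.jpg")
blob = cv2.dnn.blobFromImage(img, 1.0 / 255, (368, 368), (0, 0, 0),
                             swapRB=False, crop=False)
net.setInput(blob)
output = net.forward()
# 22 heatmaps: 21 hand keypoints plus one background channel
print(output.shape)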

Prerequisite Environment

  • Two ways to install the NVIDIA graphics driver on Ubuntu
  • Installing CUDA and cuDNN on Ubuntu
  • Building OpenCV with CUDA support (a quick check is sketched after this list)
  • OpenPose pose recognition
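Once these steps are done, a quick sanity check from Python can confirm the OpenCV version and whether the build can see a CUDA device. This is a sketch; the exact build-information output varies between builds, and the device count is simply 0 on a CPU-only build.

import cv2

print(cv2.__version__)
# Returns 0 on builds without CUDA support
print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
# Show the CUDA-related lines of the build configuration
print([line.strip() for line in cv2.getBuildInformation().split("\n")
       if "CUDA" in line])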

Sample Code

import cv2
import time
import numpy as np

protoFile = "pose_deploy.prototxt"
weightsFile = "pose_iter_102000.caffemodel"

nPoints = 22
POSE_PAIRS = [[0, 1], [1, 2], [2, 3], [3, 4],
              [0, 5], [5, 6], [6, 7], [7, 8],
              [0, 9], [9, 10], [10, 11], [11, 12],
              [0, 13], [13, 14], [14, 15], [15, 16],
              [0, 17], [17, 18], [18, 19], [19, 20]]
threshold = 0.2

# Read from the built-in camera or a USB camera
cap = cv2.VideoCapture(0)
hasFrame, frame = cap.read()
frameWidth = frame.shape[1]
frameHeight = frame.shape[0]
aspect_ratio = frameWidth / frameHeight

inHeight = 368  # network input height (not defined in the original snippet; 368 is the usual value)
inWidth = int((aspect_ratio * inHeight) * 8) // 8

vid_writer = cv2.VideoWriter('output.avi',
                             cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                             15, (frameWidth, frameHeight))

net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)

k = 0
while True:
    k += 1
    t = time.time()
    hasFrame, frame = cap.read()
    if not hasFrame:
        break
    frameCopy = np.copy(frame)

    inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight),
                                    (0, 0, 0), swapRB=False, crop=False)
    net.setInput(inpBlob)

    # The network outputs 22 heatmaps; the 22nd one represents the background
    output = net.forward()
    print("forward = {}".format(time.time() - t))

    # Empty list to store the detected keypoints
    points = []
    for i in range(nPoints):
        probMap = output[0, i, :, :]
        probMap = cv2.resize(probMap, (frameWidth, frameHeight))
        minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)
        if prob > threshold:
            cv2.circle(frameCopy, (int(point[0]), int(point[1])), 6, (0, 255, 255),
                       thickness=-1, lineType=cv2.FILLED)
            cv2.putText(frameCopy, "{}".format(i), (int(point[0]), int(point[1])),
                        cv2.FONT_HERSHEY_SIMPLEX, .8, (0, 0, 255), 2, lineType=cv2.LINE_AA)
            points.append((int(point[0]), int(point[1])))
        else:
            points.append(None)

    # Draw the skeleton by connecting the detected keypoint pairs
    for pair in POSE_PAIRS:
        partA = pair[0]
        partB = pair[1]
        if points[partA] and points[partB]:
            cv2.line(frame, points[partA], points[partB], (0, 255, 255), 2,
                     lineType=cv2.LINE_AA)
            cv2.circle(frame, points[partA], 5, (0, 0, 255), thickness=-1,
                       lineType=cv2.FILLED)
            cv2.circle(frame, points[partB], 5, (0, 0, 255), thickness=-1,
                       lineType=cv2.FILLED)

    print("Time taken for frame = {}".format(time.time() - t))
    cv2.imshow('webcam', frame)
    key = cv2.waitKey(1)
    if key == 27:  # Esc to quit
        break
    print("total = {}".format(time.time() - t))
    vid_writer.write(frame)

vid_writer.release()

Running the above code performs gesture detection with the local USB camera. If OpenCV was built with CUDA support, inference can also be moved to the GPU, as sketched below.
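Since OpenCV 4.2.0, the DNN module offers a CUDA backend. If the OpenCV build from the environment section was compiled with it, two extra lines after loading the model route inference to the GPU. This is a sketch; on a CPU-only build, OpenCV falls back to the default backend with a warning.

# Optional: run DNN inference on the GPU (requires an OpenCV build
# compiled with the CUDA DNN backend, available since 4.2.0)
net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)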

Model Download

Baidu Netdisk link: https://pan.baidu.com/s/17QGpualKBdtl4uvbYzIWLg, extraction code: 3LJN