Preface
I originally didn't plan to write this, but after looking around I found very few Chinese MediaPipe tutorials, and none of them were comprehensive, so I'm taking notes here.
Environment installation
If you use Anaconda you may already have everything you need; if not, just run the following command:
pip install mediapipe
Before going further you should already know the basics of Python 3, PyCharm, and OpenCV.
Quick start (gesture capture)
Let's start with the classic example you'll see everywhere.
import mediapipe as mp
import cv2

cap = cv2.VideoCapture(0)
mpHand = mp.solutions.hands              # MP hand-tracking solution
Hand = mpHand.Hands()                    # finds hands in the frame
mphanddraw = mp.solutions.drawing_utils  # drawing helper

while True:
    flag, img = cap.read()
    RGBImage = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    result = Hand.process(RGBImage)
    if result.multi_hand_landmarks:  # a list of detected hands, if any
        for handlist in result.multi_hand_landmarks:
            # HAND_CONNECTIONS draws the lines between the 21 points
            mphanddraw.draw_landmarks(img, handlist, mpHand.HAND_CONNECTIONS)
    cv2.imshow("Hands", img)
    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
The comments should make this self-explanatory, so I won't elaborate.
Getting the coordinates of the hand
As you can see from the image above, handlist holds the complete set of coordinates for one hand and can draw 21 points. So handlist is really the indices and coordinates of 21 points (the coordinates are normalized, i.e. expressed as a fraction of the image size); each index corresponds to a joint in the figure below.
So we can read the state of the hand very precisely. Now let's redraw the hand ourselves.
import mediapipe as mp
import cv2

cap = cv2.VideoCapture(0)
mpHand = mp.solutions.hands              # MP hand-tracking solution
Hand = mpHand.Hands()                    # finds hands in the frame
mphanddraw = mp.solutions.drawing_utils  # drawing helper

while True:
    flag, img = cap.read()
    RGBImage = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    result = Hand.process(RGBImage)
    if result.multi_hand_landmarks:
        for handlist in result.multi_hand_landmarks:
            for id, lm in enumerate(handlist.landmark):
                h, w, c = img.shape
                # convert normalized coordinates to pixels
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
            # HAND_CONNECTIONS draws the lines between the 21 points
            mphanddraw.draw_landmarks(img, handlist, mpHand.HAND_CONNECTIONS)
    cv2.imshow("Hands", img)
    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
Now we can optimize: since we can read the pixel coordinates of every joint of a hand directly from the frame, we can start reasoning about hand posture. This is where it gets fun. For example:
import mediapipe as mp
import cv2
import math

cap = cv2.VideoCapture(0)
cap.set(3, 1280)  # frame width
cap.set(4, 720)   # frame height
mpHand = mp.solutions.hands
Hand = mpHand.Hands()
mphanddraw = mp.solutions.drawing_utils

while True:
    flag, img = cap.read()
    RGBImage = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    result = Hand.process(RGBImage)
    if result.multi_hand_landmarks:
        hands_data = result.multi_hand_landmarks
        for handlist in hands_data:
            h, w, c = img.shape
            # index fingertip (8) and thumb tip (4) in pixels
            shizhi_postion = (int(handlist.landmark[8].x * w), int(handlist.landmark[8].y * h))
            muzhi_postion = (int(handlist.landmark[4].x * w), int(handlist.landmark[4].y * h))
            cv2.line(img, muzhi_postion, shizhi_postion, (255, 0, 0), 5)
            # distance between thumb tip and index fingertip
            location = (shizhi_postion[0] - muzhi_postion[0]) ** 2 \
                       + (shizhi_postion[1] - muzhi_postion[1]) ** 2
            location = int(math.sqrt(location))
            # midpoint between the two fingertips, where we print the distance
            showpostion = (int((muzhi_postion[0] + shizhi_postion[0]) / 2),
                           int((muzhi_postion[1] + shizhi_postion[1]) / 2))
            cv2.putText(img, str(location), showpostion, cv2.FONT_HERSHEY_PLAIN, 1, (255, 0, 255), 1)
            for id, lm in enumerate(handlist.landmark):
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
                cv2.putText(img, str(id), (cx, cy), cv2.FONT_HERSHEY_PLAIN, 1, (0, 255, 255), 1)
            mphanddraw.draw_landmarks(img, handlist, mpHand.HAND_CONNECTIONS)
    cv2.imshow("Hands", img)
    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
So it’s fun.
Return values explained
The return value matters, so look back at the previous example. hands_data holds the landmarks of every detected hand, one entry per hand on screen, so len(hands_data) is the number of hands.
Each handlist is one hand, and handlist.landmark is its list of 21 points.
It is laid out as [{point 0}, {point 1}, ...], so getting the thumb tip in pixels is simply handlist.landmark[4].x * w.
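To make the indexing concrete, here is a tiny sketch of the normalized-to-pixel conversion used throughout the code above. The helper name landmark_to_pixel is mine, not part of MediaPipe:

```python
# Sketch: convert one normalized MediaPipe landmark to pixel coordinates.
# MediaPipe landmarks expose normalized .x/.y in [0, 1]; you scale them by
# the image width/height. `landmark_to_pixel` is a made-up helper name.

def landmark_to_pixel(lm_x, lm_y, img_w, img_h):
    """Scale normalized (x, y) to integer pixel coordinates."""
    return int(lm_x * img_w), int(lm_y * img_h)

# e.g. a landmark at 50% width / 25% height of a 1280x720 frame:
cx, cy = landmark_to_pixel(0.5, 0.25, 1280, 720)
print(cx, cy)  # -> 640 180
```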
Different operators
I'm not sure what these are officially called, so I'll call them operators. What do they do? Each one uses a different algorithmic model to extract a different kind of feature for us.
You can see there are many of them. They are all invoked in much the same way, though the processing differs. For now let's focus on hands; we'll cover the others later. By the end of this series we might build a gesture system that controls the computer with body posture: using a hand (say, the right one) as a mouse, and so on. Combine it with a VR game driver and you wouldn't need a gamepad, just a camera with decent resolution. And since MediaPipe runs directly on the CPU, it works on devices without a dedicated GPU; Google says it can run on mobile, Linux, and more.
With the previous code we could actually build a simple volume controller: we would just have to map the finger spacing to a volume level. But that's not today's topic; today is about recognizing gestures, like counting 1 to 5.
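As a sketch of what that volume mapping might look like: the function below linearly rescales the thumb-index distance to a 0-100 level. The function name and the 30-250 px calibration range are my assumptions; you would tune the range for your own camera:

```python
# Sketch: map the thumb-index fingertip distance (pixels) to a 0-100 volume.
# The 30-250 px range is an assumed calibration, not a MediaPipe constant.

def distance_to_volume(dist, d_min=30, d_max=250):
    """Linearly map dist in [d_min, d_max] to a volume in [0, 100]."""
    dist = max(d_min, min(d_max, dist))  # clamp to the calibrated range
    return round((dist - d_min) / (d_max - d_min) * 100)

print(distance_to_volume(30))   # -> 0   (fingers pinched together)
print(distance_to_volume(140))  # -> 50
print(distance_to_volume(250))  # -> 100 (fingers fully spread)
```

Actually setting the system volume would need a platform-specific library on top of this, which is out of scope here.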
Gesture recognition case
OK, on to our case study. This part originally appeared in "OpenCV Quick to Use (Basic Usage & Gesture Recognition)", where it was copied directly; it turned out that author had in turn copied the code from a video I posted (and, a quick roast: it was badly written, with a serious OOM problem). So this time I'm writing my own. It's more fun anyway.
Finger state judgment
First, comparing the official landmark diagram with the hand images above makes the idea obvious: to tell whether a finger is raised, we just compare the y coordinates of its joints.
Pretty intuitive. But there's a catch: the thumb. It is too short for the y-coordinate trick, so we use the x coordinate instead. That raises another problem: the left and right hands are mirror images of each other, so sometimes we have to work out which hand we're looking at.
Telling left from right is simple enough: compare the x coordinates of points 1 and 5 relative to each other. There is still a flaw, though: the back of one hand projects the same as the palm of the other, so this test can be fooled.
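A minimal sketch of that left/right test, assuming a palm-facing, non-mirrored view. The function name guess_hand is mine, and as noted above the back-of-hand case will fool it:

```python
# Sketch: guess left vs right hand by comparing the x coordinates of
# landmark 1 (thumb base) and landmark 5 (index base), as described above.
# Assumes a palm-facing, non-mirrored image; a flipped hand (back of the
# palm toward the camera) produces the wrong answer, per the caveat above.

def guess_hand(x1, x5):
    """Return 'Right' if landmark 1 is left of landmark 5, else 'Left'."""
    return "Right" if x1 < x5 else "Left"

print(guess_hand(0.40, 0.55))  # -> Right
print(guess_hand(0.55, 0.40))  # -> Left
```

If your webcam feed is mirrored (most are), just swap the two labels.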
coding
import mediapipe as mp
import cv2

cap = cv2.VideoCapture(0)
mpHand = mp.solutions.hands
Hand = mpHand.Hands()
mphanddraw = mp.solutions.drawing_utils
TipsId = [4, 8, 12, 16, 20]  # fingertip landmark indices

while True:
    flag, img = cap.read()
    RGBImage = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    result = Hand.process(RGBImage)
    if result.multi_hand_landmarks:
        hands_data = result.multi_hand_landmarks
        for handlist in hands_data:
            h, w, c = img.shape
            fingers = []
            # thumb: compare x coordinates, direction depends on hand orientation
            if handlist.landmark[TipsId[0] - 3].x < handlist.landmark[TipsId[0] + 1].x:
                if handlist.landmark[TipsId[0]].x > handlist.landmark[TipsId[0] - 1].x:
                    fingers.append(0)
                else:
                    fingers.append(1)
            else:
                if handlist.landmark[TipsId[0]].x < handlist.landmark[TipsId[0] - 1].x:
                    fingers.append(0)
                else:
                    fingers.append(1)
            # other fingers: up when the tip is above the joint two indices below
            for id in range(1, 5):
                if handlist.landmark[TipsId[id]].y > handlist.landmark[TipsId[id] - 2].y:
                    fingers.append(0)
                else:
                    fingers.append(1)
            # count the raised fingers
            totoalfingle = fingers.count(1)
            cv2.putText(img, str(totoalfingle), (50, 50), cv2.FONT_HERSHEY_PLAIN,
                        5, (255, 255, 255), 5)
            # the rest just draws the finger joints; you can skip it
            for id, lm in enumerate(handlist.landmark):
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
                cv2.putText(img, str(id), (cx, cy), cv2.FONT_HERSHEY_PLAIN, 1, (0, 255, 255), 1)
            mphanddraw.draw_landmarks(img, handlist, mpHand.HAND_CONNECTIONS)
    cv2.imshow("Hands", img)
    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
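The finger-counting rules above can be pulled out into a pure function over a plain list of 21 (x, y) tuples, which makes the logic easy to check without a camera. This refactoring sketch is mine, not part of the original script:

```python
# Sketch: the finger-counting rules above as a pure function over a list of
# 21 (x, y) landmark tuples, so the logic can be tested without a camera.

TIPS = [4, 8, 12, 16, 20]  # fingertip landmark indices

def count_fingers(pts):
    fingers = []
    # Thumb: compare x coordinates, choosing the direction from the palm side.
    if pts[TIPS[0] - 3][0] < pts[TIPS[0] + 1][0]:
        fingers.append(0 if pts[TIPS[0]][0] > pts[TIPS[0] - 1][0] else 1)
    else:
        fingers.append(0 if pts[TIPS[0]][0] < pts[TIPS[0] - 1][0] else 1)
    # Other fingers: up when the tip is above (smaller y than) the pip joint.
    for i in range(1, 5):
        fingers.append(0 if pts[TIPS[i]][1] > pts[TIPS[i] - 2][1] else 1)
    return fingers.count(1)
```

Feed it `[(lm.x, lm.y) for lm in handlist.landmark]` and it returns the same count the loop in the script produces.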
The effect
Upgraded version (Christmas Vader)
We could have stopped there, but then I remembered that we can overlay not just text but whole images on the frame. So maybe it can be used for something interesting, such as confessing to somebody, which I have nobody to use it on right now! Fine, call me the dog-food machine. Thanks!
The effect
Code
import mediapipe as mp
import cv2
import os

cap = cv2.VideoCapture(0)
cap.set(3, 1280)
cap.set(4, 720)
mpHand = mp.solutions.hands
Hand = mpHand.Hands()
mphanddraw = mp.solutions.drawing_utils

# load the overlay pictures from the Media folder
MediaPath = "Media"
picsdir = os.listdir(MediaPath)
pics = []
for pic in picsdir:
    img = cv2.imread(f"{MediaPath}/{pic}")
    pics.append(img)

TipsId = [4, 8, 12, 16, 20]  # fingertip landmark indices

while True:
    flag, img = cap.read()
    RGBImage = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    result = Hand.process(RGBImage)
    if result.multi_hand_landmarks:
        hands_data = result.multi_hand_landmarks
        for handlist in hands_data:
            h, w, c = img.shape
            fingers = []
            # thumb
            if handlist.landmark[TipsId[0] - 2].x < handlist.landmark[TipsId[0] + 1].x:
                if handlist.landmark[TipsId[0]].x > handlist.landmark[TipsId[0] - 1].x:
                    fingers.append(0)
                else:
                    fingers.append(1)
            else:
                if handlist.landmark[TipsId[0]].x < handlist.landmark[TipsId[0] - 1].x:
                    fingers.append(0)
                else:
                    fingers.append(1)
            # other fingers
            for id in range(1, 5):
                if handlist.landmark[TipsId[id]].y > handlist.landmark[TipsId[id] - 2].y:
                    fingers.append(0)
                else:
                    fingers.append(1)
            # count the fingers and paste the matching picture
            totoalfingle = fingers.count(1)
            coverpic = pics[totoalfingle - 1]
            hc, wc, cc = coverpic.shape
            img[0:hc, 0:wc] = coverpic  # rows (height) first, then columns (width)
            # the rest just draws the finger joints; you can skip it
            for id, lm in enumerate(handlist.landmark):
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
            mphanddraw.draw_landmarks(img, handlist, mpHand.HAND_CONNECTIONS)
    cv2.imshow("Hands", img)
    if cv2.waitKey(1) == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
Prepare the pictures yourself (and if anyone pulls off a successful confession with this, remember to tell me).
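One caveat about the overlay line above: if a picture is larger than the frame, the slice assignment crashes with a shape mismatch. A clipped-paste sketch (the helper name paste_top_left is mine, not OpenCV's):

```python
import numpy as np

# Sketch: paste `overlay` into the top-left corner of `frame`, clipping the
# overlay if it is larger than the frame (avoids the shape-mismatch crash).

def paste_top_left(frame, overlay):
    h = min(frame.shape[0], overlay.shape[0])
    w = min(frame.shape[1], overlay.shape[1])
    frame[0:h, 0:w] = overlay[0:h, 0:w]
    return frame

frame = np.zeros((720, 1280, 3), dtype=np.uint8)        # black 720p frame
badge = np.full((200, 200, 3), 255, dtype=np.uint8)     # white 200x200 patch
paste_top_left(frame, badge)
print(frame[100, 100].tolist())  # -> [255, 255, 255] (inside the patch)
print(frame[300, 300].tolist())  # -> [0, 0, 0]       (outside the patch)
```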
Conclusion
That covers the most basic usage. How you architect things on top of it is entirely up to you; just build on the template.