Microsoft Paint kept me company during my first few years with computers. That simple little tool felt magical to me when I was young, as if I could draw anything. I drew lots of little people and fantasized about making them move along with me (Flash wasn't widely known at the time). Some time ago, I built a small project with PaddlePaddle that makes doodles move. Today I'd like to share that childhood idea: making doodle figures move along with a person.
Key point detection
To make the doodle figure mimic my movements, we first need a human keypoint detection model. Both PaddleHub, a pre-trained model application tool, and PaddleDetection, an object detection toolkit, offer human keypoint detection. This project uses the human_pose_estimation_resnet50_mpii model from PaddleHub. This model is faster than OpenPose, but somewhat less accurate. If you need the doodle figure's motion to be very accurate, or you want the model to return confidence scores for the keypoint coordinates, you can use openpose_body_estimation instead, or the HRNet series models in PaddleDetection and the newer PP-TinyPose.
PaddleDetection link: github.com/PaddlePaddl… PaddleHub link: github.com/PaddlePaddl…

Every point in the doodle gets bound to skeleton key points, and the model detects only a handful of them, so doodle points that are not close to any key point would end up bound to unreasonable ones. We therefore need to expand the detected key point set K. This project inserts the midpoint between every two adjacent key points as a new key point, and applies this operation twice. The training data set of human_pose_estimation_resnet50_mpii is MPII, which provides 16 key points. We start by creating a pivot point between "thorax" and "pelvis" to serve as the root of all the key points, and build a tree over the key points of the whole body; this tree structure will be used later. In the end there are 65 expanded key points, and we denote the expanded key point set as K'.

Encapsulate the key point detection model:
```python
import paddlehub as hub

class estUtil():
    # wraps the human keypoint detection model
    def __init__(self):
        super(estUtil, self).__init__()
        # use the human_pose_estimation_resnet50_mpii model
        self.module = hub.Module(name='human_pose_estimation_resnet50_mpii')

    def do_est(self, frame):
        res = self.module.keypoint_detection(images=[frame], use_gpu=True)
        return res[0]['data']
```
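A minimal usage sketch (hypothetical wiring, assuming a webcam is available and that the returned data maps MPII key point names such as 'thorax' to [x, y] pixel coordinates):

```python
import cv2

# quick smoke test: grab one webcam frame and run keypoint detection
est = estUtil()
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
if ret:
    keypoints = est.do_est(frame)   # dict: key point name -> [x, y]
    for name, pos in keypoints.items():
        print(name, pos)
cap.release()
```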
The method for expanding the key points:
```python
import copy

def complexres(res, FatherAndSon):
    # expand the key points while keeping the logical order of key point nodes
    cres = copy.deepcopy(res)
    for key, pos in res.items():
        father = FatherAndSon[key]
        if father == key:
            # the root is its own parent and has no bone segment to split
            continue
        if key[0] == 'm' or father[0] == 'm':
            # first naming rule: midpoints involving already-expanded nodes
            midkey = 'm' + key + '_' + father
        else:
            # second naming rule: abbreviate both endpoint names
            kn = ''
            for t in key.split('_'):
                kn += t[0]
            fn = ''
            for t in father.split('_'):
                fn += t[0]
            midkey = 'm_' + kn + '_' + fn
        # compute the midpoint between the key point and its parent
        midvalue = [(pos[0] + res[father][0]) / 2, (pos[1] + res[father][1]) / 2]
        FatherAndSon[key] = midkey
        FatherAndSon[midkey] = father
        cres[midkey] = midvalue
    return cres, FatherAndSon
```
On the left are the key points K before expansion; on the right, the expanded key points K'.
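For reference, a hypothetical sketch of the parent table FatherAndSon that encodes this key point tree (the exact names depend on the model's output; 'root' stands for the constructed midpoint between 'thorax' and 'pelvis', and it is its own parent so that complexres() skips it):

```python
# hypothetical MPII parent map; the original project may name nodes differently
FatherAndSon = {
    'root': 'root',               # constructed midpoint of thorax and pelvis
    'thorax': 'root',
    'pelvis': 'root',
    'upper_neck': 'thorax',
    'head_top': 'upper_neck',
    'right_shoulder': 'thorax',
    'right_elbow': 'right_shoulder',
    'right_wrist': 'right_elbow',
    'left_shoulder': 'thorax',
    'left_elbow': 'left_shoulder',
    'left_wrist': 'left_elbow',
    'right_hip': 'pelvis',
    'right_knee': 'right_hip',
    'right_ankle': 'right_knee',
    'left_hip': 'pelvis',
    'left_knee': 'left_hip',
    'left_ankle': 'left_knee',
}
```

With the 16 MPII points plus the root, the tree has 17 nodes and 16 edges; one pass of complexres adds 16 midpoints (33 nodes, 32 edges) and a second pass adds 32 more, giving the 65 expanded key points K'.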
Graffiti recording and optimization
To make interaction easy, I used OpenCV to build a simple little sketchpad where users can pick different colors to draw their doodles. To make binding key points to the doodle easier later, I first drew a template of the human key points on the canvas, giving users a set of reference coordinates so they can draw their doodle figure more conveniently.
After the user presses the mouse, OpenCV continuously records the current brush (mouse) position (mouse_x, mouse_y); when the user releases the mouse, the program stops recording. Connecting the recorded points in order reproduces the track the mouse just traced. A drawing can be done in one stroke, or in several strokes with several colors. Let's call this doodle B.

Since OpenCV samples the brush position at a fixed rate, drawing a line of a given length slowly yields more sample points, while drawing it quickly yields fewer. During later use, the program spends a lot of time computing the relative relationship between sample points and bones (profiling showed this part can take far longer than the model inference and become the bottleneck for smooth running), so we filter the sample points of B. I use the most intuitive filtering method: when three consecutive points lie on the same line, drop the middle point and keep only the two endpoints. This reduces the point count of a simple doodle from thousands to dozens, and the project runs much more smoothly. We call the filtered sample set B'.

Methods for filtering and simplifying the skin data:
```python
def linesFilter():
    # drop the middle point whenever three consecutive points are collinear
    global lines
    for line in lines:
        sindex = 0
        mindex = 1
        while mindex < len(line):
            eindex = mindex + 1
            if eindex >= len(line):
                break
            d1 = line[mindex][0] - line[sindex][0]
            d2 = line[mindex][1] - line[sindex][1]
            d3 = line[eindex][0] - line[sindex][0]
            d4 = line[eindex][1] - line[sindex][1]
            # the cross product is zero when the three points are collinear
            if d1 * d4 - d2 * d3 == 0:
                line.pop(mindex)
            else:
                sindex += 1
                mindex += 1

def linesCompose():
    # insert the midpoint between consecutive samples so the filtered
    # strokes still draw smoothly
    global lines
    tlines = []
    for line in lines:
        tlines.append([line[0]])
        for i in range(1, len(line)):
            l_1 = tlines[-1][-1]
            tlines[-1].append(((l_1[0] + line[i][0]) / 2, (l_1[1] + line[i][1]) / 2))
            tlines[-1].append(line[i])
    lines = tlines
```
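For completeness, a minimal sketch of the recording side (hypothetical variable names; the real sketchpad also handles color selection):

```python
import cv2

lines = []       # each element is one stroke: a list of (x, y) sample points
drawing = False

def onMouse(event, x, y, flags, param):
    # record brush positions between mouse press and release
    global drawing
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        lines.append([(x, y)])          # start a new stroke
    elif event == cv2.EVENT_MOUSEMOVE and drawing:
        lines[-1].append((x, y))        # sample the current brush position
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

cv2.namedWindow('sketchpad')
cv2.setMouseCallback('sketchpad', onMouse)
```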
Binding of key points to doodles
Anchor binding count: before the animation starts, we must bind the sampled skin B' to the expanded key point set K'. As described above, the skin B' is just a sequence of points, and this binding attaches each skin point to key points. More formally, for each skin point we select its anchor points, all of which come from the skeleton key points. In this project each skin point is bound to at most four key points; this number is related to the density of K'. If K' were dense enough, the number of anchor points per skin point could be reduced.
Anchor binding criterion: anchors are selected by distance, i.e. the m key points closest to skin point n are chosen. This method has drawbacks. In the beard example, because we pick the nearest key points, some key points on the shoulders are actually closer to parts of the beard than the points on the face, which makes the beard follow the shoulders. If you want more accurate anchors, you can also intervene manually in this process and remove unreasonable bindings like the ones just described.
On the left is the doodle we drew; on the right, the effect of binding the doodle to the key points.
Bind skin data and bone data:
```python
import math

def buildskin(lines, colors, cirRads, nodes):
    if lines is None or nodes is None or len(lines) == 0 or len(nodes) == 0:
        return []
    skins = []
    print("doodle node length", len(nodes))
    # wrap the skin points collected by OpenCV into a list of skinItem objects
    for lineindex in range(len(lines)):
        init = True
        line = lines[lineindex]
        color = colors[lineindex]
        cirRad = cirRads[lineindex]
        for p in line:
            if init:
                skins.append(skinItem(p[0], p[1], True, color, cirRad))
                init = False
            else:
                skins.append(skinItem(p[0], p[1], False, color, cirRad))
    # bind each skin point to its (at most) four nearest key points
    for skin in skins:
        md = [float("inf"), float("inf"), float("inf"), float("inf")]
        mn = [None, None, None, None]
        mdlen = 0
        for key, node in nodes.items():
            d = distance(skin.getPos(), node.getPos())
            maxi = judge(md)
            if d < md[maxi]:
                md[maxi] = d
                mn[maxi] = node
                mdlen += 1
        if mdlen < 4:
            md = md[:mdlen]
            mn = mn[:mdlen]
        ws = dist2weight(md)
        # record the initial angle and distance of the skin point relative
        # to each anchor, plus the anchor's weight
        for j in range(len(mn)):
            th = math.atan2(skin.y - mn[j].y, skin.x - mn[j].x)
            r = distance(skin.getPos(), mn[j].getPos())
            w = ws[j]
            skin.appendAnchor(anchorItem(mn[j], th - mn[j].thabs, r, w))
    return skins
```
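buildskin relies on a few small helpers. Here is a sketch of plausible implementations (assumed, not the project's originals): distance is Euclidean, judge picks the anchor slot to evict, and dist2weight turns distances into normalized weights.

```python
import math

def distance(p1, p2):
    # Euclidean distance between two (x, y) points
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def judge(md):
    # index of the largest distance in the current candidate list,
    # i.e. the slot to replace when a closer key point is found
    return md.index(max(md))

def dist2weight(md):
    # inverse-distance weights normalized to sum to 1, so that
    # nearer anchors pull harder on the skin point
    inv = [1.0 / (d + 1e-6) for d in md]
    s = sum(inv)
    return [v / s for v in inv]
```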
Graffiti updates
With the initialization above complete, we can compute each skin point's new position in every subsequent frame. During binding we also recorded some extra information for each anchor: the skin point's distance and angle relative to it. After choosing the (up to) four anchors, we also computed an initial weight α for each. So whenever the key point positions change, we can compute the new, weighted skin point position S'' from the new anchor positions. Drawing the skin in the order of S'' completes the project.
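Written out as a formula (a restatement of what the code below does, with anchor position Aᵢ, the stored distance rᵢ and relative angle θᵢ, the anchor bone's current absolute angle φᵢ, and weight wᵢ):

S'' = Σᵢ wᵢ · ( Aᵢ + rᵢ · (cos(θᵢ + φᵢ), sin(θᵢ + φᵢ)) )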
Calculate new skin points based on new bones in each frame:
```python
def calculateSkin(skins, scale):
    for skin in skins:
        xw = 0
        yw = 0
        # compute the new skin point position from each anchor's coordinates
        # and angle, weighted by the anchor's weight
        for anchor in skin.getAnchor():
            x = anchor.node.x + math.cos(anchor.th + anchor.node.thabs) * anchor.r * scale
            y = anchor.node.y + math.sin(anchor.th + anchor.node.thabs) * anchor.r * scale
            xw += x * anchor.w
            yw += y * anchor.w
        skin.x = xw
        skin.y = yw
    return skins
```
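calculateSkin assumes each node's absolute bone angle thabs has already been refreshed from the new frame. A hedged sketch of that update, assuming each node stores its current position and that FatherAndSon gives its parent (the original project may compute this differently):

```python
import math

def updateAngles(nodes, FatherAndSon):
    # recompute each bone's absolute angle from the new key point positions,
    # so anchors can rotate skin points along with the bone
    for key, node in nodes.items():
        father = FatherAndSon[key]
        if father == key:
            node.thabs = 0.0  # the root has no parent bone
            continue
        parent = nodes[father]
        node.thabs = math.atan2(node.y - parent.y, node.x - parent.x)
    return nodes
```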
Current problems and directions for improvement
1) Problems

A major drawback of the human_pose_estimation_resnet50_mpii model is that it does not output joint confidence, so there is no way to filter its results. Given an image with an incomplete human body, the model still outputs all 16 key points, including ones for body parts that are not in the frame at all; these spurious points cause the corresponding skin points to be drawn in the wrong places. Also, for better results the input video should have as plain a background as possible, to reduce interfering factors.

2) Directions for improvement
- You can try other keypoint detection models in PaddleDetection. Note that if you use a model trained on the COCO dataset, you will need to change the doodle template file accordingly.
- For a smoother experience, try running the keypoint detection model in a separate thread, as sketched below.
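A minimal sketch of that idea (hypothetical wiring: the detection thread keeps updating a shared result while the main thread draws at full frame rate):

```python
import threading

latest = {'keypoints': None}
lock = threading.Lock()

def detectLoop(est, cap):
    # run the (slow) keypoint model off the main thread
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        res = est.do_est(frame)
        with lock:
            latest['keypoints'] = res

# threading.Thread(target=detectLoop, args=(est, cap), daemon=True).start()
```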
I also shared this project live in the PaddlePaddle developer column; you are welcome to watch the video on Bilibili.
Video share: www.bilibili.com/video/BV1N3…
Project link: aistudio.baidu.com/aistudio/pr…
PaddleDetection: github.com/PaddlePaddl…
PaddleHub: github.com/PaddlePaddl…