Training Your Own Jarvis (and Digging Myself a Big Hole)

Today let's play with something hot: AI.

Without further ado, let’s take a look at some pictures:

See what’s going on?

Here’s another video:

https://www.zhihu.com/video/1002567561061511168

(See the demo site and code below.) How far away is AI? Do you think only something like AlphaGo, which beat Ke Jie, counts as AI? Look around you: your beauty camera, your TikTok recommendations, the voice assistant on your phone… Even the courier who delivers your food now works with artificial intelligence.

Jarvis, the intelligent butler from Iron Man, doesn’t look so sci-fi these days. Many manufacturers have recently launched smart speakers that can chat and control smart home devices.

But I want a Jarvis of my own.

I have had this idea for a long time, and it now looks increasingly feasible. The various AI platforms and smart hardware have matured, and Python, as a glue language, makes it easy to splice modules together.

I recently came across an interesting new “toy”: the Tencent AI Open Platform (http://ai.qq.com). I only meant to share it with you, but once I started writing code I couldn’t stop, so I took the opportunity to begin training my own Jarvis. (I am surely digging myself a hole this time.)

This time I used the platform’s speech recognition, intelligent chat, and speech synthesis, and chained the three together to build a simple voice assistant that listens and answers.

The current version is very rudimentary, but you have to start somewhere, and I have plenty of time to build it up.
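The chaining described above can be sketched as a small pipeline. This is only an illustration, not the article's actual code: the function name voice_reply and the three stand-in functions are hypothetical, and in a real program each stand-in would be replaced by a call to the corresponding platform interface.

```python
def voice_reply(audio_in, recognize, chat, synthesize):
    """Run speech -> text -> reply text -> speech and return every intermediate result."""
    heard = recognize(audio_in)      # speech recognition: audio bytes -> text
    answer = chat(heard)             # intelligent chat: text -> reply text
    audio_out = synthesize(answer)   # speech synthesis: reply text -> audio bytes
    return heard, answer, audio_out


# Hypothetical stand-ins for the real API calls, so the sketch runs offline.
def fake_recognize(audio):
    return audio.decode("utf-8")


def fake_chat(text):
    return "You said: " + text


def fake_synthesize(text):
    return text.encode("utf-8")


heard, answer, audio_out = voice_reply(
    b"hello", fake_recognize, fake_chat, fake_synthesize
)
print(heard, "|", answer)
```

Passing the three steps in as functions keeps each API call separately testable, which helps when one of them fails with little debugging information.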

Some plans for the future:

Add a camera for face recognition and scene positioning

Attach a robotic arm. After that, you won’t want to compete with me in mindless tap-tap mobile games

Add wheels

Connect a drone

And, of course, a Raspberry Pi

I don’t know what it will turn into, but that doesn’t matter; it’s fun enough for me. (Probably not an Ultron.)

In addition, yesterday I used the face fusion interface to build an online dress-up tool that supports 50 templates. If you would like to try it, go here: >>> Face fusion – Crossin’s programming lab.

I put it online last night and posted it once to my WeChat Moments, and hundreds of people visited almost immediately. Now that this article is going out, I don’t know whether my little server can take the load. Image uploads are already compressed, but it is still risky. If a request fails, try again later, or download the code and run it yourself.

As the saying goes, laymen watch the spectacle while experts watch the craft. That’s it for the spectacle; if you want to know more, let’s get down to the substance.

This code uses the Tencent AI Open Platform, which now offers many functions in three main directions: natural language processing, computer vision, and intelligent speech. The code for this example touches all three.

The platform documentation is fairly detailed, and there are online demos of each feature, so I suggest you try them yourself. Anyone who registers can apply for access, there is no charge, and the limits are generous enough for learning. Unfortunately, I didn’t see any Python examples, so you can refer to my code when writing your own. (For calculating the signature in particular, you can use my code directly.)

Taken individually, the APIs are not complicated. Each is simply a web request: you provide the correct parameters as required, and the platform returns the result. But if you are new to this kind of interface, I’m sure you will feel completely in the dark, because I’ve been there.

There are roughly three obvious pitfalls:

The signature. This is standard for open APIs: it verifies the caller’s identity, and it is the first hurdle for anyone new to such APIs. You need to understand MD5 (covered in a previous article) and generate the signature exactly as the API requires. Beyond the initial difficulty of understanding it, debugging is hard because the final output is just a string of characters that tells you nothing about what went wrong, so you end up checking every step over and over.
Parameters. The parameters look fully specified, but all sorts of problems show up once you actually use them. Character encoding is one of the more common mistakes. Even trivial errors can take a long time to track down because there is so little debugging information. Add documentation details that are easy to miss (such as character limits) and outright errors (speech synthesis must be called with POST, but the documentation says GET), and you can go crazy.
The return value. When you finally get a result, you may be stumped… Apart from the natural language interfaces, most image and audio results come back as base64-encoded data, which you have to process, store, or present yourself. If, like me, you want to chain several interfaces together, a pile of data and file type conversions awaits you. Fortunately, Python makes this convenient; otherwise it would be really painful.
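To make the return-value pitfall concrete, here is a minimal sketch of turning a base64 field into a file on disk. The response layout ({"data": {field: "<base64>"}}), the field name, and the helper name save_base64_payload are assumptions for illustration; check the actual field names in each interface's documentation.

```python
import base64
import json


def save_base64_payload(response_text, field, out_path):
    """Decode one base64 field of a JSON response and write the raw bytes to out_path."""
    body = json.loads(response_text)
    # Assumed layout: {"data": {field: "<base64 string>"}}
    encoded = body["data"][field]
    raw = base64.b64decode(encoded)
    # Audio and image payloads are binary, so the file must be opened in "wb" mode.
    with open(out_path, "wb") as f:
        f.write(raw)
    return len(raw)
```

Going the other way, a file that must be sent as a parameter would be read in "rb" mode and passed through base64.b64encode(...).decode("ascii") before being added to the request.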

So what looks like a simple interface call is not easy for inexperienced developers. But you can’t tell that just by reading about it; you have to write the code yourself. Once you do, you’ll find that I’ve already warned you about some of these pitfalls. Don’t worry too much about which tutorial is better, just do it yourself.
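As a head start on the first pitfall, here is a minimal sketch of an MD5 request signature. It assumes the scheme is: sort the parameters by key, URL-encode them into a key=value&key=value string, append &app_key=<your key>, then take the uppercase MD5 hex digest. Skipping empty values is also an assumption. This is not the article's original code, so verify every step against the platform's authentication documentation; the parameter names in the usage lines are placeholders.

```python
import hashlib
from urllib.parse import urlencode


def make_sign(params, app_key):
    """Return an uppercase MD5 signature over the sorted, URL-encoded parameters."""
    # Leave out the signature field itself and empty values (assumption), sort by key.
    items = sorted(
        (k, v) for k, v in params.items() if k != "sign" and v != ""
    )
    # urlencode percent-encodes each value and joins the pairs with "&".
    raw = urlencode(items) + "&app_key=" + app_key
    return hashlib.md5(raw.encode("utf-8")).hexdigest().upper()


# Placeholder parameters; real requests need the fields the interface requires.
params = {"text": "hello world", "app_id": "10000", "time_stamp": 1500000000}
params["sign"] = make_sign(params, "my-app-key")
print(params["sign"])
```

Because the result is an opaque hash, it helps to print the intermediate raw string while debugging and compare it character by character with what the documentation describes.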

The sample code covers voice chat and the dress-up tool, plus a simplified version of the dress-up page (based on Django). To get the code address, reply “ai” to the public account (Crossin’s programming classroom).

If you run into questions you can’t figure out, about the sample code or anything else, feel free to join my Knowledge Planet group and discuss them there.

Face fusion – Crossin’s programming lab: you are welcome to try it and share it.

═ ═ ═ ═

You are welcome to search for and follow: Crossin’s programming classroom
