Recently I looked into AIUI, iFLYTEK’s intelligent voice service, and built a fairly simple demo (Android -> AIUI -> post-processing server) on top of the official demo to try out its features.

The official Android demo provides “voice dictation”, “grammar recognition”, “semantic understanding”, “speech synthesis”, “voiceprint password” and other functions. I mainly used “semantic understanding” and “speech synthesis”.

AIUI seems to be aiming for something similar to Amazon’s Alexa, but its functionality is currently limited and the development documentation is not very good. A large part of it describes how to operate the web console, while many key areas are not clearly explained. For features that developers care about a lot, such as “custom skills” and “custom entities”, the documentation only briefly describes how to add them, and “custom entities” is dismissed in a sentence. The official demo is also very simple. In particular, there is very little information about the “post-processing” function that third-party application developers need. The official example could hardly be simpler: it only shows AIUI forwarding the semantic analysis result to the third-party post-processing server, and it stops as soon as the server has received the request (I was speechless when I saw it). How to process the data once it is received, what format to return it in, and how the returned data is used are not mentioned at all.

Scenarios AIUI currently suits

Voice guidance: for example, the general-purpose “weather” open skill. A query does not need to be associated with any user information, and the keywords, such as “weather”, are predictable (these are the so-called “custom entities”).

What AIUI can’t do

Cases where the corpus cannot be set in advance, such as voice input of information tied to an account (for example, entering a user name in a consultation app).

Features I wanted to implement

  1. Voice guidance: on self-service terminals such as ATMs, users interact with the terminal by voice. For example, the terminal asks “Hello, what do you need?”, the user answers “take photos”, and the terminal enters the photo function.

AIUI can meet this requirement at present; how to implement it is covered later in the article.

  2. Voice information input: information such as the user’s name, age, and illness is entered by voice.

AIUI is not suitable for this scenario at present. Too many different words share the same pronunciation: the name pronounced “wangxiaoer”, for example, can be written in several different ways. Using voice input here costs more than it gains.

Custom skills

You need a custom skill when the open skills do not offer the feature you want: semantic understanding extracts the keywords, and a “post-processing” server then performs the actual operations.

For example, suppose the corpus “I shang” is set in a custom skill. When the user says “I shang”, AIUI triggers this custom skill, and the query in the result is “I shang”. After receiving this semantic result, the post-processing service can perform some operations and return data.

{
    "category": "ISHANG.mHealth_demo:11.0",
    "intentType": "custom",
    "query": "I shang",
    "query_ws": "I/NP// shang/ADD//",
    "rc": 0,
    "nlis": "true",
    "service": "ISHANG.mHealth_demo",
    "uuid": "atn000167dc@ch60f10d1d86686f2601",
    "vendor": "ISHANG",
    "version": "11.0",
    "semantic": [
        {"intent": "init", "score": 1, "slots": []}
    ],
    "sid": "atn000167dc@ch60f10d1d86686f2601",
    "text": "I shang"
}

Note the rc field: 0 means semantic understanding succeeded. When semantic understanding fails, rc is 4:

{
    "rc": 4,
    "uuid": "atn00016bf3@ch60f10d1d86f86f2601",
    "sid": "atn00016bf3@ch60f10d1d86f86f2601",
    "text": "大王"
}
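
As a rough illustration of how a post-processing service might branch on rc, here is a minimal Node.js sketch. The helper name summarizeResult is made up for this example, and treating every non-zero rc the same way as rc 4 is an assumption; only the field names (rc, semantic, intent, text) come from the samples above.

```javascript
// Hypothetical helper: reduce an AIUI semantic result to what the
// post-processing logic needs. Field names follow the samples above.
function summarizeResult(result) {
  if (result.rc !== 0) {
    // rc 4 is the "not understood" case shown above. Treating every
    // other non-zero rc the same way is an assumption of this sketch.
    return { understood: false, text: result.text };
  }
  const first = (result.semantic || [])[0] || {};
  return { understood: true, intent: first.intent, text: result.text };
}

const failed = summarizeResult({ rc: 4, text: "大王" });
const matched = summarizeResult({
  rc: 0,
  text: "I shang",
  semantic: [{ intent: "init", score: 1, slots: [] }],
});

console.log(failed.understood);  // false
console.log(matched.intent);     // init
```

Checking rc first keeps the rest of the handler free of missing-field checks, since a failed result carries no semantic array at all.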

Custom entities

In a corpus such as “I am {name}”, {name} is an entity (it can be thought of as a vocabulary list). The open entities include provinces, cities, song names, and so on. If none of them fits, define a custom entity; the matched value, for example “Zhang San”, then appears in the semantic slots of the result.

{
    "category": "ISHANG.mHealth_demo:11.0",
    "intentType": "custom",
    "query": "I am Zhang San",
    "query_ws": "I/NP// am/V_SHI// Zhang San/NPP//",
    "rc": 0,
    "nlis": "true",
    "service": "ISHANG.mHealth_demo",
    "uuid": "atn00016f72@ch0de90d1d87956f2a01",
    "vendor": "ISHANG",
    "version": "11.0",
    "semantic": [
        {
            "intent": "input_name",
            "score": 1,
            "slots": [
                {"begin": 2, "end": 4, "name": "name", "normValue": "Zhang San", "value": "Zhang San"}
            ]
        }
    ],
    "sid": "atn00016f72@ch0de90d1d87956f2a01",
    "text": "I am Zhang San"
}

Note that “Zhang San” had already been added to the custom entity. It appears in the JSON in the semantic[0].slots[0] field, which is the essential output of semantic understanding and is used as a business-logic parameter in post-processing.
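
To make the role of the slots concrete, the following Node.js sketch turns semantic[0].slots into a plain parameter object. The function name slotsToParams is hypothetical, and preferring normValue over value when both exist is an assumption, not something the AIUI documentation is quoted on here.

```javascript
// Hypothetical helper: collect the slots of the first semantic entry
// into a { slotName: slotValue } object for the business logic.
function slotsToParams(result) {
  const params = {};
  const first = (result.semantic || [])[0];
  if (!first) return params;
  for (const slot of first.slots || []) {
    // Assumption: use the normalised value when AIUI provides one.
    params[slot.name] = slot.normValue !== undefined ? slot.normValue : slot.value;
  }
  return params;
}

const nameResult = {
  rc: 0,
  semantic: [
    {
      intent: "input_name",
      score: 1,
      slots: [{ begin: 2, end: 4, name: "name", normValue: "Zhang San", value: "Zhang San" }],
    },
  ],
};

console.log(slotsToParams(nameResult).name); // Zhang San
```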

However, the corpus must be entered in advance, otherwise semantic understanding fails. In some cases, such as name entry, that means semantic understanding will fail unless the names of the entire Chinese population are turned into a corpus.

If the value has not been added to the entity, the following is returned. rc is 4, meaning the semantics were not understood:

{
    "rc": 4,
    "uuid": "atn000176af@ch0de90d1d88736f2a01",
    "sid": "atn000176af@ch0de90d1d88736f2a01",
    "text": "I am an example"
}

Post-processing

If post-processing is configured, the AIUI server forwards the semantic understanding result to the post-processing server with a POST request. In the function (or method) on the post-processing server that handles this request, we can extract the semantic understanding result, run queries or other operations, and then return a response.

POST request data

The key data is stored in the Msg.Content field. Note that SessionParams and Msg.Content are Base64-encoded and must be decoded before use. The complete request data after decoding is as follows:

{
    "MsgId": "cid6f1c2494@ch00270d1c09d20100101",
    "CreateTime": 1505803732,
    "AppId": "59bf6334",
    "UserId": "d3146084944",
    "SessionParams": {
        "dsrc": "sdk",
        "dts": "1",
        "dtype": "audio",
        "msc.lat": "39.895252",
        "msc.lng": "116.343834",
        "scene": "main",
        "scity": "ch",
        "sid": "cid6f1c2494@ch00270d1c09d2010010",
        "stmid": "audio-16",
        "ver_type": "mobile_phone",
        "wake_id": "15058037304161d1c41c87ab7cd3c"
    },
    "UserParams": "",
    "FromSub": "kc",
    "Msg": {
        "ContentType": "json",
        "Type": "text",
        "Content": {
            "intent": {
                "data": {
                    "result": [
                        {"airData": 40, "airQuality": "optimal", "city": "Beijing", "date": "2017-09-19", "dateLong": 1505750400,
                         "exp": {"ct": {"expName": "dressing index", "level": "hot", "prompt": "Hot weather, suggested short skirts, shorts, short thin coats, T-shirts and other summer clothing."}},
                         "humidity": "20%", "lastUpdateTime": "2017-09-19 11:39:20", "pm25": "13", "temp": 29, "tempRange": "14℃~30℃",
                         "weather": "clear", "weatherType": 0, "wind": "northwest wind 3-4", "windLevel": 1},
                        {"city": "Beijing", "date": "2017-09-20", "dateLong": 1505836800, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "14℃~27℃", "weather": "clear", "weatherType": 0, "wind": "south wind", "windLevel": 0},
                        {"city": "Beijing", "date": "2017-09-21", "dateLong": 1505923200, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "17℃~28℃", "weather": "cloudy", "weatherType": 1, "wind": "south wind", "windLevel": 0},
                        {"city": "Beijing", "date": "2017-09-22", "dateLong": 1506009600, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "15℃~28℃", "weather": "clear", "weatherType": 0, "wind": "northwest wind breeze", "windLevel": 0},
                        {"city": "Beijing", "date": "2017-09-23", "dateLong": 1506096000, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "18℃~29℃", "weather": "clear to overcast", "weatherType": 0, "wind": "south wind", "windLevel": 0},
                        {"city": "Beijing", "date": "2017-09-24", "dateLong": 1506182400, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "19℃~28℃", "weather": "overcast", "weatherType": 2, "wind": "east wind", "windLevel": 0},
                        {"city": "Beijing", "date": "2017-09-25", "dateLong": 1506268800, "lastUpdateTime": "2017-09-19 11:39:20",
                         "tempRange": "19℃~28℃", "weather": "cloudy to overcast", "weatherType": 1, "wind": "southeast wind breeze", "windLevel": 0}
                    ]
                },
                "rc": 0,
                "semantic": [
                    {
                        "intent": "QUERY",
                        "slots": [
                            {"name": "location.city", "value": "CURRENT_CITY", "normValue": "CURRENT_CITY"},
                            {"name": "location.poi", "value": "CURRENT_POI", "normValue": "CURRENT_POI"},
                            {"name": "location.type", "value": "LOC_POI", "normValue": "LOC_POI"},
                            {"name": "queryType", "value": "content"},
                            {"name": "subfocus", "value": "weather condition"}
                        ]
                    }
                ],
                "service": "weather",
                "text": "weather",
                "uuid": "atn00913a37@ch46b50d1c09d46f2a01",
                "used_state": {"state_key": "fg::weather::default::default", "state": "default"},
                "answer": {"text": "\"Beijing\" \"today\" \"clear\", \"14℃~30℃\", \"northwest wind 3-4\""},
                "dialog_stat": "DataValid",
                "sid": "cid6f1c2494@ch00270d1c09d2010010"
            }
        }
    }
}
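
The decoding step can be sketched in Node.js as follows. This assumes, based on the decoded sample above, that both SessionParams and Msg.Content arrive as Base64 strings whose decoded bytes are UTF-8 JSON text; the function names decodeBase64Json and decodeAiuiRequest, and the tiny sample request, are made up for illustration.

```javascript
// Decode a Base64 string and parse the result as JSON.
// Assumption: the decoded bytes are UTF-8 encoded JSON text.
function decodeBase64Json(encoded) {
  return JSON.parse(Buffer.from(encoded, "base64").toString("utf8"));
}

// Hypothetical helper: return a copy of the request body forwarded by
// AIUI with SessionParams and Msg.Content decoded into objects.
function decodeAiuiRequest(body) {
  return {
    ...body,
    SessionParams: decodeBase64Json(body.SessionParams),
    Msg: { ...body.Msg, Content: decodeBase64Json(body.Msg.Content) },
  };
}

// Round trip with a small made-up request, encoded the same way.
const toBase64 = (obj) => Buffer.from(JSON.stringify(obj), "utf8").toString("base64");
const rawRequest = {
  MsgId: "demo",
  SessionParams: toBase64({ dtype: "audio", scene: "main" }),
  Msg: {
    ContentType: "json",
    Type: "text",
    Content: toBase64({ intent: { rc: 0, service: "weather", text: "weather" } }),
  },
};

const decodedRequest = decodeAiuiRequest(rawRequest);
console.log(decodedRequest.SessionParams.dtype);       // audio
console.log(decodedRequest.Msg.Content.intent.service); // weather
```

Returning a decoded copy rather than mutating the incoming body keeps the raw request available for logging when a payload fails to parse.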

Return format

Follow the JSON returned by an open skill such as “weather”: put your result in the data or answer field of intent, and leave the other fields as they arrived in the POST. The data after semantic understanding of “weather” is as follows:

{
    "data": {
        "result": [
            {"airData": 44, "airQuality": "optimal", "city": "Beijing", "date": "2017-09-20", "dateLong": 1505836800,
             "exp": {"ct": {"expName": "dressing index", "level": "hot", "prompt": "Hot weather, recommended short skirts, shorts, thin jackets, T-shirts and other summer clothing"}},
             "humidity": "25%", "lastUpdateTime": "2017-09-20 11:07:03", "pm25": "10", "temp": 24, "tempRange": "14℃~27℃",
             "weather": "clear", "weatherType": 0, "wind": "north wind breeze", "windLevel": 0},
            {"city": "Beijing", "date": "2017-09-21", "dateLong": 1505923200, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "17℃~29℃", "weather": "cloudy", "weatherType": 1, "wind": "…", "windLevel": …},
            {"city": "Beijing", "date": "2017-09-22", "dateLong": 1506009600, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "13℃~27℃", "weather": "clear", "weatherType": 0, "wind": "northwest wind", "windLevel": …},
            {"city": "Beijing", "date": "2017-09-23", "dateLong": 1506096000, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "18℃~29℃", "weather": "clear to overcast", "weatherType": 0, "wind": "south wind", "windLevel": …},
            {"city": "Beijing", "date": "2017-09-24", "dateLong": 1506182400, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "19℃~28℃", "weather": "overcast", "weatherType": 2, "wind": "east breeze", "windLevel": 0},
            {"city": "Beijing", "date": "2017-09-25", "dateLong": 1506268800, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "19℃~28℃", "weather": "cloudy to overcast", "weatherType": 1, "wind": "southeast wind breeze", "windLevel": …},
            {"city": "Beijing", "date": "2017-09-26", "dateLong": 1506355200, "lastUpdateTime": "2017-09-20 11:07:03",
             "tempRange": "13℃~25℃", "weather": "clear", "weatherType": 0, "wind": "northwest wind 3-4", "windLevel": 1}
        ]
    },
    "rc": 0,
    "semantic": [
        {
            "intent": "QUERY",
            "slots": [
                {"name": "location.city", "value": "CURRENT_CITY", "normValue": "CURRENT_CITY"},
                {"name": "location.poi", "value": "CURRENT_POI", "normValue": "CURRENT_POI"},
                {"name": "location.type", "value": "LOC_POI", "normValue": "LOC_POI"},
                {"name": "queryType", "value": "content"},
                {"name": "subfocus", "value": "weather condition"}
            ]
        }
    ],
    "service": "weather",
    "text": "weather",
    "uuid": "atn00018593@ch60f10d1d8a556f2601",
    "used_state": {"state_key": "fg::weather::default::default", "state": "default"},
    "answer": {"text": "\"Beijing\" \"today\" \"clear\", \"14℃~27℃\", \"north wind breeze\""},
    "dialog_stat": "DataValid",
    "sid": "atn00018593@ch60f10d1d8a556f2601"
}

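One way to produce a response in this shape is sketched below in Node.js. It only mirrors the structure of the “weather” result: the received intent is copied and its answer.text (and optionally data) replaced. The function name buildReply is hypothetical, and whether AIUI requires any further envelope around this object is not covered by the material above, so treat that as an open assumption.

```javascript
// Hypothetical helper: copy the received intent object and overwrite
// answer.text (and, optionally, data) with the post-processing result.
// All other fields are passed through unchanged.
function buildReply(intent, answerText, data) {
  const reply = {
    ...intent,
    answer: { ...(intent.answer || {}), text: answerText },
  };
  if (data !== undefined) {
    reply.data = data;
  }
  return reply;
}

const receivedIntent = {
  rc: 0,
  service: "ISHANG.mHealth_demo",
  text: "I shang",
  semantic: [{ intent: "init", score: 1, slots: [] }],
  sid: "atn000167dc@ch60f10d1d86686f2601",
};

const reply = buildReply(receivedIntent, "Please state your name");
console.log(reply.answer.text); // Please state your name
console.log(reply.sid === receivedIntent.sid); // true
```
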
DEMO

Demo address. The demo includes the post-processing server and the Android app.

Post-processing server

Node.js implementation

  • The GET request handler mainly implements AIUI’s verification of the post-processing server
  • The POST request handler implements a very simple state machine: it combines the semantic result sent by AIUI with a code variable to decide what data to return. The return data format follows the open skill “weather”
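
The state machine idea in the second bullet can be sketched roughly like this. It is not the demo’s actual code: the state names, the input_age intent, the age slot, and the reply wording are all assumptions made for illustration; only the init and input_name intents appear in the configuration described in this post.

```javascript
// Minimal sketch of a dialog state machine driven by AIUI intents.
// "state" plays the role of the code variable mentioned above.
function createDialog() {
  let state = "idle";
  return function next(intent, slots) {
    if (intent === "init") {
      state = "awaitName";
      return "Welcome. Please state your name.";
    }
    if (state === "awaitName" && intent === "input_name") {
      state = "awaitAge";
      return `Your name is ${slots.name}. Please state your age.`;
    }
    // "input_age" and the "age" slot are hypothetical names.
    if (state === "awaitAge" && intent === "input_age") {
      state = "idle";
      return `Your age is ${slots.age}. Thank you, goodbye.`;
    }
    // Any intent that does not fit the current state is rejected
    // without changing the state.
    return "Sorry, I did not understand that.";
  };
}

const dialog = createDialog();
console.log(dialog("init", {}));
console.log(dialog("input_name", { name: "Zhang San" }));
console.log(dialog("input_age", { age: "28" }));
```

A single closure like this only works for one user at a time; a server handling several sessions would need one state per session, for example keyed by the sid field of the request.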

Android App

Built on top of the official demo. (Once you register your Android application on AIUI, you can download the project code from the application’s settings page. Unfortunately, it is an Eclipse project.)

  • In the semantic understanding demo, added voice playback of the result, based on the speech synthesis function
  • Say “I shang” and it replies “Welcome to use… Please state your name”; then say “Zhang San” and it replies “Your name is Zhang San, please state your age”; then say “28” and it replies “Your age is 28, thank you for using, goodbye”.

Configuration of this demo on AIUI

  • Application configuration
  • Custom skill init
  • Custom skill input_name
  • Custom entities