The original article is reprinted from Liu Yue’s technology blog: v3u.cn/a_id_178

We’re all familiar with the concept of a chatbot. Maybe you’ve flirted with Siri out of boredom, or bantered with Xiao Ai in your spare time; either way, we have to admit that artificial intelligence has worked its way into our lives. There are plenty of bots on the market that expose third-party APIs: Microsoft Xiaoice, Turing Robot, Tencent Chat, Qingyunke and so on, and we can call them from an app or a web application whenever we like. But how do these applications work underneath? Without Internet access, could we talk to a human using nothing but a locally stored “mind”, as depicted in the TV series Westworld? If you’re not content to be a mere API switchboard operator, follow along: this time we’ll use the deep learning libraries Keras/TensorFlow to build our own local chatbot, independent of any third-party interface and of the network.

Install the dependencies first:

pip3 install Tensorflow  
pip3 install Keras  
pip3 install nltk

Then create the script test_bot.py and import the required libraries:

import nltk  
import ssl  
from nltk.stem.lancaster import LancasterStemmer  
stemmer = LancasterStemmer()  
  
import numpy as np  
from keras.models import Sequential  
from keras.layers import Dense, Activation, Dropout  
from keras.optimizers import SGD  
import pandas as pd  
import pickle  
import random

The first time NLTK, the natural language toolkit, is used it will report an error:



Resource punkt not found

Normally, a single line of downloader code fixes it:

import nltk  
nltk.download('punkt')

However, because of network restrictions it is hard to download the resource through the Python downloader, so we take a detour and fetch the archive manually:

https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/tokenizers/punkt.zip

Unzip it and place it in your user directory:

C:\Users\liuyue\nltk_data\tokenizers\punkt
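A quick way to confirm the manual install worked (a minimal sketch; it assumes punkt now sits somewhere on NLTK’s search path, such as the nltk_data directory above, and the sample sentence is only an illustration — nltk.data.find raises a LookupError if punkt still cannot be located):

import nltk

# raises LookupError if the punkt tokenizer data is still missing
nltk.data.find('tokenizers/punkt')
# if the line above passed, tokenization should just work
print(nltk.word_tokenize("hello, anyone there?"))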

OK, enough preamble. One of the main challenges in building a chatbot is classifying user input, that is, recognizing the correct human intent (classic machine learning could handle this, but it is fiddly, so I took the lazy route and used the Keras deep learning library instead). The second challenge is maintaining context, i.e. analyzing and tracking the state of the conversation; within a given context we usually do not need to re-classify the intent, we can simply treat the user’s input as the answer to the chatbot’s question. Here we use Keras to build the classification model.

The intents and patterns the chatbot will learn are defined in a simple variable; there is no need for a terabyte-scale corpus. We know people will laugh at a bot that has no corpus, but our goal is a specific chatbot for a specific context, so the classification model is built on a small vocabulary and will only be able to recognize the small set of patterns it is given for training.

In other words, so-called machine learning means teaching the machine to repeat one or a few things correctly: during training you demonstrate what doing it right looks like, and you hope the machine can extrapolate from that later. This time we teach it only one thing, just to test its responses. Isn’t that a bit like training your dog at home? Except a dog can’t talk back.

Here’s a quick example of an intent data variable, which you can expand indefinitely with a corpus if you like:

intents = {"intents": [
    {"tag": "hello",
     "patterns": ["hello", "hi", "excuse me", "anyone there", "master", "sorry to bother you", "beauty", "handsome boy", "pretty girl"],
     "responses": ["hello", "oh, it's you", "have you eaten yet", "is there anything I can do for you"],
     "context": ""},
    {"tag": "goodbye",
     "patterns": ["goodbye", "bye", "88", "bye-bye", "see you later"],
     "responses": ["goodbye", "bon voyage", "see you next time", "take care"],
     "context": ""},
]}

As you can see, I defined two intent tags, hello and goodbye, each containing user input patterns and the machine’s candidate responses.

Before we can start training the classification model, we need to build a vocabulary. The patterns are tokenized to create it, and each word is stemmed down to a common root, which helps match more combinations of user input.
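As a small illustration of what the stemmer does (the English sample words below are chosen purely for demonstration; the real patterns above go through exactly the same call):

from nltk.stem.lancaster import LancasterStemmer

stemmer = LancasterStemmer()
# print each sample word next to its stemmed root
for word in ["Hello", "Goodbye", "Handsome"]:
    print(word, "->", stemmer.stem(word.lower()))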

words = []
classes = []
documents = []
ignore_words = ['?']

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern)
        # add to our words list
        words.extend(w)
        # add to documents in our corpus
        documents.append((w, intent['tag']))
        # add to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# stem and lower each word, drop ignored tokens and duplicates
words = [stemmer.stem(w.lower()) for w in words if w not in ignore_words]
words = sorted(list(set(words)))
classes = sorted(list(set(classes)))

print(len(classes), "contexts", classes)
print(len(words), "words", words)

Output:

2 contexts ['goodbye', 'hello']
14 words ['88', 'anyone there', 'beauty', 'bye', 'bye-bye', 'excuse me', 'goodbye', 'handsome boy', 'hello', 'hi', 'master', 'pretty girl', 'see you later', 'sorry to bother you']

Training is not performed on the words themselves, because words mean nothing to the machine (this is the same trap many Chinese word-segmentation libraries fall into: the machine genuinely does not care whether you type English or Chinese). Instead, each sentence is converted into an array whose length equals the vocabulary size, with a position set to 1 when the corresponding vocabulary word appears in the current pattern.
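To make that concrete, here is a toy example with a made-up five-word vocabulary (not the real one built above):

# hypothetical vocabulary and one tokenized input sentence
vocab = ['88', 'bye', 'goodbye', 'hello', 'hi']
sentence_tokens = ['hello']

# 1 where the vocabulary word occurs in the sentence, 0 elsewhere
bag = [1 if w in sentence_tokens else 0 for w in vocab]
print(bag)  # [0, 0, 0, 1, 0]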

# create our training data
training = []
# create an empty array for our output
output_empty = [0] * len(classes)

# training set: a bag of words for each sentence
for doc in documents:
    # initialize our bag of words
    bag = []
    # list of tokenized words for the pattern, stemmed
    pattern_words = doc[0]
    pattern_words = [stemmer.stem(word.lower()) for word in pattern_words]
    # fill the bag: 1 if the vocabulary word occurs in the pattern
    for w in words:
        bag.append(1) if w in pattern_words else bag.append(0)

    # output is '0' for each tag and '1' for the current tag
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1

    training.append([bag, output_row])

# shuffle the features and turn them into an np.array
random.shuffle(training)
training = np.array(training)

train_x = list(training[:, 0])
train_y = list(training[:, 1])

Now we can start training. The model is built with Keras as a simple three-layer network. Because the data set is tiny, the classification output is a multi-class array, which helps identify the encoded intent. The output layer uses softmax activation to produce the multi-class result (the prediction comes back as a 0/1-style array such as [1, 0, 0, ..., 0] that identifies the encoded intent).

model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.fit(np.array(train_x), np.array(train_y), epochs=200, batch_size=5, verbose=1)

Training runs for 200 epochs with a batch size of 5; since my sample data is tiny, 100 epochs would work just as well, but that is not the point.

Start training:

Epoch 1/200  
14/14 [==============================] - 0s 32ms/step - loss: 0.7305 - acc: 0.5000  
Epoch 2/200  
14/14 [==============================] - 0s 391us/step - loss: 0.7458 - acc: 0.4286  
Epoch 3/200  
14/14 [==============================] - 0s 390us/step - loss: 0.7086 - acc: 0.3571  
Epoch 4/200  
14/14 [==============================] - 0s 395us/step - loss: 0.6941 - acc: 0.6429  
Epoch 5/200  
14/14 [==============================] - 0s 426us/step - loss: 0.6358 - acc: 0.7143  
Epoch 6/200  
14/14 [==============================] - 0s 356us/step - loss: 0.6287 - acc: 0.5714  
Epoch 7/200  
14/14 [==============================] - 0s 366us/step - loss: 0.6457 - acc: 0.6429  
Epoch 8/200  
14/14 [==============================] - 0s 899us/step - loss: 0.6336 - acc: 0.6429  
Epoch 9/200  
14/14 [==============================] - 0s 464us/step - loss: 0.5815 - acc: 0.6429  
Epoch 10/200  
14/14 [==============================] - 0s 408us/step - loss: 0.5895 - acc: 0.6429  
Epoch 11/200  
14/14 [==============================] - 0s 548us/step - loss: 0.6050 - acc: 0.6429  
Epoch 12/200  
14/14 [==============================] - 0s 468us/step - loss: 0.6254 - acc: 0.6429  
Epoch 13/200  
14/14 [==============================] - 0s 388us/step - loss: 0.4990 - acc: 0.7857  
Epoch 14/200  
14/14 [==============================] - 0s 392us/step - loss: 0.5880 - acc: 0.7143  
Epoch 15/200  
14/14 [==============================] - 0s 370us/step - loss: 0.5118 - acc: 0.8571  
Epoch 16/200  
14/14 [==============================] - 0s 457us/step - loss: 0.5579 - acc: 0.7143  
Epoch 17/200  
14/14 [==============================] - 0s 432us/step - loss: 0.4535 - acc: 0.7857  
Epoch 18/200  
14/14 [==============================] - 0s 357us/step - loss: 0.4367 - acc: 0.7857  
Epoch 19/200  
14/14 [==============================] - 0s 384us/step - loss: 0.4751 - acc: 0.7857  
Epoch 20/200  
14/14 [==============================] - 0s 346us/step - loss: 0.4404 - acc: 0.9286  
Epoch 21/200  
14/14 [==============================] - 0s 500us/step - loss: 0.4325 - acc: 0.8571  
Epoch 22/200  
14/14 [==============================] - 0s 400us/step - loss: 0.4104 - acc: 0.9286  
Epoch 23/200  
14/14 [==============================] - 0s 738us/step - loss: 0.4296 - acc: 0.7857  
Epoch 24/200  
14/14 [==============================] - 0s 387us/step - loss: 0.3706 - acc: 0.9286  
Epoch 25/200  
14/14 [==============================] - 0s 430us/step - loss: 0.4213 - acc: 0.8571  
Epoch 26/200  
14/14 [==============================] - 0s 351us/step - loss: 0.2867 - acc: 1.0000  
Epoch 27/200  
14/14 [==============================] - 0s 3ms/step - loss: 0.2903 - acc: 1.0000  
Epoch 28/200  
14/14 [==============================] - 0s 366us/step - loss: 0.3010 - acc: 0.9286  
Epoch 29/200  
14/14 [==============================] - 0s 404us/step - loss: 0.2466 - acc: 0.9286  
Epoch 30/200  
14/14 [==============================] - 0s 428us/step - loss: 0.3035 - acc: 0.7857  
Epoch 31/200  
14/14 [==============================] - 0s 407us/step - loss: 0.2075 - acc: 1.0000  
Epoch 32/200  
14/14 [==============================] - 0s 457us/step - loss: 0.2167 - acc: 0.9286  
Epoch 33/200  
14/14 [==============================] - 0s 613us/step - loss: 0.1266 - acc: 1.0000  
Epoch 34/200  
14/14 [==============================] - 0s 534us/step - loss: 0.2906 - acc: 0.9286  
Epoch 35/200  
14/14 [==============================] - 0s 463us/step - loss: 0.2560 - acc: 0.9286  
Epoch 36/200  
14/14 [==============================] - 0s 500us/step - loss: 0.1686 - acc: 1.0000  
Epoch 37/200  
14/14 [==============================] - 0s 387us/step - loss: 0.0922 - acc: 1.0000  
Epoch 38/200  
14/14 [==============================] - 0s 430us/step - loss: 0.1620 - acc: 1.0000  
Epoch 39/200  
14/14 [==============================] - 0s 371us/step - loss: 0.1104 - acc: 1.0000  
Epoch 40/200  
14/14 [==============================] - 0s 488us/step - loss: 0.1330 - acc: 1.0000  
Epoch 41/200  
14/14 [==============================] - 0s 381us/step - loss: 0.1322 - acc: 1.0000  
Epoch 42/200  
14/14 [==============================] - 0s 462us/step - loss: 0.0575 - acc: 1.0000  
Epoch 43/200  
14/14 [==============================] - 0s 1ms/step - loss: 0.1137 - acc: 1.0000  
Epoch 44/200  
14/14 [==============================] - 0s 450us/step - loss: 0.0245 - acc: 1.0000  
Epoch 45/200  
14/14 [==============================] - 0s 470us/step - loss: 0.1824 - acc: 1.0000  
Epoch 46/200  
14/14 [==============================] - 0s 444us/step - loss: 0.0822 - acc: 1.0000  
Epoch 47/200  
14/14 [==============================] - 0s 436us/step - loss: 0.0939 - acc: 1.0000  
Epoch 48/200  
14/14 [==============================] - 0s 396us/step - loss: 0.0288 - acc: 1.0000  
Epoch 49/200  
14/14 [==============================] - 0s 580us/step - loss: 0.1367 - acc: 0.9286  
Epoch 50/200  
14/14 [==============================] - 0s 351us/step - loss: 0.0363 - acc: 1.0000  
Epoch 51/200  
14/14 [==============================] - 0s 379us/step - loss: 0.0272 - acc: 1.0000  
Epoch 52/200  
14/14 [==============================] - 0s 358us/step - loss: 0.0712 - acc: 1.0000  
Epoch 53/200  
14/14 [==============================] - 0s 4ms/step - loss: 0.0426 - acc: 1.0000  
Epoch 54/200  
14/14 [==============================] - 0s 370us/step - loss: 0.0430 - acc: 1.0000  
Epoch 55/200  
14/14 [==============================] - 0s 368us/step - loss: 0.0292 - acc: 1.0000  
Epoch 56/200  
14/14 [==============================] - 0s 494us/step - loss: 0.0777 - acc: 1.0000  
Epoch 57/200  
14/14 [==============================] - 0s 356us/step - loss: 0.0496 - acc: 1.0000  
Epoch 58/200  
14/14 [==============================] - 0s 427us/step - loss: 0.1485 - acc: 1.0000  
Epoch 59/200  
14/14 [==============================] - 0s 381us/step - loss: 0.1006 - acc: 1.0000  
Epoch 60/200  
14/14 [==============================] - 0s 421us/step - loss: 0.0183 - acc: 1.0000  
Epoch 61/200  
14/14 [==============================] - 0s 344us/step - loss: 0.0788 - acc: 0.9286  
Epoch 62/200  
14/14 [==============================] - 0s 529us/step - loss: 0.0176 - acc: 1.0000

OK, after 200 epochs the model is trained. Next, declare the helpers that turn a sentence into a bag-of-words vector:

def clean_up_sentence(sentence):  
    # tokenize the pattern - split words into array  
    sentence_words = nltk.word_tokenize(sentence)  
    # stem each word - create short form for word  
    sentence_words = [stemmer.stem(word.lower()) for word in sentence_words]  
    return sentence_words

def bow(sentence, words, show_details=True):  
    # tokenize the pattern  
    sentence_words = clean_up_sentence(sentence)  
    # bag of words - matrix of N words, vocabulary matrix  
    bag = [0]*len(words)    
    for s in sentence_words:  
        for i,w in enumerate(words):  
            if w == s:   
                # assign 1 if current word is in the vocabulary position  
                bag[i] = 1  
                if show_details:  
                    print ("found in bag: %s" % w)  
    return(np.array(bag))

Test whether an input hits the bag of words:

p = bow("hello", words)
print(p)

The return value:

found in bag: hello
[0 0 1 0 0 0 0 0 0 0 0 0 0 0]

It’s a clear match. Word in the bag.

Before packaging the model, we can use model.predict to classify user input and return the intent based on the computed probability (several intents can be returned, sorted by probability in descending order):

def classify_local(sentence):
    ERROR_THRESHOLD = 0.25

    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], str(r[1])))
    # return tuple of intent and probability
    return return_list

Test it out:

print(classify_local('hello'))

The return value:

[('hello', '0.999913')]

Test it again:

print(classify_local('88'))

The return value:

found in bag: 88
[('goodbye', '0.9995449')]

Perfect, this time it matches the farewell tag. If you want, you can test a few more phrases to refine the model; a quick way to do that is sketched below.
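For instance, a loop over all the training patterns (a minimal sketch, assuming the intents dict and classify_local from above are still in scope) shows whether each pattern lands on the tag it was defined under:

for intent in intents['intents']:
    for pattern in intent['patterns']:
        result = classify_local(pattern)
        # take the highest-probability tag, if anything cleared the threshold
        top_tag = result[0][0] if result else None
        print(pattern, '->', top_tag, '(expected:', intent['tag'] + ')')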

After the test is complete, we can package the trained model so that no training is required before each call:

json_file = model.to_json()  
with open('v3ucn.json', "w") as file:  
   file.write(json_file)  
  
model.save_weights('./v3ucn.h5f')

The model is now split into an architecture file (JSON) and a weights file (H5); keep both, we will use them shortly.

Next we build the chatbot API with the popular FastAPI framework. Put the model files into the project directory and write main.py:

import random

import uvicorn
from fastapi import FastAPI

app = FastAPI()


def classify_local(sentence):
    ERROR_THRESHOLD = 0.25

    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], str(r[1])))
    # return tuple of intent and probability
    return return_list


@app.get('/')
async def root(word: str = None):
    from keras.models import model_from_json
    # load json and create model
    file = open("./v3ucn.json", 'r')
    model_json = file.read()
    file.close()
    model = model_from_json(model_json)
    model.load_weights("./v3ucn.h5f")

    wordlist = classify_local(word)
    a = ""
    for intent in intents['intents']:
        if intent['tag'] == wordlist[0][0]:
            a = random.choice(intent['responses'])

    return {'message': a}


if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
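Note that classify_local also expects words, classes, the intents dict, and the bow helper (with its stemmer) to be available in main.py; the training script keeps them in memory, so they have to be carried over. A minimal sketch of one way to do that, assuming we pickle the vocabulary during training (the file name chatbot_data.pkl is just an example, not part of the original project):

# at the end of test_bot.py, after training
import pickle

with open('chatbot_data.pkl', 'wb') as f:
    pickle.dump({'words': words, 'classes': classes, 'intents': intents}, f)

# near the top of main.py, at module level
import pickle

with open('chatbot_data.pkl', 'rb') as f:
    data = pickle.load(f)
words, classes, intents = data['words'], data['classes'], data['intents']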

Here:

from keras.models import model_from_json  
file = open("./v3ucn.json", 'r')  
model_json = file.read()  
file.close()  
model = model_from_json(model_json)  
model.load_weights("./v3ucn.h5f")

This loads the trained model. Now start the service:

uvicorn main:app --reload

It works like this:
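For example, hitting the endpoint from Python (a sketch that assumes the default 127.0.0.1:8000 from main.py and that the requests package is installed):

import requests

# ask the bot a question via the query parameter defined in main.py
resp = requests.get("http://127.0.0.1:8000/", params={"word": "hello"})
print(resp.json())  # something like {'message': 'have you eaten yet'}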

Conclusion: there is no doubt that technology changes life. A chatbot lets us enjoy a little birdsong-like banter even without beautiful company, and I believe that in the near future a smiling, lovely-haired “mechanical girl” will keep me company beneath the wind and the moon.

The original article is reprinted from Liu Yue’s technology blog: v3u.cn/a_id_178