This article is reprinted from Liu Yue's technology blog: v3u.cn/a_id_178
We are all familiar with the idea of a chatbot. Perhaps you have bantered with Siri out of boredom, or joked around with Xiao Ai in your spare time. Either way, we have to admit that artificial intelligence has worked its way into our lives. There are plenty of bots on the market that expose third-party APIs: Microsoft Xiaoice, Turing Robot, Tencent Chat, Qingyunke, and so on. Whenever we want, we can wire them into an app or a web application. But how do these services work under the hood? Without Internet access, could we talk to a human using nothing but a locally stored "mind", as depicted in the TV series Westworld? If you are not content to be a mere API forwarder, follow along: this time we will use the deep learning libraries Keras and TensorFlow to build our own local chatbot, independent of any third-party interface and of the network.
Install the dependencies first:
pip3 install tensorflow
pip3 install keras
pip3 install nltk
Then create the script test_bot.py and import the required libraries:
import nltk
import ssl
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD
import pandas as pd
import pickle
import random
On first use, NLTK, the natural language processing library, will report an error:
Resource punkt not found
Normally, you would just add a line of downloader code:
import nltk
nltk.download('punkt')
However, because of network restrictions it can be hard to download through the Python downloader, so we take a detour and fetch the compressed package manually:
https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/tokenizers/punkt.zip
Unzip it and place it in your user directory:
C:\Users\liuyue\nltk_data\tokenizers\punkt
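If you want to confirm that NLTK can actually see the manually installed package, a quick check through nltk.data will do (a minimal sketch; the path below is just an example, substitute wherever your nltk_data folder lives):

import nltk

# if the data lives somewhere non-standard, tell NLTK where to look
# (example path; replace with your own unzip location)
nltk.data.path.append(r"C:\Users\liuyue\nltk_data")

# raises LookupError if the punkt tokenizer still cannot be found
print(nltk.data.find("tokenizers/punkt"))

# quick smoke test of the tokenizer itself
print(nltk.word_tokenize("hello there, anyone home?"))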
OK, without further ado: one of the main challenges in building a chatbot is classifying user input, that is, recognizing the correct human intent (this could be done with classical machine learning, but that gets fiddly, so I got lazy and reached for deep learning with Keras). The second challenge is maintaining context, that is, analyzing and tracking the state of the conversation. In this simple case we do not need to classify user intent in general; we only need to treat the user's input as the answer to the chatbot's question. Here we use the Keras deep learning library to build the classification model.
The intents and patterns the chatbot will learn are defined in a plain variable; there is no need for a terabyte-scale corpus. We know people will laugh at a bot that has no corpus, but our goal is to build a specific chatbot for a specific context, so the classification model is created over a small vocabulary and will only be able to recognize the small set of patterns it is trained on.
In other words, so-called machine learning is teaching the machine to do one thing, or a few things, correctly: during training you demonstrate what is right, and the hope is that the machine can generalize from it. This time we are not teaching it much, just one thing, to test how it responds. Isn't that a bit like training a dog at home? Except that dogs can't talk back.
Here is a quick example of the intent data variable; you can expand it with a corpus as much as you like:
intents = {"intents": [
    {"tag": "hello",
     "patterns": ["hello", "hi", "excuse me", "anyone there", "teacher", "sorry", "beauty", "handsome boy", "pretty girl", "hey"],
     "responses": ["hello", "oh it's you", "have you eaten yet", "what can I do for you"],
     "context": ""},
    {"tag": "goodbye",
     "patterns": ["goodbye", "bye", "88", "bye-bye", "see you later"],
     "responses": ["goodbye", "bon voyage", "see you next time", "bye now"],
     "context": ""},
]}
As you can see, I defined two context tags, hello and goodbye, each with user input patterns and machine responses.
Before training the classification model, we need to build a vocabulary. The patterns are processed to create it: each word is stemmed to a common root, which helps match more combinations of user input.
words = []
classes = []
documents = []
ignore_words = ['?']

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern)
        # add to our words list
        words.extend(w)
        # add to documents in our corpus
        documents.append((w, intent['tag']))
        # add to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# stem, lowercase, and deduplicate
words = [stemmer.stem(w.lower()) for w in words if w not in ignore_words]
words = sorted(list(set(words)))
classes = sorted(list(set(classes)))

print(len(classes), "contexts", classes)
print(len(words), "words", words)
Output:
2 contexts ['goodbye', 'hello']
14 words ['88', 'sorry', 'hello', 'goodbye', 'see you later', 'bye-bye', 'handsome boy', 'teacher', 'hi', 'bye', 'anyone there', 'beauty', 'excuse me', 'hey']
Training will not be performed on the raw words, because words by themselves mean nothing to the machine (this is the same trap many Chinese word segmentation libraries fall into; the machine does not care whether you typed English or Chinese). Instead, each sentence is turned into an array whose length equals the vocabulary size, with a 1 at every position whose word appears in the current pattern.
# create our training data
training = []
# create an empty array for our output
output_empty = [0] * len(classes)
# training set: a bag of words for each sentence
for doc in documents:
    # initialize our bag of words
    bag = []
    pattern_words = doc[0]
    pattern_words = [stemmer.stem(word.lower()) for word in pattern_words]
    for w in words:
        bag.append(1 if w in pattern_words else 0)
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1
    training.append([bag, output_row])

random.shuffle(training)
training = np.array(training)
train_x = list(training[:, 0])
train_y = list(training[:, 1])
Now we can train. The model is built with Keras on three dense layers. Because the data set is small, the classification output is a multi-class array, which helps identify the encoded intent. The softmax activation produces the multi-class output (the result is a 0/1-style array such as [1, 0, 0, ..., 0] that identifies the encoded intent).
model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.fit(np.array(train_x), np.array(train_y), epochs=200, batch_size=5, verbose=1)
Training runs for 200 epochs with a batch size of 5. Since my sample data is tiny, 100 epochs would do just as well; that is not the point.
Start training:
Epoch 1/200
14/14 [==============================] - 0s 32ms/step - loss: 0.7305 - acc: 0.5000
Epoch 2/200
14/14 [==============================] - 0s 391us/step - loss: 0.7458 - acc: 0.4286
Epoch 3/200
14/14 [==============================] - 0s 390us/step - loss: 0.7086 - acc: 0.3571
Epoch 4/200
14/14 [==============================] - 0s 395us/step - loss: 0.6941 - acc: 0.6429
Epoch 5/200
14/14 [==============================] - 0s 426us/step - loss: 0.6358 - acc: 0.7143
Epoch 6/200
14/14 [==============================] - 0s 356us/step - loss: 0.6287 - acc: 0.5714
Epoch 7/200
14/14 [==============================] - 0s 366us/step - loss: 0.6457 - acc: 0.6429
Epoch 8/200
14/14 [==============================] - 0s 899us/step - loss: 0.6336 - acc: 0.6429
Epoch 9/200
14/14 [==============================] - 0s 464us/step - loss: 0.5815 - acc: 0.6429
Epoch 10/200
14/14 [==============================] - 0s 408us/step - loss: 0.5895 - acc: 0.6429
Epoch 11/200
14/14 [==============================] - 0s 548us/step - loss: 0.6050 - acc: 0.6429
Epoch 12/200
14/14 [==============================] - 0s 468us/step - loss: 0.6254 - acc: 0.6429
Epoch 13/200
14/14 [==============================] - 0s 388us/step - loss: 0.4990 - acc: 0.7857
Epoch 14/200
14/14 [==============================] - 0s 392us/step - loss: 0.5880 - acc: 0.7143
Epoch 15/200
14/14 [==============================] - 0s 370us/step - loss: 0.5118 - acc: 0.8571
Epoch 16/200
14/14 [==============================] - 0s 457us/step - loss: 0.5579 - acc: 0.7143
Epoch 17/200
14/14 [==============================] - 0s 432us/step - loss: 0.4535 - acc: 0.7857
Epoch 18/200
14/14 [==============================] - 0s 357us/step - loss: 0.4367 - acc: 0.7857
Epoch 19/200
14/14 [==============================] - 0s 384us/step - loss: 0.4751 - acc: 0.7857
Epoch 20/200
14/14 [==============================] - 0s 346us/step - loss: 0.4404 - acc: 0.9286
Epoch 21/200
14/14 [==============================] - 0s 500us/step - loss: 0.4325 - acc: 0.8571
Epoch 22/200
14/14 [==============================] - 0s 400us/step - loss: 0.4104 - acc: 0.9286
Epoch 23/200
14/14 [==============================] - 0s 738us/step - loss: 0.4296 - acc: 0.7857
Epoch 24/200
14/14 [==============================] - 0s 387us/step - loss: 0.3706 - acc: 0.9286
Epoch 25/200
14/14 [==============================] - 0s 430us/step - loss: 0.4213 - acc: 0.8571
Epoch 26/200
14/14 [==============================] - 0s 351us/step - loss: 0.2867 - acc: 1.0000
Epoch 27/200
14/14 [==============================] - 0s 3ms/step - loss: 0.2903 - acc: 1.0000
Epoch 28/200
14/14 [==============================] - 0s 366us/step - loss: 0.3010 - acc: 0.9286
Epoch 29/200
14/14 [==============================] - 0s 404us/step - loss: 0.2466 - acc: 0.9286
Epoch 30/200
14/14 [==============================] - 0s 428us/step - loss: 0.3035 - acc: 0.7857
Epoch 31/200
14/14 [==============================] - 0s 407us/step - loss: 0.2075 - acc: 1.0000
Epoch 32/200
14/14 [==============================] - 0s 457us/step - loss: 0.2167 - acc: 0.9286
Epoch 33/200
14/14 [==============================] - 0s 613us/step - loss: 0.1266 - acc: 1.0000
Epoch 34/200
14/14 [==============================] - 0s 534us/step - loss: 0.2906 - acc: 0.9286
Epoch 35/200
14/14 [==============================] - 0s 463us/step - loss: 0.2560 - acc: 0.9286
Epoch 36/200
14/14 [==============================] - 0s 500us/step - loss: 0.1686 - acc: 1.0000
Epoch 37/200
14/14 [==============================] - 0s 387us/step - loss: 0.0922 - acc: 1.0000
Epoch 38/200
14/14 [==============================] - 0s 430us/step - loss: 0.1620 - acc: 1.0000
Epoch 39/200
14/14 [==============================] - 0s 371us/step - loss: 0.1104 - acc: 1.0000
Epoch 40/200
14/14 [==============================] - 0s 488us/step - loss: 0.1330 - acc: 1.0000
Epoch 41/200
14/14 [==============================] - 0s 381us/step - loss: 0.1322 - acc: 1.0000
Epoch 42/200
14/14 [==============================] - 0s 462us/step - loss: 0.0575 - acc: 1.0000
Epoch 43/200
14/14 [==============================] - 0s 1ms/step - loss: 0.1137 - acc: 1.0000
Epoch 44/200
14/14 [==============================] - 0s 450us/step - loss: 0.0245 - acc: 1.0000
Epoch 45/200
14/14 [==============================] - 0s 470us/step - loss: 0.1824 - acc: 1.0000
Epoch 46/200
14/14 [==============================] - 0s 444us/step - loss: 0.0822 - acc: 1.0000
Epoch 47/200
14/14 [==============================] - 0s 436us/step - loss: 0.0939 - acc: 1.0000
Epoch 48/200
14/14 [==============================] - 0s 396us/step - loss: 0.0288 - acc: 1.0000
Epoch 49/200
14/14 [==============================] - 0s 580us/step - loss: 0.1367 - acc: 0.9286
Epoch 50/200
14/14 [==============================] - 0s 351us/step - loss: 0.0363 - acc: 1.0000
Epoch 51/200
14/14 [==============================] - 0s 379us/step - loss: 0.0272 - acc: 1.0000
Epoch 52/200
14/14 [==============================] - 0s 358us/step - loss: 0.0712 - acc: 1.0000
Epoch 53/200
14/14 [==============================] - 0s 4ms/step - loss: 0.0426 - acc: 1.0000
Epoch 54/200
14/14 [==============================] - 0s 370us/step - loss: 0.0430 - acc: 1.0000
Epoch 55/200
14/14 [==============================] - 0s 368us/step - loss: 0.0292 - acc: 1.0000
Epoch 56/200
14/14 [==============================] - 0s 494us/step - loss: 0.0777 - acc: 1.0000
Epoch 57/200
14/14 [==============================] - 0s 356us/step - loss: 0.0496 - acc: 1.0000
Epoch 58/200
14/14 [==============================] - 0s 427us/step - loss: 0.1485 - acc: 1.0000
Epoch 59/200
14/14 [==============================] - 0s 381us/step - loss: 0.1006 - acc: 1.0000
Epoch 60/200
14/14 [==============================] - 0s 421us/step - loss: 0.0183 - acc: 1.0000
Epoch 61/200
14/14 [==============================] - 0s 344us/step - loss: 0.0788 - acc: 0.9286
Epoch 62/200
14/14 [==============================] - 0s 529us/step - loss: 0.0176 - acc: 1.0000
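Training converges quickly on such a tiny data set. If you want a quick sanity check before moving on, you can evaluate the fit on the training data itself (a minimal sketch; with only a handful of samples there is no separate test set, so this merely confirms the model has memorized the patterns):

# evaluate on the training data; returns [loss, accuracy] because we compiled with metrics=['accuracy']
loss, acc = model.evaluate(np.array(train_x), np.array(train_y), verbose=0)
print("training loss: %.4f, training accuracy: %.4f" % (loss, acc))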
OK, 200 epochs later the model is trained. Now declare helper functions for the bag-of-words conversion:
def clean_up_sentence(sentence):
    # tokenize the pattern - split words into array
    sentence_words = nltk.word_tokenize(sentence)
    # stem each word - create short form for word
    sentence_words = [stemmer.stem(word.lower()) for word in sentence_words]
    return sentence_words

def bow(sentence, words, show_details=True):
    # tokenize the pattern
    sentence_words = clean_up_sentence(sentence)
    # bag of words - matrix of N words, vocabulary matrix
    bag = [0] * len(words)
    for s in sentence_words:
        for i, w in enumerate(words):
            if w == s:
                # assign 1 if current word is in the vocabulary position
                bag[i] = 1
                if show_details:
                    print("found in bag: %s" % w)
    return np.array(bag)
Test whether an input hits the bag of words:
p = bow("hello", words)
print(p)
The return value:
found in bag: hello
[0 0 1 0 0 0 0 0 0 0 0 0 0 0]
A clear match: the word is in the bag.
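If you are curious which vocabulary entry that 1 corresponds to, you can zip the bag back onto the word list (a one-line sketch using the p array from the test above):

# map the 1s in the bag back to the vocabulary entries they mark
print([w for w, flag in zip(words, p) if flag == 1])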
Before packaging the model, we can use the model.predict function to classify user input and return the intent based on the computed probability (multiple intents may be returned, sorted in descending order of probability):
def classify_local(sentence):
    ERROR_THRESHOLD = 0.25
    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        # return tuple of intent and probability
        return_list.append((classes[r[0]], str(r[1])))
    return return_list
Test it out:
print(classify_local('hello'))
The return value:
found in bag: hello
[('hello', '0.999913')]
Test again:
print(classify_local('88'))
The return value:
found in bag: 88
[('goodbye', '0.9995449')]
Perfect: it matches the goodbye context tag. If you like, test a few more phrases to refine the model, as in the quick check below.
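For example, a quick loop over a few phrases (a small sketch; the phrases are drawn from the training patterns plus one unseen input, and the exact probabilities will vary from run to run):

# spot check: trained phrases plus one phrase the model has never seen
# (words absent from the vocabulary contribute nothing to the bag,
#  so results for unknown inputs are essentially arbitrary)
for s in ["hi", "bye-bye", "see you later", "what time is it"]:
    print(s, classify_local(s))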
After testing is complete, we can package the trained model so that it does not have to be retrained before every call:
json_file = model.to_json()
with open('v3ucn.json', "w") as file:
    file.write(json_file)

model.save_weights('./v3ucn.h5f')
Here the model is split into a structure file (JSON) and a weights file (h5f); save them for later use.
Next, we will build the chatbot API. Here we use the popular FastAPI framework: put the model files into the project directory and write main.py:
import random
import uvicorn
from fastapi import FastAPI

# NOTE: pd, bow, words, classes and intents are assumed to be defined here
# just as they were in test_bot.py

app = FastAPI()

def classify_local(sentence):
    ERROR_THRESHOLD = 0.25
    # generate probabilities from the model
    input_data = pd.DataFrame([bow(sentence, words)], dtype=float, index=['input'])
    results = model.predict([input_data])[0]
    # filter out predictions below a threshold, and provide intent index
    results = [[i, r] for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        # return tuple of intent and probability
        return_list.append((classes[r[0]], str(r[1])))
    return return_list

@app.get('/')
async def root(word: str = None):
    from keras.models import model_from_json
    # load json and create model
    file = open("./v3ucn.json", 'r')
    model_json = file.read()
    file.close()
    model = model_from_json(model_json)
    model.load_weights("./v3ucn.h5f")
    wordlist = classify_local(word)
    a = ""
    for intent in intents['intents']:
        if intent['tag'] == wordlist[0][0]:
            a = random.choice(intent['responses'])
    return {'message': a}

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
Here:
from keras.models import model_from_json
file = open("./v3ucn.json", 'r')
model_json = file.read()
file.close()
model = model_from_json(model_json)
model.load_weights("./v3ucn.h5f")
This part loads the trained model. Now start the service:
uvicorn main:app --reload
It works like this:
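With the service running, you can query it from another terminal (a minimal sketch using the requests library; the response text is picked at random from the matching intent's responses):

import requests

# ask the bot something; the reply comes back as JSON from the FastAPI endpoint
resp = requests.get("http://127.0.0.1:8000/", params={"word": "hello"})
print(resp.json())  # e.g. {'message': 'hello'}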
Conclusion: there is no doubt that technology changes life. Even without good company around, a chatbot can still give us someone to talk to. I believe that in the near future, a smiling, well-groomed "machine companion" will be able to keep us company under the moon and among the flowers as well.
This article is reprinted from Liu Yue's technology blog: v3u.cn/a_id_178