As natural language processing (NLP) technology develops, both the flow of information and the computing power available to process it keep growing. We can now retrieve exactly the information we need for a task by typing a few characters into a search bar, and the first few autocomplete suggestions are usually so apt that it feels as if a person were helping us search.

What is driving the growth of NLP?

  • Is it a new understanding of the ever-expanding world of unstructured Web data?
  • Is it that processing power has finally caught up with researchers’ ideas?
  • Is it the increased efficiency of interacting with machines in human language?

All of the above, in fact, and more. Type the question “Why is natural language processing so important now?” into any search engine, and you will find Wikipedia articles giving all kinds of good reasons.

There are deeper reasons, too, one of which is the accelerating pursuit of artificial general intelligence (AGI), or deep AI. Human intelligence may rest on our ability to organize thoughts into discrete concepts that can be stored (remembered) and shared effectively. This allows us to extend our intelligence across time and space, connecting our brains into a collective intelligence.

One of Steven Pinker’s ideas in The Stuff of Thought is that we actually think in natural language. It’s not for nothing that this is called an “inner dialogue.” Facebook, Google, and Elon Musk are betting that words will become the default communication protocol for thought; all of them have invested in projects that try to convert thoughts, brain waves, and electrical signals into words. Furthermore, the Whorf hypothesis holds that language affects the way we think. Natural language is undoubtedly a medium of culture and collective consciousness.

So natural language processing could be crucial if we want to mimic or simulate human thinking on machines. Moreover, important clues about intelligence may be hidden in the data structures and nesting relationships among words that natural language processing reveals. You will use these structures, together with neural networks, to let inanimate systems digest, store, retrieve, and generate natural language in a human-like way.

More importantly, why would you want to learn how to write a system that uses natural language? Because you might be able to help save the world! Hopefully you have been following the discussion among leading thinkers about the AI control problem and the challenge of developing “friendly AI.” Nick Bostrom, Calum Chace, Elon Musk, and many others believe that the future of humanity depends on our ability to develop friendly machines. For the foreseeable future, natural language will be an important link between humans and machines.

Even if we become able to “think” directly through machines, those thoughts will likely still be shaped by the natural words and language in our brains. The line between natural and machine languages will blur, just as the line between humans and machines fades. In fact, that line began to blur in 1984, when the publication of the Cyborg Manifesto made George Orwell’s dystopian predictions both more likely and easier to accept.

I hope the phrase “help save the world” didn’t confuse anyone. As the book progresses, we will show readers how to build and connect chatbot “brains”. In the process, readers will discover that even small disturbances in the social feedback loop between humans and machines can have profound effects on both machines and humans. Like a butterfly flapping its wings somewhere, a tiny tweak to a chatbot’s “selfish nature” could unleash a chaotic storm of conflicting behavior from rival chatbots. You’ll also notice that some kind and altruistic systems quickly gather a loyal following to help smooth out the chaos caused by short-sighted robots. Because of the network effects of prosocial behavior, prosocial cooperative chatbots can have a huge impact on the world.

 

That’s why the authors of Natural Language Processing in Action gathered together. Through open, honest, pro-social communication on the Internet using the language we were born with, a supportive community has formed. We are using collective intelligence to help build and support other semi-intelligent actors (machines). We hope that our words will stick in people’s minds and spread like memes across the chatbot world, infecting others with our enthusiasm for building pro-social NLP systems. The hope is that when superintelligence does eventually emerge, this pro-social spirit will give it a slight boost.

Natural Language Processing in Action: Using Python to understand, analyze, and generate text

By Hobson Lane, Cole Howard, and Hannes Max Hapke; translated by Shi Liang, Lu Xiao, Tang Kexin, and Wang Bin

 

About the book

This book is an introduction to natural language processing (NLP) and deep learning. NLP has become a core application area of deep learning, and deep learning is now an essential tool for NLP research and applications. The book is divided into three parts. The first part covers the basics of NLP, including word segmentation (tokenization), TF-IDF vectorization, and the transformation from word-frequency vectors to semantic vectors. The second part covers deep learning, including neural networks, word vectors, convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, sequence-to-sequence modeling, attention mechanisms, and other fundamental deep learning models and methods. The third part is practical: it covers building real-world systems such as information extraction, question answering, and human-machine dialogue, along with their performance challenges and solutions. The book is aimed at intermediate to advanced Python developers; it combines basic theory with hands-on programming and serves as a practical reference for modern practitioners in the NLP field.
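As a taste of the first part’s topics, the following minimal sketch (an illustration only, not code from the book; it assumes scikit-learn is installed and uses invented sample sentences) shows how raw text can be turned into the TF-IDF word-frequency vectors the abstract mentions:

    # A minimal sketch of the Part 1 pipeline: tokenize two tiny documents and
    # compute TF-IDF vectors. Assumes scikit-learn is installed; the sample
    # sentences are invented for illustration.
    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [
        "Natural language processing turns text into numbers.",
        "Deep learning turns those numbers into meaning.",
    ]

    vectorizer = TfidfVectorizer()          # tokenizes and computes TF-IDF weights
    tfidf = vectorizer.fit_transform(docs)  # sparse matrix: one row per document

    print(vectorizer.get_feature_names_out())  # the learned vocabulary
    print(tfidf.toarray())                     # TF-IDF weight of each word per document

Each row of the resulting matrix is a word-frequency vector of the kind the first part of the book then transforms into semantic (topic) vectors.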

The editors recommend

  • Essential natural language processing for Python developers
  • A practical reference guide for practitioners in the modern NLP field
  • Translated by the NLP team at Xiaomi AI Lab

1. This book is a practical guide to building machines that can read and interpret human language.
2. Readers can use existing Python packages to capture the meaning of text and respond to it.
3. The book extends traditional natural language processing methods with neural networks, modern deep learning algorithms, and generative techniques, applied to real-world problems such as extracting dates and names, composing text, and answering open-ended questions.
4. Source code is provided.

Recent advances in deep learning allow applications to understand text and speech with remarkable accuracy. The result: chatbots that can mimic real people, meaningful matching of résumés to jobs, excellent predictive search, and automatically generated document summaries, all at a fraction of the former cost. These advances, together with the emergence of easy-to-use tools such as Keras and TensorFlow, make professional-quality natural language processing (NLP) more accessible than ever.

Natural Language Processing in Action features:

  • Working with Keras, TensorFlow, Gensim, scikit-learn, and other tools
  • Both rule-based and data-based natural language processing
  • Scalable natural language processing pipelines

To read this book, you need a basic understanding of deep learning and intermediate Python programming skills.
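As a small illustration of one of the real-world tasks mentioned above, extracting dates, here is a hedged sketch using only Python’s standard-library regular expressions (the pattern and the sample sentence are invented for illustration and are not taken from the book):

    # Illustrative only: a simple regular-expression date extractor in plain Python.
    # The pattern matches ISO-style dates (YYYY-MM-DD); real-world text needs far
    # more robust rules, which is the subject of the book's information-extraction chapter.
    import re

    DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

    text = "The model was trained on 2019-04-01 and evaluated on 2019-05-15."
    dates = [match.group(0) for match in DATE_PATTERN.finditer(text)]
    print(dates)  # ['2019-04-01', '2019-05-15']

A production extractor would combine simple patterns like this with the part-of-speech tagging and named-entity techniques covered in Chapter 11.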

 

About the author

 

 

Contents

Chapter 1 Overview of NLP
1.1 Natural and programming languages; 1.2 The magic; 1.2.1 Talking machines; 1.2.2 Mathematics in NLP; 1.3 Practical applications; 1.4 Language through a computer’s “eyes”; 1.4.1 The language of locks (regular expressions); 1.4.2 Regular expressions; 1.4.3 A simple chatbot; 1.4.4 Another approach; 1.5 A brief introduction to hyperspace; 1.6 Word order and syntax; 1.7 A natural language pipeline for chatbots; 1.8 Deep processing; 1.9 Natural language IQ; 1.10 Summary

Chapter 2 Building your own vocabulary: word segmentation
2.1 Challenges (a preview of stemming); 2.2 Building a vocabulary through word segmentation; 2.2.1 The dot product; 2.2.2 Measuring bag-of-words overlap; 2.2.3 Handling punctuation; 2.2.4 Extending the vocabulary with n-grams; 2.2.5 Normalizing the vocabulary; 2.3 Sentiment; 2.3.1 VADER: a rule-based sentiment analyzer; 2.3.2 Naive Bayes; 2.4 Summary

Chapter 3 Mathematics in words
3.1 Bags of words; 3.2 Vectorization; 3.3 Zipf’s law; 3.4 Topic modeling; 3.4.1 Back to Zipf’s law; 3.4.2 Relevance ranking; 3.4.3 Tools; 3.4.4 Other tools; 3.4.5 Okapi BM25; 3.4.6 Looking ahead; 3.5 Summary

Chapter 4 The semantics behind word frequency
4.1 From word frequency to topic scores; 4.1.1 TF-IDF vectors and lemmatization; 4.1.2 Topic vectors; 4.1.3 A thought experiment; 4.1.4 A topic-scoring algorithm; 4.1.5 An LDA classifier; 4.2 Latent semantic analysis; 4.3 Singular value decomposition; 4.3.1 The left singular vectors U; 4.3.2 The singular values S; 4.3.3 The right singular vectors VT; 4.3.4 Orientation of the SVD matrices; 4.3.5 Truncating the topics; 4.4 Principal component analysis; 4.4.1 PCA on three-dimensional vectors; 4.4.2 Back to NLP; 4.4.3 PCA-based SMS semantic analysis; 4.4.4 Truncated-SVD-based SMS semantic analysis; 4.4.5 Performance of LSA-based spam SMS classification; 4.5 Latent Dirichlet allocation (LDiA); 4.5.1 The LDiA idea; 4.5.2 Semantic analysis of SMS messages with an LDiA topic model; 4.5.3 LDiA + LDA = spam filter; 4.5.4 A fairer comparison: 32 LDiA topics; 4.6 Distance and similarity; 4.7 Feedback and improvement; 4.8 The power of topic vectors; 4.8.1 Semantic search; 4.8.2 Improvements; 4.9 Summary

Part II Deep learning (neural networks)

Chapter 5 Getting started with neural networks (perceptrons and backpropagation)
5.1 The components of neural networks; 5.1.1 The perceptron; 5.1.2 A numerical perceptron; 5.1.3 Bias; 5.1.4 Error surfaces; 5.1.5 Different types of error surfaces; 5.1.6 Multiple gradient descent algorithms; 5.1.7 Keras: implementing neural networks in Python; 5.1.8 Looking ahead; 5.1.9 Normalization: formatting the input; 5.2 Summary

Chapter 6 Reasoning with word vectors (Word2vec)
6.1 Semantic queries and analogies; 6.2 Word vectors; 6.2.1 Vector-oriented reasoning; 6.2.2 How Word2vec representations are computed; 6.2.3 How to use the gensim.word2vec module; 6.2.4 Generating customized word vector representations; 6.2.5 Word2vec vs. GloVe; 6.2.6 fastText; 6.2.7 Word2vec vs. LSA; 6.2.8 Visualizing word relationships; 6.2.9 Unnatural words; 6.2.10 Computing document similarity with Doc2vec; 6.3 Summary

Chapter 7 Convolutional neural networks (CNNs)
7.1 Semantic understanding; 7.2 Toolkit; 7.3 Convolutional neural networks; 7.3.1 Building blocks; 7.3.2 Step size (stride); 7.3.3 Convolution kernel composition; 7.3.4 Padding; 7.3.5 Learning; 7.4 Narrow windows; 7.4.1 Keras implementation: preparing the data; 7.4.2 Convolutional neural network architecture; 7.4.3 Pooling; 7.4.4 Dropout; 7.4.5 The output layer; 7.4.6 Starting to learn (training); 7.4.7 Using the model in a pipeline

Chapter 8 Recurrent neural networks (RNNs)
8.1 The memory function of recurrent networks; 8.1.1 Backpropagation through time; 8.1.2 Weight updates at different time steps; 8.1.3 A brief review; 8.1.4 Difficulties; 8.1.5 Implementing recurrent neural networks with Keras; 8.2 Putting the pieces together; 8.3 Self-learning; 8.4 Hyperparameters; 8.5 Prediction; 8.5.1 Statefulness; 8.5.2 Bidirectional RNNs

Chapter 9 Improving memory: long short-term memory (LSTM) networks
9.1 Long short-term memory (LSTM); 9.1.1 Backpropagation through time; 9.1.2 Using the model; 9.1.3 Dirty data; 9.1.4 Handling “unknown” tokens; 9.1.5 Character-level modeling; 9.1.6 Generating chat text; 9.1.7 Generating more text; 9.1.8 The problem with text generation: uncontrolled content; 9.1.9 Other memory mechanisms; 9.1.10 Deeper networks; 9.2 Summary

Chapter 10 Sequence-to-sequence modeling and attention mechanisms
10.1 Encoder-decoder architecture; 10.1.1 Decoding thought; 10.1.2 Deja vu?; 10.1.3 Sequence-to-sequence dialogue; 10.1.4 Reviewing LSTMs; 10.2 Assembling a sequence-to-sequence pipeline; 10.2.1 Preparing datasets for sequence-to-sequence training; 10.2.2 Sequence-to-sequence models in Keras; 10.2.3 The sequence encoder; 10.2.4 The thought decoder; 10.2.5 Assembling the sequence-to-sequence network; 10.3 Training the sequence-to-sequence network; 10.4 Building a chatbot with a sequence-to-sequence network; 10.4.1 Preparing the corpus for training; 10.4.2 Building the character dictionary; 10.4.3 Generating one-hot encoded training sets; 10.4.4 Training the sequence-to-sequence chatbot; 10.4.5 Assembling the sequence generation model; 10.4.6 Predicting the output sequence; 10.4.7 Generating a reply; 10.4.8 Talking to the chatbot; 10.5 Enhancements; 10.5.1 Reducing training complexity with bucketing; 10.5.2 The attention mechanism; 10.6 Practical applications; 10.7 Summary

Part III

Chapter 11 Information extraction (named entity recognition and question answering)
11.1 Named entities and relations; 11.1.1 The knowledge base; 11.1.2 Information extraction; 11.2 Regular patterns; 11.2.1 Regular expressions; 11.2.2 Information extraction as a feature-extraction task in machine learning; 11.3 Information worth extracting; 11.3.1 Extracting GPS locations; 11.3.2 Extracting dates; 11.4 Extracting relationships between people; 11.4.1 Part-of-speech tagging; 11.4.2 Normalizing entity names; 11.4.3 Normalizing and extracting entity relations; 11.4.4 Word patterns; 11.4.5 Text segmentation; 11.4.6 Why the split('.!?') function does not work; 11.4.7 Splitting sentences with regular expressions; 11.5 Extracting real-world information; 11.6 Summary

Chapter 12 Starting a chat (dialogue engines)
12.1 Language skills; 12.1.1 Modern approaches; 12.1.2 Hybrid approaches; 12.2 Pattern-matching approaches; 12.2.1 An AIML-based pattern-matching chatbot; 12.2.2 A network view of pattern matching; 12.3 Knowledge-based approaches; 12.4 Retrieval (search) approaches; 12.4.1 The context challenge; 12.4.2 An example retrieval-based chatbot; 12.4.3 A search-based chatbot; 12.5 Generative approaches; 12.5.1 Chatting about NLPIA; 12.5.2 Pros and cons of each approach; 12.6 Four-wheel drive; 12.7 The design process; 12.8 Techniques; 12.8.1 Ask questions with predictable answers; 12.8.2 Be interesting; 12.8.3 When all else fails, search; 12.8.4 Becoming popular; 12.8.5 Becoming a connector; 12.8.6 Becoming emotional; 12.9 The real world; 12.10 Summary

Chapter 13 Scalability (optimization, parallelization, and batching)
13.1 Too much (data) is not necessarily a good thing; 13.2 Optimizing NLP algorithms; 13.2.1 Indexing; 13.2.2 Advanced indexing; 13.2.4 Why use approximate indexing at all; 13.2.5 An indexing workaround: discretization; 13.3 Constant-memory algorithms; 13.3.1 Gensim; 13.3.2 Graph computing; 13.4 Parallelizing NLP computations; 13.4.1 Training NLP models on GPUs; 13.4.2 Renting vs. buying; 13.4.3 GPU rental; 13.4.4 Tensor processing units (TPUs); 13.5 Reducing the memory footprint during model training; 13.6 Using TensorBoard to understand the model; 13.7 Summary

Appendix A NLP tools used in this book
Appendix B Interesting Python and regular expressions
Appendix C Vectors and matrices (linear algebra basics)
Appendix D Common tools and techniques for machine learning
Appendix E Setting up GPUs on Amazon Web Services (AWS)
Appendix F Locality-sensitive hashing
Resources
Glossary