When I was thinking about this problem, I could not help recalling IBM's Watson, which seemed magical to me when I was a child. At the time it represented the highest level of machine intelligence humans had built, with advanced language processing and a basic ability to understand English.
At first glance, building a robot that can communicate with humans in language requires several abilities. It needs speech recognition and natural language understanding (including sign language, lip reading, body language, etc.) to take in what people say. It needs natural language generation and speech synthesis to reply. It also needs information retrieval and information extraction, so that it can reason from what is known and come to a conclusion.
Some difficulties in language processing:
We call the computing techniques that process spoken and written language "speech and language processing", or collectively "natural language processing". This is a very broad definition. It ranges from well-known, relatively simple techniques such as word segmentation and word wrapping to advanced technology such as automatic question answering, as in Microsoft's Xiaoice, and real-time spoken translation, as in Google Translate.
Unlike other artificial intelligence applications such as computer vision, natural language processing requires practitioners to have some knowledge of language. Consider what happens when we use the NLTK package to count the words, sentences, and contextual statements in a text file. When a tool such as wc counts the bytes in a file, it is an ordinary data processing tool. But to count the words in an article, the computer has to know what a word is, what a sentence is, and where one ends and the next begins. Once the tool draws on that linguistic knowledge, it becomes a natural language processing system. Still, a tool like wc is a simple system, and its knowledge of language is limited. If we want a system that can hold a conversation with humans, it must have much broader and deeper knowledge of language, and practitioners in turn need to understand the range and kinds of linguistic knowledge that such a complex system requires.
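The difference between counting bytes and counting words can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function names, the regular expressions, and the sample text are assumptions, and the rules for "word" and "sentence" are deliberately crude. NLTK's word_tokenize and sent_tokenize would handle far more cases.

```python
import re

def count_bytes(text: str) -> int:
    # Needs no linguistic knowledge: just the size of the encoded text.
    return len(text.encode("utf-8"))

def count_words(text: str) -> int:
    # Assumed rule: a word is a run of letters or digits, optionally
    # followed by an apostrophe part, so "I'm" counts as one word.
    return len(re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text))

def count_sentences(text: str) -> int:
    # Assumed rule: a sentence ends at ., ! or ? followed by whitespace
    # or the end of the text. Abbreviations such as "Dr." break this rule.
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return sum(1 for part in parts if part.strip())

sample = "I'm here. Is this room an oven? It is hot!"
print(count_bytes(sample), count_words(sample), count_sentences(sample))
```

Only the last two functions encode any decision about language, and each decision is easy to get wrong, which is exactly why word counting already counts as language processing.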
For speech recognition, the computer must analyze the audio signal it receives and work out which parts are noise and which parts carry useful information. Likewise, to give an answer as feedback, the computer must organize the answer from its knowledge graph into a sequence of words and generate a speech signal that humans can recognize.
To achieve this, we can draw on phonetics and phonology, which help us build models that recognize the sounds in an utterance.
Dealing with words raises many problems of its own, such as handling contractions like I’m and I am. Producing and recognizing such variants of a word requires morphological knowledge, which captures information about the form and behavior of words in context.
Beyond individual words, we also need to think about how to generate sentences that meet our needs. That requires knowledge of how words group into sentences, along with knowledge of lexical semantics and compositional semantics.
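As a rough illustration of the contraction problem, the sketch below expands a handful of contractions with a lookup table. The table, the function name, and the sample sentence are hypothetical, and a lookup table is not real morphological analysis. It also ignores genuine ambiguity: "it's" can mean "it is" or "it has".

```python
import re

# Hypothetical, deliberately tiny table; real coverage needs far more entries.
CONTRACTIONS = {
    "i'm": "I am",
    "you're": "you are",
    "it's": "it is",   # could also be "it has"; not resolved here
    "don't": "do not",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)

def expand_contractions(text: str) -> str:
    # Normalize the typographic apostrophe so that I’m and I'm match alike.
    text = text.replace("\u2019", "'")
    # Replace each known contraction with its expanded form.
    # The original capitalization of the contraction is not preserved.
    return _PATTERN.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)

print(expand_contractions("I’m sure it's fine"))
```

Even this toy version has to make linguistic choices (which apostrophe, which expansion, what to do about case), which hints at why a fuller treatment needs real morphological knowledge.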
The last problem, disambiguation, is the most difficult:
For example, “This room is an oven” and “This room is a box” have the same structure, yet neither is meant literally: the former means the room is hot, the latter means the room is small. It takes more than word representations or even parsing to make a computer understand what these sentences mean. Language understanding is really a multimodal process: true understanding requires integrating contextual information from other modalities, such as visual, auditory, and even tactile information, in addition to the language itself. Personally, I think this is the harder part of natural language understanding/processing.
Natural language understanding/processing is surely a key challenge on the way to general artificial intelligence, but it does not seem to be the hardest one. The ability to think, make decisions, and create on the basis of language, which is the real embodiment of human intelligence, seems to have been left out of the mainstream discussion of artificial intelligence, and may be more difficult. How, for example, can machines think philosophically like humans, conduct wars or run businesses like humans, or invent like humans? These problems all seem harder than natural language understanding/processing.
Even within natural language processing, current work focuses mostly on language as an instrument for conveying messages, namely how to make computers understand the literal meaning of a sentence more accurately. The more fascinating feature of human language, which is “full of words and full of meaning”, is still far from being explored.