With the spread of the Internet, more and more users have become used to posting comments online. These comments express users' sentiment, which is valuable information for both companies and individuals.

However, faced with massive volumes of user review data, performing sentiment analysis purely by hand is almost impossible. We therefore need NLP techniques to handle this task.

Sentiment analysis: judging the polarity (praise or criticism) of text that contains subjective information, in order to determine the viewpoint, preference, and emotional tendency it expresses.

There are three main elements of sentiment analysis, as shown in the figure below:

  • Opinion holder: the user who posts the comment, such as a Dianping user
  • Opinion/polarity: whether the comment is positive or negative, for example: yummy, ugly
  • Evaluation object: generally an entity and one of its attributes, such as the dining environment at KFC

Many sentiment analyses also take the dimension of “time” into account and analyze different time periods separately, so as to assess users’ emotional tendencies more objectively. For example, sentiment analysis of Twitter posts has been used to try to predict stock market movements.

At present, there are two main methods of sentiment analysis:

  • Dictionary-based sentiment analysis: sentiment words are extracted from the text according to a prebuilt sentiment dictionary, and their values are then combined into an overall sentiment tendency.
  • Machine-learning-based sentiment analysis: sentiment words are selected as feature words, the text is vectorized, and a classifier (LR, SVM, etc.) is used to classify it.

In this article we focus on dictionary-based sentiment analysis, because the method is simpler and easier to get started with. It requires neither machine learning knowledge nor a large amount of annotated training data, and its effectiveness depends on how complete the sentiment dictionary is.

Sentiment dictionary

A sentiment dictionary generally contains sentiment words and degree words. Users use sentiment words to express their attitude, such as: like, hate; and degree words to express how strong that attitude is, such as: very, somewhat.

Some open-source Chinese sentiment dictionaries are available online for direct use, but their quality is uneven. If you want a high-quality, domain-specific Chinese sentiment dictionary, you need to build it yourself, and that depends on Chinese word segmentation. The booklet “In-depth Understanding of NLP Chinese Word Segmentation: From Principle to Practice” covers Chinese word segmentation from scratch.

Suppose we already have a Chinese sentiment dictionary that looks something like this:

Sentiment word    Value
like              1
love              2
disadvantages     -1
disappointed      -2

Words with a positive value express positive sentiment, and words with a negative value express negative sentiment. Different sentiment words also convey different strengths: for example, “like” is a mild liking, while “love” is a deep one.

The degree dictionary is as follows:

Degree word       Value
a little bit      1
very              2
extremely         3

Users use degree words to convey how strong their preference is, and we can also add symbols, such as the exclamation mark (!), to the degree dictionary.

Now we are ready to process the text to be analyzed. We have collected some hotel user reviews, listed below:
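As a minimal sketch, the two dictionaries above can be held in plain Python dicts. The names `SENTIMENT_WORDS`, `DEGREE_WORDS`, and `lookup` are hypothetical, and the value given to the exclamation mark is an assumption made only for illustration; it does not come from any published dictionary.

```python
# Toy sentiment dictionary copied from the table above (word -> polarity value).
SENTIMENT_WORDS = {
    "like": 1,
    "love": 2,
    "disadvantages": -1,
    "disappointed": -2,
}

# Toy degree dictionary copied from the table above (word -> strength).
DEGREE_WORDS = {
    "a little bit": 1,
    "very": 2,
    "extremely": 3,
}

# Assumption: treat "!" as a degree marker; the value 2 is illustrative only.
DEGREE_WORDS["!"] = 2


def lookup(word):
    """Return (kind, value) for a known word, or (None, 0) if it is in neither dictionary."""
    key = word.lower()
    if key in SENTIMENT_WORDS:
        return "sentiment", SENTIMENT_WORDS[key]
    if key in DEGREE_WORDS:
        return "degree", DEGREE_WORDS[key]
    return None, 0


print(lookup("love"))       # ('sentiment', 2)
print(lookup("Extremely"))  # ('degree', 3)
print(lookup("hotel"))      # (None, 0)
```

Keeping the two kinds of words in separate dicts makes it easy to extend either one without touching the other.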

  • The front-desk and floor attendants are very good and the room is quiet and tidy; the only problem is that the bathroom floor drain is not well designed, which leaves a small amount of standing water.
  • Nice location downtown, and the room is clean as always.
  • No heat in the middle of the night!!!!!!!!!!

Words such as “not” and “no” are negation words. We assign them a value of -1, so that they flip the sign of the sentiment word they modify.

If we segment these sentences into words, we get the following result:
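The sign flip can be sketched as follows. This is a simplified illustration, not a complete treatment of negation: the function name `apply_negation`, the English stand-in words, and the rule that a negation applies to the next sentiment word are all assumptions.

```python
NEGATION_WORDS = {"not", "no"}  # assumed English stand-ins for the negation words


def apply_negation(tokens, sentiment_words):
    """Return (word, value) pairs; a negation word flips the sign of the next sentiment word."""
    results = []
    sign = 1
    for token in tokens:
        word = token.lower()
        if word in NEGATION_WORDS:
            sign *= -1  # each negation word multiplies the pending sign by -1
        elif word in sentiment_words:
            results.append((word, sign * sentiment_words[word]))
            sign = 1  # the negation is consumed by this sentiment word
    return results


toy_dict = {"good": 1, "clean": 1}  # hypothetical values
print(apply_negation(["the", "design", "is", "not", "good"], toy_dict))
# [('good', -1)]
```

Note that in this sketch a negation word stays pending until the next sentiment word appears, however far away that is; a real system would usually limit the distance.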

1. The receptionist on the floor is very good. The room is quiet and tidy.
2. Nice location Downtown room as always clean
3. No heat in the middle of the night! ! ! ! ! ! ! ! ! !

The words in the segmentation result are then matched against the sentiment dictionary, and their values are summed separately in the positive and negative directions. Let’s take the first review as an example:
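The article does not say which segmenter it uses. One classic dictionary-based technique is forward maximum matching, sketched below under that assumption. The function name `forward_max_match`, the mini vocabulary, and the sample phrase (roughly “the room is very quiet and tidy”) are hypothetical.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Split text by greedily taking the longest vocabulary match at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted, so the loop always advances.
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical mini vocabulary: room, quiet, tidy, very.
vocab = {"房间", "安静", "整洁", "很"}
print(forward_max_match("房间很安静整洁", vocab))
# ['房间', '很', '安静', '整洁']
```

Because the matcher is greedy, its output depends entirely on the vocabulary, which is one reason segmentation quality has such a large effect on the sentiment score.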

1. The receptionist on the floor is good, the room is quiet and tidy, the only bathroom floor drain is not well designed, resulting in a small amount of water.

Positive matches:
very: 1   # degree word
good: 1
quiet: 1
clean: 1
Positive score: 1 + 1 + 1 + 1 = 4

Negative matches:
bad: -1
water: -1
Negative score: |-1| + |-1| = 2

Sentiment result: (4, 2), combined value 4 - 2 = 2

That is the dictionary-based sentiment analysis method. It is easy to see that Chinese word segmentation has a great influence on the final result. With this method, we can analyze user reviews automatically and in batches, quickly get real feedback from users, and adjust product decisions or personal purchase intentions accordingly.
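The calculation above can be sketched in a few lines of Python. The function name `score_tokens` and the token list are hypothetical. The word values follow the worked example, including its choice to count the degree word “very” as a plain +1; treating “quiet” as the fourth positive match is an assumption based on the four-term sum.

```python
def score_tokens(tokens, word_values):
    """Return (positive, negative, net) for the tokens found in word_values."""
    positive = 0
    negative = 0
    for token in tokens:
        value = word_values.get(token.lower(), 0)  # unknown words contribute 0
        if value > 0:
            positive += value
        elif value < 0:
            negative += abs(value)  # negative side is reported as a magnitude
    return positive, negative, positive - negative


# Values from the worked example; "quiet" is an assumed match (see lead-in).
example_values = {"very": 1, "good": 1, "quiet": 1, "clean": 1, "bad": -1, "water": -1}

# Hypothetical segmentation of the first review.
tokens = ["receptionist", "very", "good", "room", "quiet", "clean",
          "drain", "design", "bad", "water"]
print(score_tokens(tokens, example_values))  # (4, 2, 2)
```

Returning the positive and negative totals separately, rather than only the net value, keeps mixed reviews distinguishable from neutral ones.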

A brief introduction to machine-learning-based sentiment analysis:

  • First, a good sentiment classification dataset is labeled by hand, with positive samples labeled 1 and negative samples labeled 0, and split into training data and test data.
  • Then, “feature words” are selected from the text, such as product description words (simple, fashionable), and the words are converted into vectors to form a word matrix.
  • Finally, classifier models (LR, SVM, NB) are trained on the training data. The trained models are evaluated on the test data, and the best one is used for prediction.
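The three steps above can be sketched with a tiny Naive Bayes (NB) classifier written from scratch, so that it needs only the standard library; in practice a machine learning library would normally be used. The function names and the labeled samples are hypothetical toy data, not real reviews, and word counts stand in for the feature vectors.

```python
import math
from collections import Counter


def train_naive_bayes(samples):
    """samples: list of (tokens, label). Returns (class counts, per-class word counts, vocabulary)."""
    class_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for tokens, label in samples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, tokens):
    """Return the label with the highest log-probability under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Step 1: hypothetical hand-labeled data, 1 = positive, 0 = negative.
train = [
    (["room", "clean", "quiet"], 1),
    (["nice", "location", "clean"], 1),
    (["no", "heat", "cold"], 0),
    (["drain", "bad", "water"], 0),
]
test = [(["clean", "nice"], 1), (["cold", "bad"], 0)]

# Steps 2 and 3: count word features, fit the model, evaluate on the test split.
model = train_naive_bayes(train)
correct = sum(predict(model, tokens) == label for tokens, label in test)
print(f"accuracy: {correct}/{len(test)}")  # accuracy: 2/2
```

With real data, the same train/test split would be used to compare several classifiers and keep the one that scores best on the test set.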

This article has discussed how to do sentiment analysis with a sentiment dictionary, which covers most everyday sentiment analysis needs. If you want to learn more advanced methods, you can follow my column, where related content will be added later. If you are interested in Chinese word segmentation, please support my Juejin booklet “In-depth Understanding of NLP Chinese Word Segmentation: From Principle to Practice”. Thank you very much!