Preface

Everyone has their own way of getting along, and no two women are alike: each one thinks with a different logic, and you have to work out the meaning behind every statement. Life is not easy;

For a beginner in love, the biggest worry is an unhappy girlfriend. After all, going from your right hand to a real partner was not easy, so you cherish her twice as much;

But can you really read the emotion in every sentence? Given a sentence, can you tell its emotional ratio?

Sentiment analysis

The key to sentiment analysis is the lexicon. Searching online, the well-known one is the emotion ontology lexicon from Dalian University of Technology.

Download address:

Link: https://pan.baidu.com/s/18PeWl-9EjZ7O5Rdfejzgig  Extraction code: qc3n

After downloading it, the documentation shows that the lexicon has the following format:

It looks good, so let's give it a try.

After unzipping, there are several files:

Among them are two .py files:

  • evaluate.py, which converts emotion.xlsx into emotion.csv
  • process.py

Note that running evaluate.py may raise a UnicodeEncodeError;

The fix is to pass encoding='utf-8' to the open() call that writes the CSV file:

….csv', 'w', encoding='utf-8') as out_file:
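The encoding fix above can be tried in isolation. The following is a minimal sketch, not the code from evaluate.py: the helper names `write_rows_utf8` and `read_rows_utf8` and the sample rows are made up for illustration. It shows that naming the encoding explicitly lets non-ASCII text survive a round trip through a CSV file, whatever the platform default is.

```python
import csv
import os
import tempfile


def write_rows_utf8(path, rows):
    """Write rows to a CSV file as UTF-8, regardless of the platform default."""
    # newline="" is what the csv module documentation recommends for writing
    with open(path, "w", encoding="utf-8", newline="") as out_file:
        csv.writer(out_file).writerows(rows)


def read_rows_utf8(path):
    """Read the rows back with the same explicit encoding."""
    with open(path, "r", encoding="utf-8", newline="") as in_file:
        return list(csv.reader(in_file))


if __name__ == "__main__":
    # Made-up sample rows; the second one contains non-ASCII text
    sample = [["word", "category"], ["中文", "ND"]]
    path = os.path.join(tempfile.mkdtemp(), "sample.csv")
    write_rows_utf8(path, sample)
    print(read_rows_utf8(path) == sample)
```

Without the `encoding` argument, `open()` falls back to the locale's preferred encoding, which on some Windows setups cannot represent Chinese text; that is the usual source of this error.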

To run evaluate.py, you need the docopt and pandas libraries.

With plain Python, install them with pip install docopt pandas; with Anaconda, use conda install docopt pandas.

docopt

Both libraries are now installed. Since I was not familiar with docopt, I went to the official website to learn about it:

Docopt’s website says:

Command-line interface description language
docopt helps you:
define interface for your command-line app, and
automatically generate parser for it.

Here you can see the two main functions of Docopt:

  • Defining interaction parameters
  • Parsing parameter information

Here’s another example from the official website:

Naval Fate.

Usage:
  naval_fate ship new <name>...
  naval_fate ship <name> move <x> <y> [--speed=<kn>]
  naval_fate ship shoot <x> <y>
  naval_fate mine (set|remove) <x> <y> [--moored|--drifting]
  naval_fate -h | --help
  naval_fate --version

Options:
  -h --help     Show this screen.
  --version     Show version.
  --speed=<kn>  Speed in knots [default: 10].
  --moored      Moored (anchored) mine.
  --drifting    Drifting mine.

In this example, Naval Fate is the app name and naval_fate is the command. ship, new, and move are commands; x, y, and name are positional arguments; -h, --help, --speed, and so on are options;

The example uses the following notation:

  • "[]" describes optional elements
  • "()" describes required elements
  • "|" describes mutually exclusive elements
  • "…" describes repeating elements

These elements, preceded by naval_fate, form the valid command patterns, which are listed under Usage;

The Options section below Usage lists the options and their descriptions. Specifically, it describes

  • whether the option has a long and a short form, such as -h, --help
  • whether the option takes an argument, such as --speed=<kn>
  • whether the option has a default value, such as [default: 10]

The Usage and Options sections together make up the help message, which is printed on the command line when the user passes -h or --help.

Docopt extracts the help message and parses the parameters passed in from the command line.

Example

To illustrate, create a file named test.py:

"""Naval Fate.

Usage:
  naval_fate.py ship new <name>...
  naval_fate.py ship <name> move <x> <y> [--speed=<kn>]
  naval_fate.py ship shoot <x> <y>
  naval_fate.py mine (set|remove) <x> <y> [--moored | --drifting]
  naval_fate.py (-h | --help)
  naval_fate.py --version

Options:
  -h --help     Show this screen.
  --version     Show version.
  --speed=<kn>  Speed in knots [default: 10].
  --moored      Moored (anchored) mine.
  --drifting    Drifting mine.

"""
from docopt import docopt


if __name__ == '__main__':
    arguments = docopt(__doc__, version='Naval Fate 2.0')
    print(arguments)

Execute command:

python test.py ship new jb

Results:

{'--drifting': False,
 '--help': False,
 '--moored': False,
 '--speed': '10',
 '--version': False,
 '<name>': ['jb'],
 '<x>': None,
 '<y>': None,
 'mine': False,
 'move': False,
 'new': True,
 'remove': False,
 'set': False,
 'ship': True,
 'shoot': False}

Then try a command that is not in Usage; docopt prints the usage message:

Usage:
  naval_fate.py ship new <name>...
  naval_fate.py ship <name> move <x> <y> [--speed=<kn>]
  naval_fate.py ship shoot <x> <y>
  naval_fate.py mine (set|remove) <x> <y> [--moored | --drifting]
  naval_fate.py (-h | --help)
  naval_fate.py --version

Brief summary

  • The docopt(doc) function parses the command-line arguments according to the help text and returns the result as a dictionary;
  • When the user enters a command that is not in Usage, the usage message is printed;
  • To use it, import it with from docopt import docopt;
  • doc is the only mandatory parameter; the other four are optional (help, version, argv, options_first);

evaluate

Open evaluate.py; at the top there is this docstring:

__doc__ = '''
Usage:
    emotion WORD
With Python:
    EmotionDict() --> init
    EmotionDict.evaluate(word) --> tuple(str, strength(int), polarity(int)) or None
'''

In the same directory as evaluate.py, create a test.py with the following content:

from evaluate import EmotionDict

test = EmotionDict()
print(test.evaluate(word="war"))

Run it directly, and the output is:

('War disaster', 'ND', 5, 2)

This matches the entry in the Excel file;

  • ND stands for hate;
  • Intensity takes one of five values, 1, 3, 5, 7, or 9; 9 is the strongest and 5 is the middle level;
  • Polarity falls into four categories: 0 for neutral, 1 for positive, 2 for negative, and 3 for both positive and negative.
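The three bullet points above can be folded into a small decoder. This is only a sketch: `describe` is a hypothetical helper, not part of evaluate.py, and it assumes the four-element tuple shape seen in the output above. Only the ND code is spelled out in the text, so any other category code is passed through unchanged.

```python
# Labels taken from the text above; only "ND" is spelled out there,
# so other category codes are passed through unchanged.
CATEGORY_LABELS = {"ND": "hate"}
POLARITY_LABELS = {
    0: "neutral",
    1: "positive",
    2: "negative",
    3: "both positive and negative",
}
VALID_INTENSITIES = (1, 3, 5, 7, 9)


def describe(result):
    """Turn a (word, category, intensity, polarity) tuple into readable text."""
    if result is None:
        return "not in lexicon"
    word, category, intensity, polarity = result
    if intensity not in VALID_INTENSITIES:
        raise ValueError("unexpected intensity: %r" % (intensity,))
    label = CATEGORY_LABELS.get(category, category)
    polarity_text = POLARITY_LABELS.get(polarity, "unknown polarity")
    return "%s: %s, intensity %d of 9, %s" % (word, label, intensity, polarity_text)


if __name__ == "__main__":
    print(describe(("War disaster", "ND", 5, 2)))
    print(describe(None))
```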

For the other codes, please refer to the description .doc file in the package.

So the word "war disaster" is a derogatory term expressing hatred? I don't know why, but it feels odd;

Trying a sentence

That is how to look up a single word; what about a whole passage?

The most common Chinese word-segmentation library is jieba; if you are not familiar with it, read up on it first;

I grabbed a sentence from a blog and ran it through the segmenter:

Word segmentation

import jieba

seg_list = jieba.cut("Watching the match with a position is destined to be painful, better to enjoy every wonderful moment in the match!", cut_all=False)
print("Default Mode: " + "/ ".join(seg_list))

Output:

Default Mode: with/ stand/ watch/ match/ destined/ be/ painful/ ,/ better than/ good/ taste/ match/ every/ one/ wonderful/ moment/ !

Combining the two

seg_list = jieba.cut("Watching the match with a position is destined to be painful, better to enjoy every wonderful moment in the match!", cut_all=False)
test = EmotionDict()
for i in seg_list:
    print(i)
    print(test.evaluate(word=i))

Output:

(Each token is printed followed by its lookup result. Almost every token returns None; the one hit that survives in this copy of the output is ('pain', 'NB', 7, 0).)

Punctuation is not filtered out, but that does not matter here. A quick look shows that, out of all these words, only "wonderful" and "pain" return anything, which also shows that the original lexicon is far from sufficient;

On top of that, turning category codes such as PH and the numbers into something meaningful needs separate conversion logic, and all kinds of symbols have to be filtered out, so there are many small details to handle. As it stands, the result is really poor, mainly because the lexicon has too few entries: many words are simply missing, so there is nothing to judge by;
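To make the missing pieces concrete (filtering punctuation, skipping unknown words, and turning the hits into a ratio), here is a minimal sketch. It does not use the real lexicon: `TOY_LEXICON` is a made-up two-entry stand-in for EmotionDict, its values are invented, and weighting each hit by its intensity is just one possible choice.

```python
import string

# Made-up stand-in for the real lexicon: word -> (category, intensity, polarity)
TOY_LEXICON = {
    "painful": ("NB", 7, 2),    # polarity 2 = negative
    "wonderful": ("PH", 5, 1),  # polarity 1 = positive
}

# ASCII punctuation plus a few common full-width marks
PUNCTUATION = set(string.punctuation) | set("，。！？、；：")


def sentence_ratio(tokens, lookup):
    """Return (positive_share, negative_share), weighting each hit by intensity."""
    positive = negative = 0
    for token in tokens:
        token = token.strip()
        if not token or all(ch in PUNCTUATION for ch in token):
            continue  # skip empty tokens and pure punctuation
        hit = lookup(token)
        if hit is None:
            continue  # word missing from the lexicon
        _category, intensity, polarity = hit
        if polarity == 1:
            positive += intensity
        elif polarity == 2:
            negative += intensity
        # polarity 0 (neutral) and 3 (both) are ignored in this sketch
    total = positive + negative
    if total == 0:
        return (0.0, 0.0)
    return (positive / total, negative / total)


if __name__ == "__main__":
    tokens = ["watching", "match", "painful", ",", "wonderful", "moment", "!"]
    print(sentence_ratio(tokens, TOY_LEXICON.get))
```

With a lexicon this sparse, most sentences would come out as (0.0, 0.0), which is the same coverage problem described above.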

Look at someone else’s

Since building our own wheel does not work, let's look at other people's wheels. This kind of contextual analysis immediately brings BAT (Baidu, Alibaba, Tencent) to mind, so let's look at them one by one.

Baidu

Searching for Baidu natural language processing brings up the Baidu AI open platform directly. Under products and services, choose natural language processing; it offers sentiment orientation analysis as well as dialogue emotion recognition. The two presumably work on the same principle, so let's look at the former;

Click through, log in, open the API documentation, and go to the sentiment orientation analysis interface;

Interface description

The interface judges the sentiment polarity (positive, negative, or neutral) of text containing subjective opinions and gives the corresponding confidence.

Request description

Sample request

  • HTTP method: POST
  • Request URL: aip.baidubce.com/rpc/2.0/nlp…

URL parameters

Parameter | Value
access_token | The access_token obtained with the API Key and Secret Key; see the section on getting the Access Token

Header

Parameter | Value
Content-Type | application/json

Body request example

{"text": "Apple is a great company"}

Request parameters

Parameter | Type | Description
text | string | Text content (GBK-encoded), at most 2048 bytes
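The 2048-byte limit applies to the encoded text, not to the number of characters, and in GBK a Chinese character takes two bytes. A minimal sketch of trimming text to fit is shown here; `truncate_gbk` is a hypothetical helper and is not part of any Baidu SDK.

```python
def truncate_gbk(text, max_bytes=2048):
    """Cut text so its GBK encoding fits in max_bytes without splitting a character."""
    # Characters that GBK cannot represent (emoji, for example) are dropped
    encoded = text.encode("gbk", errors="ignore")
    if len(encoded) <= max_bytes:
        return encoded.decode("gbk")
    # Slicing may leave half of a two-byte character at the end;
    # decoding with errors="ignore" discards that dangling byte
    return encoded[:max_bytes].decode("gbk", errors="ignore")


if __name__ == "__main__":
    long_text = "中文" * 2000  # 4000 characters, 8000 bytes in GBK
    trimmed = truncate_gbk(long_text)
    print(len(trimmed), len(trimmed.encode("gbk")))
```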

Return description

Return parameters

Parameter | Type | Description
log_id | uint64 | Unique identifier of the request
sentiment | int | Sentiment polarity classification result: 0 negative, 1 neutral, 2 positive
confidence | float | Confidence of the classification, in the range [0, 1]
positive_prob | float | Probability of the positive class, in the range [0, 1]
negative_prob | float | Probability of the negative class, in the range [0, 1]

Return sample

{
    "items": [
        {
            "sentiment": 2,          // sentiment polarity classification result
            "positive_prob": 0.73,   // probability of the positive class
            "negative_prob": 0.27    // probability of the negative class
        }
    ]
}

Getting the Access Token

The Access Token is obtained with the API Key and Secret Key. So how do you get those two?

Remember the home page of sentiment orientation analysis? It has a button for creating an application right away;

Click it, enter the application name and description, and then click to view the application details. The API Key and Secret Key shown there are the ones you need.

The official sample for getting the access token is written for Python 2; here is a Python 3 version:

import requests

url = 'https://aip.baidubce.com/oauth/2.0/token'

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.32 Safari/537.36",
    "Content-Type": "application/json",
}

params = {
    "grant_type": "client_credentials",
    "client_id": "Your API Key",
    "client_secret": "Your Secret Key",
}

response = requests.post(url, headers=headers, params=params)
text = response.json().get("access_token")
print(text)

The result is the value of access_token;

Trying it out

I first called the sentiment interface with the requests library, but it kept returning this error:

{'log_id': 3838837857684473751, 'error_code': 282004, 'error_msg': 'invalid parameter(s)'}

In the end, after searching online for a long time, it turned out that the urllib library works, which left me baffled. Here is the code:

import json
import urllib.request

# Get the sentiment result
access_token = 'Your access_token value'
url = 'https://aip.baidubce.com/rpc/2.0/nlp/v1/sentiment_classify?access_token=' + access_token

# Defined as in the original, but not passed to the request below
headers = {'Content-Type': 'application/json'}

post_data = {"text": "Watching the game with a stand is bound to be painful. Instead, enjoy every moment of the game!"}
data = json.dumps(post_data).encode('GBK')

request = urllib.request.Request(url, data)
response = urllib.request.urlopen(request)
content = response.read()
content_str = str(content, encoding="gbk")
print(content_str)

Output:

{"log_id": 830621152984506211, "text": "Watching the game with a point of view is going to be painful.", "items": [{"positive_prob": 0.521441, "confidence": 0.571177, "negative_prob": 0.478559, "sentiment": 1}]}

According to the official documentation, the four fields mean the following:

  • sentiment, the sentiment polarity classification result. Per the return parameters above: 0 for negative, 1 for neutral, 2 for positive;
  • confidence, the confidence of the classification;
  • positive_prob, the probability of belonging to the positive class;
  • negative_prob, the probability of belonging to the negative class;

Going by the result above, this sentence counts as neutral, leaning slightly positive;
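The reading of the response can be written down as a small function. This is a sketch under the assumption that the sentiment codes follow the return-parameter table (0 negative, 1 neutral, 2 positive); `interpret` is a hypothetical helper, and the sample strings are trimmed copies of the two responses shown earlier.

```python
import json

# Mapping assumed from the return-parameter table
SENTIMENT_LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def interpret(raw_json):
    """Return (label, leaning, confidence) for one sentiment response."""
    data = json.loads(raw_json)
    if "error_code" in data:
        raise RuntimeError("API error %s: %s" % (data["error_code"], data.get("error_msg")))
    item = data["items"][0]
    label = SENTIMENT_LABELS.get(item["sentiment"], "unknown")
    # Compare the raw floats to see which way the text leans
    leaning = "positive" if item["positive_prob"] >= item["negative_prob"] else "negative"
    return label, leaning, item["confidence"]


if __name__ == "__main__":
    sample = ('{"log_id": 830621152984506211, "items": [{"positive_prob": 0.521441, '
              '"confidence": 0.571177, "negative_prob": 0.478559, "sentiment": 1}]}')
    print(interpret(sample))
```

Checking for error_code first means a failed call, such as the 282004 response above, surfaces as an exception instead of a KeyError on "items".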

Tencent

Searching directly turns up Tencent's Wenzhi natural language processing service, with its product documentation, API guide, and official demos, including a Python demo;

Download the code; the GitHub README says you need security credentials, which you can get after logging in;

Then install the dependency, in either of two ways:

$ pip install qcloudapi-sdk-python

or download the source code and install it:

$ git clone https://github.com/QcloudApi/qcloudapi-sdk-python
$ cd qcloudapi-sdk-python
$ python setup.py install

Open demo.py under tests and modify the module, interface name, and interface parameters:

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Import the cloud API entry module
from QcloudApi.qcloudapi import QcloudApi

'''module: the module to load'''
module = 'wenzhi'

'''action: the interface name; see the interface names in the Wiki documentation'''
action = 'TextSentiment'

'''config: cloud API public parameters'''
config = {
    'Region': 'ap-guangzhou',
    'secretId': 'Your secretId',
    'secretKey': 'Your secretKey',
}

# Interface parameters
action_params = {
    "content": "Watching the game with a stand is bound to be painful. Instead, enjoy every moment of the game!"
}

try:
    service = QcloudApi(module, config)
    print(service.generateUrl(action, action_params))
    print(service.call(action, action_params))
except Exception as e:
    import traceback
    print('traceback.format_exc():\n%s' % traceback.format_exc())

Output:

b'{"code":0,"message":"","codeDesc":"Success","positive":0.58672362565994,"negative":0.41327640414238}'

  • positive, the probability that the text is positive
  • negative, the probability that the text is negative

By the way, Tencent offers no free trial. JB only got to try it because new users receive a free gift pack; if you are not a new user, you have to pay, which is very Tencent;

Alibaba

Go to the Sentiment Analysis page, log in, click Open, then go to the console;

Click the basic edition, then API debugging:

Choose the API. For sentiment analysis there is only the e-commerce one; the others have nothing to do with sentiment:

Enter your Access Key and Secret as required and click Debug:

For an explanation of the response, see the documentation for the polarity parameter values; the example comes out negative, which makes sense;

As for the rest, it is paid (270 to buy); get it if you are interested. For a trial, that is all there is;

Alibaba provides online debugging, which is convenient, but there are too few API types and the result is not detailed enough: it is just one of positive, negative, or neutral. If a bug ever shows up one day, you are in trouble;

Crawling Weibo on a schedule

This chapter does not go into much detail; the idea has been covered before, and this just combines the code. For details, see these two articles:

JB's Python journey: Douban automatic post-bumping; JB's Python journey: crawler, Sina Weibo content crawling

Having tried the three platforms above, the conclusion is obvious: only Baidu will do. After all, it is free;

The push to WeChat goes through ServerChan (Server酱). Here is the code:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
from json import JSONDecodeError
import time
import requests
from apscheduler.schedulers.blocking import BlockingScheduler
import json
import urllib.request


wb_url = "https://m.weibo.cn/profile/info?uid=<id of the Weibo user to follow>"
server_url = "http://sc.ftqq.com/<your ServerChan key>.send"

# Baidu sentiment analysis endpoint
access_token = 'Your Baidu access_token value'
bd_url = 'https://aip.baidubce.com/rpc/2.0/nlp/v1/sentiment_classify?access_token=' + access_token

wb_headers = {
    "Host": "m.weibo.cn",
    "Referer": "https://m.weibo.cn/u/<anything; usually the id of the Weibo user to follow>",
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) "
                  "Version/9.0 Mobile/13B143 Safari/601.1",
}

wb_params = {
    "text": "{text}",
    "desp": "{desp}"
}

statuses_id = ""
scheduler = BlockingScheduler()
page_size = 10


def get_time():
    """
    Get the current time as a formatted string
    """
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())


def push_wx(text=None, desp=None):
    """
    Push a message to WeChat through ServerChan
    :param text: title
    :param desp: content
    :return:
    """
    wb_params['text'] = text
    wb_params['desp'] = desp

    response = requests.get(server_url, params=wb_params)
    json_data = response.json()

    if json_data['errno'] == 0:
        print(get_time() + " Push succeeded.")
    else:
        print(json_data)
        print("{0} push failed: {1} \n {2}".format(get_time(), json_data['errno'], json_data['errmsg']))


def filter_emoji(text, replace=""):
    """
    Filter emoji
    :param text: original text
    :param replace: replace emoji with this content
    :return: filtered content
    """
    try:
        co = re.compile(u'[\U00010000-\U0010ffff]')
    except re.error:
        co = re.compile(u'[\uD800-\uDBFF][\uDC00-\uDFFF]')
    return co.sub(replace, text)


def get_desp(user, statuse):
    """Build the message body for a Weibo post"""

    # Shared with SentimentAnalysis()
    global text
    global nick_name

    # Personal information
    avatar = user['profile_image_url']  # avatar
    nick_name = user['screen_name']  # nickname
    follow_count = user['follow_count']  # following
    followers_count = user['followers_count']  # followers
    description = user['description']  # personal signature

    # Weibo post information
    image = ""
    created_at = statuse['created_at']  # time
    source = statuse['source']  # device the post was sent from

    # Weibo content
    if 'raw_text' in statuse:
        print(statuse)
        text = statuse['raw_text']
    else:
        text = statuse['text']

    text = filter_emoji(text, "[emoji]")

    # Get images
    if 'pics' in statuse:
        pics = statuse['pics']
        for pic in pics:
            image += "![]({0})\n\n".format(pic['url'])

    return "![]({0})\n\n### {1}\n\nFollowing: {2}  Followers: {3}\n\nSignature: {4}\n\nSent at: {5}\n\nDevice: {6}\n\nWeibo content: {7}\n\n\n\n{8}" \
        .format(avatar, nick_name, follow_count, followers_count, description, created_at, source, text, image)


def start_task():
    # print("Running query task")
    response = requests.get(wb_url, headers=wb_headers)

    try:
        json_data = response.json()
    except JSONDecodeError as e:
        print(get_time() + " JSON parsing error, skipping this round: " + str(e))
        return

    state = json_data['ok']

    if state != 1:
        push_wx(get_time() + " Your girlfriend's feed is down again. Status code: " + str(state) + ". Go and take a look.", "")
        scheduler.remove_job('wb')
        return

    data = json_data['data']
    user = data['user']
    statuses = data['statuses']

    size = len(statuses)

    if size < page_size:
        print(get_time() + " Incorrect data returned, skipping this round. size: " + str(size))
        return

    first_statuse = statuses[0]
    new_id = first_statuse['id']

    global statuses_id

    if new_id != statuses_id:
        print(get_time() + " There is a new Weibo post! id-> " + str(new_id))

        # Get the Weibo post information
        desp = get_desp(user, first_statuse)
        title = "Your goddess updated her Weibo."

        release_text = SentimentAnalysis()
        push_wx(title, release_text + desp + "\n\n[Original Weibo](https://m.weibo.cn/profile/2105667905)")

        statuses_id = new_id


def SentimentAnalysis():
    post_data = {"text": text}
    data = json.dumps(post_data).encode('GBK')

    request = urllib.request.Request(bd_url, data)
    response = urllib.request.urlopen(request)
    content = response.read()
    content_str = str(content, encoding="gbk")
    data = json.loads(content_str)
    item = data["items"][0]

    # Pick the larger raw probability before formatting; comparing the
    # formatted percentage strings would compare them character by character
    prob = '%.2f%%' % (max(item["positive_prob"], item["negative_prob"]) * 100)
    confidence = '%.2f%%' % (item["confidence"] * 100)
    sentiment = item["sentiment"]

    # 0: negative, 1: neutral, 2: positive
    prob_text = {0: "negative", 1: "neutral", 2: "positive"}.get(sentiment, "unknown")

    analysis_text = ("Your goddess, blogger " + nick_name + ", posted a Weibo with a mood score of " + prob
                     + ", suspected to be a " + prob_text + " post. Come and see. Confidence: " + confidence
                     + ". The original post read: " + text)
    return analysis_text


if __name__ == '__main__':
    print(get_time() + " Young man, the nightmare begins!")
    scheduler.add_job(start_task, "interval", seconds=6, id="wb")
    scheduler.start()

The code cannot be used as-is; you need to fill in several values by hand: the Weibo user ID, the Baidu access_token, and the ServerChan key. That is all;
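One detail worth keeping in mind in a script like this: when choosing which probability to report, compare the numeric values before turning them into percentage strings, because strings are compared character by character. A minimal sketch follows; `dominant_probability` is a hypothetical helper, not a function from the script.

```python
def dominant_probability(positive_prob, negative_prob):
    """Pick the larger raw probability, then format it as a percentage string."""
    larger = max(positive_prob, negative_prob)
    return "%.2f%%" % (larger * 100)


if __name__ == "__main__":
    # As strings, "9.00%" sorts after "52.14%" because "9" > "5"
    print("9.00%" > "52.14%")
    # As numbers, the comparison gives the intended answer
    print(dominant_probability(0.09, 0.521441))
```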

Result

With the push above, you get as much information as possible, along with the corresponding sentiment value. Still, what a woman means depends on the situation.

For example, a quarrel about breaking up is really a request to be coaxed and hugged; or, after marriage, when she cannot bear to spend the money on something, buy it for her quietly;

This kind of meaning cannot be judged without context;

Don't try to guess a woman's mind; buying, coaxing, and doting is the right way;

By the way, all of this presumes you have a boyfriend or girlfriend; otherwise, you might as well buy some skin care products to comfort your right hand;

Summary

This article introduced sentiment analysis: scoring by hand with a lexicon, and also the interfaces of the BAT platforms. Apart from Baidu, which offers a free interface, the others charge, and not cheaply. For debugging or internal use, Baidu is fine; heavy usage may incur charges, but I did not find specific documentation, so I did not dig further;

Along the way, I learned Python's docopt module, which extracts the content of the help message and then parses the arguments passed in from the command line.

Trying out the BAT platforms also shows that every interface call requires security credentials or authorization, for the sake of security. This is worth learning from. Think back: can your internal interfaces be called directly without verification? Could they be used by third parties?
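As a rough illustration of what verifying a caller can look like for an internal interface, here is a minimal sketch of request signing with a shared secret, using only the standard library. It is not the scheme used by any of the platforms above; the message layout and the names `sign` and `verify` are assumptions made for the example.

```python
import hashlib
import hmac


def sign(secret_key, method, path, body):
    """Compute an HMAC-SHA256 signature over the parts of a request."""
    message = method.upper().encode("utf-8") + b"\n" + path.encode("utf-8") + b"\n" + body
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(secret_key, method, path, body, signature):
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign(secret_key, method, path, body)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"shared-secret"  # made-up value for the demo
    body = b'{"text": "hello"}'
    signature = sign(key, "POST", "/internal/sentiment", body)
    print(verify(key, "POST", "/internal/sentiment", body, signature))
    print(verify(key, "POST", "/internal/sentiment", b'{"text": "tampered"}', signature))
```

A caller without the secret cannot produce a valid signature, and a request altered in transit no longer matches, which addresses both of the questions raised above.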

Finally, to those who have a girlfriend: I wish you happiness, a smooth road, and a wedding as soon as possible. To those who do not: learn to chat, stay confident, don't be too stiff, and above all have ambition, energy, and empathy. If you were a girl, would you like yourself?

Finally, thank you!