Author: Caroline Brun
Takeaway
This article gives a very good summary of sentiment analysis as a research field and of its prospects.
Fake news, popularity, health, financial and social risk: sentiment analysis is helping to make sense of it all. It may even lead us to understand why…
People have always been interested in each other’s opinions. The scientific study of opinions, however, dates back only to the early 20th century, when attempts were made to capture, analyze and quantify public opinion from questionnaires. Around the same time, in 1937, the academic journal Public Opinion Quarterly was launched. But it was the emergence and adoption of social media platforms that led to the creation of the research field of “sentiment analysis”, which analyzes this massive online source of unstructured opinion.
Broadly speaking, sentiment analysis applies text analysis to determine what people think. It is one of the most attractive use cases for natural language processing (NLP) and is of interest to both industry and academia. In sentiment analysis, NLP-based data mining processes and techniques are applied to extract and analyze subjective information from user-generated content (UGC), much of which comes from social media simply because there is so much of it.
It makes it possible to measure sentiment (opinions, but also emotions) toward particular products, people, or ideas. Sentiment analysis has traditionally been about the polarity of opinion, that is, whether a person has a positive, neutral or negative view of someone or something, but it can also target specific aspects of a person or object.
Its popularity stems naturally from its wide range of uses. Business applications such as customer service, business intelligence and product or brand reputation management are particularly prominent. In healthcare, it could be used to detect anomalies such as adverse drug reactions. It can help monitor criminal activity or sentiment in financial markets, and it can also measure how the public views political candidates.
Today, any event can be posted, viewed, commented on and shared at lightning speed on social media, potentially reaching millions of people. Sentiment analysis is an important tool for making sense of these events and reacting when necessary.
How has sentiment analysis evolved over time
As social media channels have developed, the research tasks and methods of sentiment analysis have multiplied. In its early days, sentiment analysis simply assigned a single, global polarity label (positive, negative, and sometimes neutral) to English-language customer comments. Current research includes sentence-level topic detection, aspect-based sentiment analysis, sentiment analysis of figurative language, topic-based polarity classification, and implicit polarity classification of events, such as identifying “pleasant” or “unpleasant” events when no explicit polarity marker is mentioned. Sentiment classification is now more about stance detection and argument mining across a wide variety of languages and media sources (using Twitter data has become a necessity). Task definitions have thus evolved into more complex challenges in which subjectivity detection, polarity identification, and opinion mining are enriched with fine-grained, aspect-based and topic-based predictions. The concept of polarity has also been supplemented by emotion models defined in psychological research.
Methods, algorithms and resources for sentiment analysis are also evolving. Existing research has produced techniques for many different tasks, using both supervised and unsupervised approaches. In the supervised setting, early papers used a range of machine learning methods (such as support vector machines, maximum entropy, and naive Bayes) with various feature combinations. Unsupervised methods include those based on sentiment lexicons, grammatical analysis, and syntactic patterns. In recent years, the success and popularity of deep learning in other fields has led to its use in sentiment analysis, usually with word embeddings representing the input text.
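To make the supervised setting concrete, here is a minimal sketch of a naive Bayes polarity classifier with add-one smoothing, written in plain Python. The four training sentences, the whitespace tokenization and the function names (train_naive_bayes, predict) are assumptions made for illustration; they are not taken from any of the papers or systems mentioned in this article.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, doc_count in model["labels"].items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(model["words"][label].values())
        for word in text.lower().split():
            if word not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["words"][label][word]
            # add-one (Laplace) smoothing avoids zero probabilities
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example only.
TOY_REVIEWS = [
    ("great phone with a lovely screen", "positive"),
    ("excellent service and great food", "positive"),
    ("terrible battery and awful screen", "negative"),
    ("awful service and terrible food", "negative"),
]
MODEL = train_naive_bayes(TOY_REVIEWS)
```

Calling predict(MODEL, "great food") returns "positive" on this toy data. A real system would need far more annotated examples, which is exactly the cost of supervision discussed in this section.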
In sentiment classification, recurrent neural networks (RNNs), and especially long short-term memory (LSTM) networks with their ability to capture long-distance dependencies, have achieved state-of-the-art results in polarity classification. Attention models have also produced interesting results because they can capture the important information about aspects in a sentence.
Classical NLP coexists with deep learning methods. Classical NLP methods use prior knowledge of language to reduce the amount of supervision needed while remaining accurate on a variety of tasks, including sentiment analysis. The main disadvantage of deep learning methods is that they require large amounts of annotated data, which is costly, especially for complex and structured semantics. For this reason, a current research trend advocates integrating prior linguistic knowledge into deep learning frameworks for text analysis, and some of this work has yielded interesting results in sentiment analysis.
Long-standing challenges and new frontiers
Although a growing body of work has advanced sentiment analysis, dealing with “affective phenomena” in text, such as subjectivity, aspects, emotion, mood, tone, attitude and feeling, has proved to be a complex, interdisciplinary problem that is far from solved. Many parameters must be taken into account, such as the author’s profile, text type, style, domain, document source, target language, and final application. There is also a gap between published experimental results, usually obtained under relatively favorable conditions, and the results a system achieves in a real-world setting.
Natural language
The main barrier to accurate sentiment analysis has always been, and still is, natural language itself, for several reasons.
Natural language is ambiguous, and a word may have different polarities depending on context and domain. For example, the adjective “predictable” may be negative when describing the ending of a movie but positive when describing the quality of a product.
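The “predictable” example can be written down as a domain-dependent polarity lookup. The sketch below is only an illustration: the table DOMAIN_POLARITY, its entries other than “predictable”, and the function word_polarity are hypothetical, and a real lexicon would be learned or curated per domain.

```python
# Hypothetical polarity table keyed by domain, then by word (+1 positive, -1 negative).
DOMAIN_POLARITY = {
    "movie": {"predictable": -1, "gripping": 1},
    "product": {"predictable": 1, "flimsy": -1},
}


def word_polarity(word, domain):
    """Return the polarity of a word in a domain, or 0 if it is unknown."""
    return DOMAIN_POLARITY.get(domain, {}).get(word.lower(), 0)
```

With this table, word_polarity("predictable", "movie") is -1 while word_polarity("predictable", "product") is 1, which is why a single global lexicon is not enough.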
To express their point of view, people often use figurative language such as irony and sarcasm. These are challenging for NLP: machine learning methods are easily misled by words that carry strong polarity but are used sarcastically (meaning that the opposite polarity is intended).
Negation (expressions of falsity) and modality (expressions of necessity, permissibility, and probability, such as “should be” or “could be”) are complex linguistic phenomena that strongly affect the semantics of the expressions used to convey opinions. Handling the scope of negation and modality is particularly important in sentiment analysis.
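A simple and widely used heuristic for negation is to flip the polarity of the words that follow a negator, up to a fixed number of words or the next punctuation mark. The sketch below shows that heuristic only; the word lists, the default window of three words and the function name score_with_negation are assumptions for illustration, and real negation scope depends on syntax, which this does not model.

```python
import re

# Tiny illustrative lexicons (assumed, not from any published resource).
POLARITY = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATORS = {"not", "never", "no"}
PUNCTUATION = set(".,;!?")


def score_with_negation(text, window=3):
    """Sum word polarities, flipping those within `window` words after a negator."""
    tokens = re.findall(r"[a-z']+|[.,;!?]", text.lower())
    score = 0
    negate_left = 0  # how many upcoming words are still inside the negation scope
    for token in tokens:
        if token in PUNCTUATION:
            negate_left = 0  # punctuation closes the scope
        elif token in NEGATORS or token.endswith("n't"):
            negate_left = window  # open a new scope
        else:
            polarity = POLARITY.get(token, 0)
            score += -polarity if negate_left > 0 else polarity
            if negate_left > 0:
                negate_left -= 1
    return score
```

On “The plot was not good.” this returns -1, whereas a plain lexicon lookup would return +1. Modality (“could be good”) is left untouched here and would need its own treatment.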
Understanding context is essential to understanding opinions. Coreference resolution, the ability to identify who or what a pronoun or noun phrase refers to, is a well-known challenge for NLP and an important step in understanding opinions.
Finally, UGC is full of implicit sentiment (factual statements that suggest a positive or negative feeling), which has to be inferred, as in “She is still looking for another Oscar nod. Not here though.” Such expressions refer to desirable or undesirable facts or actions without using any opinionated words, which makes them hard to capture automatically.
Challenging tasks
Sentiment analysis is challenging by nature, but there is growing interest in related tasks that can be even more difficult.
Aspect-based sentiment analysis (ABSA) aims to capture the sentiment expressed in user-generated reviews toward different aspects of entities such as products, movies or companies. An aspect is an attribute of an entity, such as the screen of a phone (as opposed to its weight or size), the service of a restaurant (as opposed to its location or prices), or the image quality of a camera. Aspects can be described by an ontology associated with the entity. ABSA means recognizing both the different aspects of an entity and the sentiment attached to each. Interest in this task has grown recently, especially with the SemEval challenge dedicated to it. Beyond basic baseline testing, it is becoming a “standard” task for sentiment analysis. We developed an ABSA system that achieved the best results in the 2016 SemEval challenge. We are now integrating it into a map search engine to create a perceptual map search for points of interest. We have also studied ABSA in an end-application evaluation setting and created a new annotated ABSA dataset based on FourSquare, which is available from this website (www.europe.naverlabs.com/Research/Na…
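To illustrate what “recognizing aspects and their sentiment” means in practice, here is a deliberately naive sketch. It is not the SemEval system mentioned above: it splits a review into clauses and attaches the opinion words of each clause to the aspect terms found in the same clause. The dictionaries ASPECT_TERMS and OPINION_WORDS and the function aspect_sentiment are hypothetical names with invented entries.

```python
import re

# Hypothetical mapping from surface terms to aspect categories of a restaurant.
ASPECT_TERMS = {
    "food": "food", "pizza": "food",
    "service": "service", "waiter": "service",
    "price": "price", "prices": "price",
}
# Hypothetical opinion lexicon (+1 positive, -1 negative).
OPINION_WORDS = {"delicious": 1, "friendly": 1, "great": 1,
                 "slow": -1, "bland": -1, "high": -1}


def aspect_sentiment(review):
    """Return {aspect: score}, pairing aspects with opinion words in the same clause."""
    results = {}
    # Clause boundaries: punctuation and the conjunctions "but" / "and".
    clauses = re.split(r"[.,;!?]|\bbut\b|\band\b", review.lower())
    for clause in clauses:
        tokens = re.findall(r"[a-z]+", clause)
        aspects = {ASPECT_TERMS[t] for t in tokens if t in ASPECT_TERMS}
        polarity = sum(OPINION_WORDS.get(t, 0) for t in tokens)
        for aspect in aspects:
            results[aspect] = results.get(aspect, 0) + polarity
    return results
```

For “The food was delicious, but the service was slow.” this yields {"food": 1, "service": -1}: one review, two aspects, opposite polarities. A single global label would have hidden that contrast.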
Emotion analysis detects and identifies types of emotion in text, such as anger, disgust, fear, happiness, sadness and surprise. These six basic emotions, proposed by American psychologists Paul Ekman and Wallace V. Friesen, are the most widely used. One of the biggest challenges here is that in most cases the emotion is implicit in the text: for example, a sentence may convey “anger” without using the word “anger” or any of its synonyms. This, combined with the lack of annotated text data, makes the task even more difficult. Standard classification techniques are commonly used today, together with resources such as WordNet Affect or SentiWordNet, and with common-sense knowledge to capture implicit emotions.
Spam and fake detection: Fake reviews and fake news are closely related phenomena that both consist of writing and spreading false information or beliefs. The biggest challenge here is the lack of an effective way to distinguish genuine reviews from fake ones. Even humans have a hard time telling the difference. Once again, we face a serious lack of ground-truth datasets to help us. Most methods focus on the content of the review (its length, specific vocabulary, parts of speech, etc.) and the behavior of the reviewer (e.g. time of publication, frequency of publication, being the first to review an item, etc.).
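The two families of signals listed above, content and reviewer behavior, can be turned into features for a classifier. The sketch below only extracts such features; it does not decide whether a review is fake. The Review record, the HYPE_WORDS list and the feature names are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical list of promotional words used as a "specific vocabulary" signal.
HYPE_WORDS = {"amazing", "best", "perfect", "incredible"}


@dataclass(frozen=True)
class Review:
    reviewer: str
    product: str
    text: str
    posted_at: datetime


def review_features(review, all_reviews):
    """Extract simple content and behavior features for one review."""
    tokens = [w.strip(".,!?") for w in review.text.lower().split()]
    tokens = [t for t in tokens if t]
    hype_hits = sum(1 for t in tokens if t in HYPE_WORDS)
    same_product = [r for r in all_reviews if r.product == review.product]
    same_day_by_reviewer = [
        r for r in all_reviews
        if r.reviewer == review.reviewer
        and r.posted_at.date() == review.posted_at.date()
    ]
    return {
        # content features
        "length": len(tokens),
        "hype_ratio": hype_hits / len(tokens) if tokens else 0.0,
        # behavior features
        "reviews_same_day": len(same_day_by_reviewer),
        "is_first_review": all(review.posted_at <= r.posted_at for r in same_product),
    }
```

These features would then feed a supervised classifier, which brings back the problem noted above: without reliable ground-truth labels there is little to train or evaluate on.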
Multilingual sentiment analysis: Most current sentiment analysis systems handle only English, but online opinions are written in many more languages. Using sentiment analysis tools for only one language greatly increases the risk of missing important information written in others. To address this problem, current methods mainly combine polarity information with multilingual word embeddings.
Multimodal sentiment analysis: With the proliferation of social multimedia, multimodal sentiment analysis offers new opportunities to integrate complementary data streams, such as facial and vocal displays of emotion, which often express feelings very powerfully. Doing so could not only improve text-based sentiment analysis but even go beyond it. The difficulty lies in extracting visual emotion cues under real-world conditions (low resolution, subject motion) and in reliably extracting linguistic and paralinguistic features from audio.
Real-time sentiment analysis: The world generates vast amounts of real-time data every second, much of it unstructured text messages. If we can analyze this data in real time, we can not only find answers to questions quickly but also resolve problems as they arise. This will require developing specialized preprocessing or distributed architectures, as well as algorithms dedicated to online analysis.
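One small building block of online analysis is an aggregate that is updated incrementally as messages arrive, rather than recomputed over the whole history. The sketch below keeps a rolling average of sentiment scores over a time window. The class name RollingSentiment and the assumption that each message already comes with a timestamp (in seconds, non-decreasing) and a numeric score are illustrative choices, not part of any system described here.

```python
from collections import deque


class RollingSentiment:
    """Average sentiment score over the last `window_seconds` of messages."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, score) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, score):
        """Record one scored message and return the current rolling average."""
        self.events.append((timestamp, score))
        self.total += score
        # Drop messages that have fallen out of the time window.
        while self.events and self.events[0][0] <= timestamp - self.window_seconds:
            _, old_score = self.events.popleft()
            self.total -= old_score
        return self.total / len(self.events)
```

Each update costs amortized constant time, so the aggregate keeps up with the stream; the expensive part in practice is producing the per-message scores, which is where distributed processing comes in.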
Finally, argument mining is one of the most challenging directions for future sentiment analysis techniques. Sentiment analysis seeks to understand users’ views on certain aspects, whereas argument mining seeks to find the reasons for those views and the overall line of reasoning behind them.
The main goal is to automatically extract arguments from generic text corpora and to provide structured data for computational models of argument and for inference engines. In theory, argument mining can uncover knowledge that reveals the justification behind general opinions (why people think the way they do), generate fine-grained debate graphs for complex political questions, or improve general opinion mining algorithms. Argument mining is closely related to another emerging task, stance detection, whose (simpler) goal is to determine whether the author of a text supports its (often controversial) target topic. Mining arguments is challenging because it requires a great deal of common-sense, world, domain and contextual knowledge. Many argumentation models have been proposed; they form the basis for annotating arguments in text and have been applied to automatic recognition.
In recent years, deep learning models have been widely used in argument mining to model context, which helps considerably with acquiring world knowledge. However, these models are limited in their ability to derive common-sense and world knowledge automatically from textual data. Argument mining can in some ways be seen as an evolution of sentiment analysis: opinion mining aims to understand what people think about something, while argument mining aims to understand why, that is, it mines people’s arguments for and against in order to reveal their reasoning.
Sentiment analysis is one of the most active research fields in natural language processing, but it is far from being a solved problem. It requires a deep understanding of lexical, syntactic, and semantic rules, combined with background knowledge. In the context of big data, the inherent complexity of natural language and the new, challenging tasks in sentiment analysis mean that there are more fascinating research perspectives on understanding affective language than ever before. What I find most inspiring are complex issues such as detecting implicit emotions, handling multiple languages, detecting deception, analyzing events in real time, and automatically acquiring common-sense, world, and contextual knowledge.
English text: europe.naverlabs.com/blog/new-ho…