Article source | Turbine Cloud Community

Original link | PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Original author | Mathor


Abstract

This paper decomposes the task of relation extraction into three subtasks: relation judgement, entity extraction, and subject-object alignment, and proposes a joint relational triple extraction framework based on Potential Relation and Global Correspondence (PRGC). Specifically, a component that predicts potential relations is designed first, so that subsequent entity extraction is restricted to the subset of predicted relations rather than all relations. Then, a relation-specific sequence tagging component handles the overlap between subjects and objects. Finally, a global correspondence component aligns subjects and objects into triples with low complexity. PRGC achieves a new SOTA on two public datasets.

1 Introduction

Relational triple extraction identifies (subject, relation, object) triples from unstructured text. This paper decomposes it into three subtasks: 1. Relation judgement: identify the potential relations contained in a sentence; 2. Entity extraction: identify subjects and objects in the sentence; 3. Subject-object alignment: align each subject-object pair into a triple.

For relation judgement: this paper uses a Potential Relation Prediction component to predict only the potential relations, instead of keeping all the redundant relations. This reduces computational complexity and yields better performance, especially in entity extraction. For entity extraction: this paper uses a more robust Relation-Specific Sequence Tagging component (Rel-Spec Sequence Tagging for short) to extract subjects and objects separately, which naturally handles the overlap between them. For subject-object alignment: this paper designs a relation-independent Global Correspondence matrix to determine whether a specific subject-object pair is valid in a triple.

Given a sentence, PRGC first predicts a subset of potential relations and a global matrix containing the correspondence scores between all subjects and objects. Then sequence tagging is performed in parallel to extract subjects and objects for each potential relation. Finally, all predicted entity pairs are enumerated and pruned by the global correspondence matrix.

2 Method

2.1 PROBLEM DEFINITION

The input is a sentence with $n$ tokens, $S = \{x_1, x_2, \dots, x_n\}$. The expected output is the set of relational triples $T(S) = \{(s, r, o) \mid s, o \in E,\ r \in R\}$, where $E$ and $R$ denote the entity set and the relation set respectively.
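As a concrete illustration (this sentence and relation name are invented for the example, not drawn from the paper's datasets):

```python
# Input: one sentence; output: a set of (subject, relation, object) triples.
sentence = "Obama was born in Hawaii"
expected_triples = {("Obama", "birthplace", "Hawaii")}
```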

2.1.1 Relation Judgement

For a given sentence $S$, this subtask predicts the potential relations contained in $S$. The output is $Y_r(S) = \{r_1, r_2, \dots, r_m \mid r_i \in R\}$, where $m$ is the size of the potential relation subset.

2.1.2 Entity Extraction

For a given sentence $S$ and a predicted potential relation $r_i$, this subtask tags each token using the BIO tagging scheme, where $t_j$ denotes the tag of the $j$-th token. The output is $Y_e(S, r_i \mid r_i \in R) = \{t_1, t_2, \dots, t_n\}$.

2.1.3 Subject-Object Alignment

For a given sentence $S$, this subtask predicts the correspondence score between the start tokens of each subject and object, so that subject-object pairs belonging to true triples score higher. The output is $Y_s(S) = M \in \mathbb{R}^{n \times n}$, where $M$ denotes the global correspondence matrix.

2.2 PRGC ENCODER

The sentence $S$ is encoded by BERT. The encoder output is $Y_{enc}(S) = \{h_1, h_2, \dots, h_n\}$, where $n$ is the number of tokens.
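A minimal sketch of this step using the Hugging Face transformers library (the checkpoint name and example sentence are illustrative choices, not specified here):

```python
import torch
from transformers import BertModel, BertTokenizerFast

# Encode a sentence into one d-dimensional vector per token (h_1, ..., h_n).
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
encoder = BertModel.from_pretrained("bert-base-cased")

inputs = tokenizer("Obama was born in Hawaii", return_tensors="pt")
with torch.no_grad():
    h = encoder(**inputs).last_hidden_state  # (1, n, d); d = 768 for BERT-Base
```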

2.3 PRGC DECODER

2.3.1 Potential Relation Prediction

In the figure, $R^{pot}$ denotes the potential relations.

Given a sentence $S$, the component first predicts the subset of potential relations that may exist in the sentence, and entity extraction is then performed only for those potential relations. Given the $n$ token embeddings $\mathbf{h} \in \mathbb{R}^{n \times d}$, the relation probabilities are computed as

$$P_{rel} = \sigma\big(\mathbf{W}_r \cdot \mathrm{Avgpool}(\mathbf{h}) + \mathbf{b}_r\big)$$

where $\mathrm{Avgpool}$ is the average pooling operation, $\mathbf{W}_r \in \mathbb{R}^{d \times 1}$ is a trainable weight, and $\sigma$ is the sigmoid function.

In this paper, potential relation prediction is modeled as a multi-label binary classification task: if the probability of a relation exceeds a threshold $\lambda_1$, that relation is labeled 1, otherwise 0. Relation-specific sequence tagging then only needs to be applied to the predicted relations, not all of them.
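A minimal PyTorch sketch of this component, assuming the per-relation weight vectors $\mathbf{W}_r \in \mathbb{R}^{d \times 1}$ are stacked into one linear layer over all $n_r$ relations; the class name, bias term, and threshold value are illustrative assumptions:

```python
import torch
import torch.nn as nn

class PotentialRelationPrediction(nn.Module):
    """Sketch: average-pool the token embeddings, then perform
    multi-label binary classification over all n_r relations."""

    def __init__(self, d: int, n_r: int):
        super().__init__()
        # One d-dimensional weight vector per relation, stacked into a
        # single linear layer (bias included as an assumption).
        self.linear = nn.Linear(d, n_r)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, n, d) token embeddings from the BERT encoder
        h_avg = h.mean(dim=1)                     # Avgpool over tokens -> (batch, d)
        return torch.sigmoid(self.linear(h_avg))  # (batch, n_r) probabilities

# Relations whose probability exceeds lambda_1 are kept as potential relations.
p_rel = PotentialRelationPrediction(d=768, n_r=24)(torch.randn(2, 100, 768))
potential = p_rel > 0.5  # lambda_1 = 0.5 is a hypothetical threshold
```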

2.3.2 Relation-Specific Sequence Tagging

As shown in Figure 1, several relation-specific sentence representations are obtained for the potential relations predicted by the component in Section 2.3.1. The model then performs two sequence tagging operations to extract subjects and objects separately.

The author extracts subjects and objects separately in order to handle a special overlap pattern, subject-object overlap (SOO). The author abandons the traditional LSTM-CRF network and instead uses a simple fully connected layer for entity recognition. For each token, this component computes

$$P_{i,j}^{sub} = \mathrm{Softmax}\big(\mathbf{W}_{sub}(\mathbf{h}_i + \mathbf{u}_j) + \mathbf{b}_{sub}\big), \qquad P_{i,j}^{obj} = \mathrm{Softmax}\big(\mathbf{W}_{obj}(\mathbf{h}_i + \mathbf{u}_j) + \mathbf{b}_{obj}\big)$$

where $\mathbf{u}_j \in \mathbb{R}^{d \times 1}$ is the $j$-th relation representation from the trainable embedding matrix $\mathbf{U} \in \mathbb{R}^{d \times n_r}$, $n_r$ is the size of the full relation set, $\mathbf{h}_i \in \mathbb{R}^{d \times 1}$ is the encoded representation of the $i$-th token, and $\mathbf{W}_{sub}, \mathbf{W}_{obj} \in \mathbb{R}^{d \times 3}$ are trainable weights.
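A minimal PyTorch sketch of the tagging component under the equations above, assuming the token and relation representations are fused by a simple sum; class and variable names are illustrative:

```python
import torch
import torch.nn as nn

class RelSpecificSequenceTagging(nn.Module):
    """Sketch: fuse each token representation h_i with the relation
    embedding u_j (a simple sum, as in the equations above), then tag
    every token with one of the 3 BIO labels, separately for subjects
    and objects."""

    def __init__(self, d: int, n_r: int):
        super().__init__()
        self.rel_embed = nn.Embedding(n_r, d)  # U: one d-dim vector per relation
        self.sub_tagger = nn.Linear(d, 3)      # W_sub: B/I/O scores for subjects
        self.obj_tagger = nn.Linear(d, 3)      # W_obj: B/I/O scores for objects

    def forward(self, h: torch.Tensor, rel_ids: torch.Tensor):
        # h: (n, d) token embeddings; rel_ids: (m,) predicted potential relations
        u = self.rel_embed(rel_ids)              # (m, d)
        fused = h.unsqueeze(0) + u.unsqueeze(1)  # (m, n, d): h_i + u_j
        p_sub = self.sub_tagger(fused).softmax(dim=-1)  # (m, n, 3)
        p_obj = self.obj_tagger(fused).softmax(dim=-1)  # (m, n, 3)
        return p_sub, p_obj
```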

2.3.3 Global Correspondence

After sequence tagging, all candidate subjects and objects for the sentence's relations are obtained, and the global correspondence matrix is then used to determine the correct subject-object pairs. Note that the global correspondence matrix can be learned jointly with potential relation prediction, because it is independent of the relations.

The specific procedure is as follows: first, enumerate all possible subject-object pairs; then look up the correspondence score of each pair in the global matrix, keeping the pair if its score exceeds a threshold $\lambda_2$ and filtering it out otherwise. The green matrix in the figure, $M \in \mathbb{R}^{n \times n}$, is the global correspondence matrix over the $n$ tokens. Each element of the matrix corresponds to the start positions of a subject-object pair, and its value represents the confidence of that pair: the higher the value, the higher the confidence of the triple. Each element of the matrix is computed as

$$P^{sub,obj}_{i,j} = \sigma\big(\mathbf{W}_g \, [\mathbf{h}_i^{sub};\, \mathbf{h}_j^{obj}] + \mathbf{b}_g\big)$$

where $\mathbf{h}_i^{sub}, \mathbf{h}_j^{obj} \in \mathbb{R}^{d \times 1}$ are the encoded representations of the $i$-th and $j$-th tokens in the input sentence, forming a potential subject-object pair; $\mathbf{W}_g \in \mathbb{R}^{2d \times 1}$ is a trainable weight, and $\sigma$ is the sigmoid function.
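A minimal PyTorch sketch of the global correspondence score and the $\lambda_2$ filtering step; names and the threshold value are illustrative:

```python
import torch
import torch.nn as nn

class GlobalCorrespondence(nn.Module):
    """Sketch: concatenate the representations of a candidate subject
    start token h_i and object start token h_j, then map each pair to a
    confidence score in (0, 1), giving the n x n matrix M."""

    def __init__(self, d: int):
        super().__init__()
        self.linear = nn.Linear(2 * d, 1)  # W_g (bias included as an assumption)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        n, d = h.shape
        # Build all n x n start-token pairs [h_i ; h_j] -> (n, n, 2d)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, d), h.unsqueeze(0).expand(n, n, d)],
            dim=-1,
        )
        return torch.sigmoid(self.linear(pairs)).squeeze(-1)  # M: (n, n)

# Candidate subject-object pairs are kept only if their score exceeds lambda_2.
M = GlobalCorrespondence(d=768)(torch.randn(100, 768))
keep = M > 0.5  # lambda_2 = 0.5 is a hypothetical threshold
```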

2.4 TRAINING STRATEGY

where $n_r$ denotes the size of the relation set and $n_r^{pot}$ denotes the size of the potential relation subset. The total loss is the sum of the three sub-task losses:

$$\mathcal{L}_{total} = \mathcal{L}_{rel} + \mathcal{L}_{seq} + \mathcal{L}_{global}$$
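A minimal sketch of how the three losses might be combined. The specific loss choices below are assumptions inferred from the task types, and all tensor names are illustrative:

```python
import torch.nn.functional as F

def prgc_total_loss(p_rel, gold_rel, tag_logits, gold_tags, M, gold_M):
    """Sketch of the total loss as the sum of the three sub-task losses:
    binary cross-entropy for relation judgement and the correspondence
    matrix, cross-entropy over the 3 BIO tags for sequence tagging."""
    loss_rel = F.binary_cross_entropy(p_rel, gold_rel)          # multi-label relations
    loss_seq = F.cross_entropy(tag_logits.reshape(-1, 3),       # tag_logits: pre-softmax
                               gold_tags.reshape(-1))           # scores per token
    loss_global = F.binary_cross_entropy(M, gold_M)             # n x n matrix
    return loss_rel + loss_seq + loss_global
```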

3 Experiments

This paper uses $PRGC_{Random}$, in which all parameters of the BERT encoder are randomly initialized, to verify the effectiveness of the PRGC decoder. $PRGC_{Random}$ shows that even without the pre-trained BERT language model, the decoder framework is still more competitive and robust than other decoder frameworks.

Model settings: BERT-Base as the encoder, a maximum sentence length of 100, a V100 GPU, and 100 training epochs.

4 Takeaways

  1. Predicting potential relations first, then extracting entities only for those relations, and finally aligning subjects and objects improves the model's decoding speed and reduces its computational cost.
  2. The higher the potential relation prediction threshold $\lambda_1$, the better the model performs.
  3. Tuning the three loss functions is a substantial piece of work.
  4. If sentences are too long, subject-object alignment consumes a large amount of memory, since the global correspondence matrix is $n \times n$.