The master leads you through the door (the rest is up to you):
【ECAI 2020.09.04】 Knowledge Graph Embeddings: From Theory to Practice — www.bilibili.com/video/BV17z…
1. KGE theory
1.1 Introduction
- KG
- Introduction to CWA and OWA: the closed-world assumption and the open-world assumption
- KG's main ML tasks: link prediction / triple classification (the focus), collective node classification / link-based clustering, entity matching
- Focus on link prediction (an information-retrieval-style ranking task) and triple classification (a binary classification task)
- Traditional relational learning methods: rule/constraint-based and graphical-model-based (CRF, PRM, RMN, RDN). Their drawbacks: they are not differentiable, are limited by the size of the KG, and have limited model capacity.
- Drawing on the idea of representation learning, relational learning methods based on graph representation learning: node representations (PRA, LINE, DeepWalk, node2vec), graph neural networks (GNNs, GCNs, graph attention networks), knowledge graph embeddings (TransE, DistMult, ComplEx, ConvE)
- A brief history of KGE
1.2 KGE model
- Overall framework (optimizer, ranking layer, scoring function, loss function, negative sampling)
- Scoring function
- Translation-based scoring functions (translational models): TransE, RotatE
- Factorization-based scoring functions (factorization models): RESCAL, DistMult, ComplEx (complex-space embeddings)
- Deeper scoring functions (deep models): ConvE, ConvKB
- Other recent models: HolE, SimplE, QuatE, MuRP…
- Loss function
- Pairwise hinge loss (soft margin, borrowed from SVMs)
- Negative log-likelihood / cross-entropy
- Binary cross-entropy
- Self-adversarial loss
- Regularization method
- L1, L2, L3
- Dropout(ConvE)
- Initialization method
- Random (uniform, normal): initialization from different distribution functions
- Glorot (Xavier)
- Negative sampling
- Local Closed-World Assumption (LCWA)
- Given a true triple, the relation is kept fixed and a different S or O is substituted in; the resulting corrupted triple is treated as a negative.
- Methods: uniform sampling (random), complete set, 1-N scoring (batch); see the sketch at the end of this section.
- Training steps and optimizers
- Optimizers: SGD, Adam, …
- Reciprocal Triples
- Model selection
- The grid search
- Random search
- Quasi-random search plus Bayesian optimization
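To make this anatomy concrete, here is a minimal NumPy sketch (illustrative only; function names such as `score_transe` are my own, not from the tutorial) of two scoring functions plus uniform negative sampling under the local closed-world assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 100, 10, 50

# Randomly initialized embedding tables (Glorot etc. would be used in practice)
E = rng.normal(size=(n_entities, dim))   # entity embeddings
R = rng.normal(size=(n_relations, dim))  # relation embeddings

def score_transe(s, p, o):
    # Translation-based: higher score = more plausible, so negate the L1 distance
    return -np.linalg.norm(E[s] + R[p] - E[o], ord=1)

def score_distmult(s, p, o):
    # Factorization-based: trilinear product <e_s, r_p, e_o>
    return float(np.sum(E[s] * R[p] * E[o]))

def corrupt(triple, n_neg=5):
    # Uniform negative sampling (LCWA): replace the subject or the object
    s, p, o = triple
    negs = []
    for _ in range(n_neg):
        e = int(rng.integers(n_entities))
        negs.append((e, p, o) if rng.random() < 0.5 else (s, p, e))
    return negs

pos = (3, 1, 7)
print("positive:", score_transe(*pos))
for neg in corrupt(pos):
    print("negative:", neg, score_transe(*neg))
```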
1.3 Evaluation and Benchmarks
- Evaluation metrics (see the toy computation at the end of this section)
- Mean Rank (MR): the smaller the better
- Mean Reciprocal Rank (MRR): the larger the better
- Hits@N: the larger the better
- Benchmark Datasets
- FB15k-237
- WN18RR
- YAGO3-10
- SOTA Results
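As a sanity check on these definitions, a toy computation of MR, MRR, and Hits@N from a handful of made-up ranks (the rank values are invented purely for illustration):

```python
import numpy as np

# Ranks of the true entities among all candidates (1 = ranked first)
ranks = np.array([1, 3, 2, 120, 8])

mr = ranks.mean()                  # Mean Rank: lower is better
mrr = (1.0 / ranks).mean()         # Mean Reciprocal Rank: higher is better
hits_at_10 = (ranks <= 10).mean()  # Hits@10: higher is better

print(f"MR={mr:.1f}  MRR={mrr:.3f}  Hits@10={hits_at_10:.2f}")
# MR=26.8  MRR=0.393  Hits@10=0.80
```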
1.4 Advanced KGE Topics
- Calibration
- It is mainly used in scenarios requiring high interpretability of models, such as drug discovery, protein prediction, and financial risk control
- Multimodal Knowledge Graphs
- Temporal Knowledge Graphs
- Uncertain Knowledge Graphs
- Robustness
- KGE & Neuro-Symbolic Reasoning
- Neural Theorem Provers (NTPs)
- Analogical Reasoning
- Querying with embeddings (Query2vec)
1.5 Future Research Directions
- More expressive models
- Support for multimodality
- Robustness & interpretability
- Better benchmarks
- Beyond link prediction
- Neuro-symbolic integration
2. KGE application
- Pharmaceutical Industry
- Human Resources
- Products
- Food & Beverage
3. Introduction to KGE-related software libraries
- OpenKE
- AmpliGraph
- DGL-KE
4. KGE practice
Software used: AmpliGraph. Python environment: Python >= 3.7. Deep learning environment: TensorFlow >= 1.15.2.
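A minimal environment setup sketch (the package names are the standard PyPI ones; AmpliGraph 1.x assumes TensorFlow 1.x, so the pin below is an assumption to match the versions above):

```python
# Setup sketch: AmpliGraph 1.x runs on TensorFlow 1.x
!pip install "tensorflow>=1.15.2,<2.0"
!pip install ampligraph
```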
5.TransE Code
TransE is a staple of KGE (2013). Borrowing the idea of "semantic translation" from the word2vec paper, it transfers translation invariance to the triple ⟨s, p, o⟩: the knowledge graph is vectorized so that ||h + r − t|| ≈ 0 for true triples, which determines the representation of relations. Many subsequent KGE models improved on TransE's basis.
The downside of TransE is its inability to model one-to-many and many-to-many relations.
How to understand one-to-one, one-to-many, many-to-many? An uploader on Bilibili explains it very well: link
- If we simply require h + r == t, then Sima Yi's concubines would all get similar vectors, and it would appear that Madam Zhang equals Madam Fu (see the small sketch after this list).
- Likewise, if the relation also holds in reverse, e.g. (Cao Cao, appreciates, Sima Yi) together with (Sima Yi, appreciates, Cao Cao), TransE pushes the relation vector toward zero, which amounts to Cao Cao equalling Sima Yi. In real training this does not quite happen, because most triples in the data do not follow this pattern, so a degenerate triple like (Cao Cao, appreciates, Cao Cao) is treated as noise that degrades the model.
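A tiny NumPy illustration of the collapse described above (synthetic vectors, purely for intuition): if h + r must match every tail of a one-to-many relation, the tails are forced to coincide.

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 8
sima_yi = rng.normal(size=dim)    # head entity vector h
concubine = rng.normal(size=dim)  # relation vector r

# If TransE fits (Sima Yi, concubine, Madam Zhang) and
# (Sima Yi, concubine, Madam Fu) perfectly, both tails equal h + r:
madam_zhang = sima_yi + concubine
madam_fu = sima_yi + concubine

# Two distinct entities collapse onto the same vector
print(np.allclose(madam_zhang, madam_fu))  # True
```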
Method 1: lightKG
lightKG overview
- lightKG is a knowledge graph deep learning framework based on PyTorch and TorchText. It covers some simple algorithms in the knowledge graph field; it is lightweight and simple, suitable for beginners. GitHub repo
- To use it, first import the algorithm model from lightKG and create the corresponding model object; then call the object's train method to train and save the model. For testing, load the trained model first and then test. Taking model Y under module x as an example:
```python
from lightkg.x import Y  # import the algorithm model
Ymodel = Y()             # create a model object
Ymodel.train(...)        # train the model and save it
Ymodel.load(...)         # load the trained model
Ymodel.test(...)         # test the model
```
- Method overview:
TransE regards relations as translations between entities: if a triple (head, relation, tail) holds, the sum of the head vector h and the relation vector r should be close to the tail vector t, and far from it otherwise.
Task description
- Knowledge in a knowledge graph is usually expressed as triples (head entity, relation, tail entity).
- Link prediction aims to predict the third element of a triple given the other two, i.e. (?, relation, tail entity), (head entity, relation, ?) and (head entity, ?, tail entity), where ? is the element to be predicted; these are called head entity prediction, tail entity prediction, and relation prediction respectively.
- Data set: a link prediction data set from GitHub, which can be downloaded here. Each row has three fields, separated by ",", representing the head entity, the relation, and the tail entity. A data sample follows:
```
science, contains, natural, social, thinking and other fields of science
science, foreign name, science
science, pinyin, kē xué
science, Chinese name, science
science, explanation, the application and practice of discovered and accumulated truths
```
Required dependencies
Test environment: Python 3.6.8, PyTorch 1.4.0

```
# Required dependencies
torchtext >= 0.4.0
tqdm >= 4.28.1
torch >= 1.0.0
pytorch_crf >= 0.7.0
scikit_learn >= 0.20.2
networkx >= 2.2
revtok
jieba
regex
```

Install the lightKG library before running:

```python
!pip install -i https://pypi.douban.com/simple/ lightkg
```
TransE training
Based on the above translation assumption, TransE defines the triple score function as: $f(h, r, t) = -\|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_{L_1/L_2}$
The distance is measured with the $L_1$ or $L_2$ norm (the $L_1$ norm is used in this tutorial). The score function measures the plausibility of a triple: the higher the score, the more likely the triple is valid. Thus positive triples score high and negative triples score low. Because there are relatively few relations, negatives are generated only by replacing the head or tail entity.
Based on this principle, the positive triples in the knowledge graph and their corresponding negative triples are modeled in a way similar to support vector machines: the margin-based loss is minimized so that positive triples score at least a margin $\gamma$ higher than negative ones, namely:

$$L = \sum_{(h,r,t) \in \triangle} \; \sum_{(h',r',t') \in \triangle'} \max\big(0,\; \gamma - f(h,r,t) + f(h',r',t')\big)$$

where $\triangle$ is the set of positive triples in the knowledge graph and $\triangle'$ is the set of negative triples derived from $\triangle$ by replacing the head or tail entity of each positive triple. Minimizing this loss yields vector representations of entities and relations.
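A minimal PyTorch sketch of this margin-based loss (illustrative only; lightKG's internal implementation may differ):

```python
import torch
import torch.nn.functional as F

def margin_ranking_loss(pos_score, neg_score, gamma=1.0):
    # Zero loss once every positive outscores its negative by at least gamma
    return F.relu(gamma - pos_score + neg_score).mean()

pos = torch.tensor([0.9, 0.8])  # scores f(h, r, t) of positive triples
neg = torch.tensor([0.1, 0.7])  # scores of the corrupted counterparts
print(margin_ranking_loss(pos, neg))  # tensor(0.5500)
```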
View the data set
```python
import pandas as pd

train = pd.read_csv('datasets/train.sample.csv', header=None)
train.head(20)
```
```python
import os
from lightkg.krl import KRL
from lightkg.krl.config import DEFAULT_CONFIG

# The default epoch is 1000; change it to 10 here
DEFAULT_CONFIG['epoch'] = 10

# data path
dataset_path = 'datasets/train.sample.csv'
model_type = 'TransE'

# Initialize the instance
krl = KRL()
if not os.path.exists('./temp/models'):
    os.makedirs('./temp/models')

# training
krl.train(dataset_path,
          model_type=model_type,
          dev_path=dataset_path,
          save_path='./temp/models/LP_{}'.format(model_type))
```
Prediction

```python
# Load the trained model
krl.load(save_path='./temp/models/LP_{}'.format(model_type), model_type=model_type)
krl.test(dataset_path)
```
After loading the model, the corresponding predict_head, predict_tail, and predict_rel methods can be called to predict head entities, tail entities, or relations.
```python
# print helper
def topk_print(l, name):
    print("{} prediction:".format(name))
    for i in l:
        print(i)
    print()

# Head entity prediction
head_pred = krl.predict_head(rel='Foreign name', tail='science', topk=3)
# Tail entity prediction
tail_pred = krl.predict_tail(head='science', rel='Foreign name', topk=3)
# Relation prediction
relation_pred = krl.predict_rel(head='science', tail='science', topk=3)

print("Triple: science - Foreign name - science\n")
topk_print(head_pred, "Head entity")
topk_print(tail_pred, "Tail entity")
topk_print(relation_pred, "Relation")
```
Method 2: AmpliGraph
The ECAI 2020 tutorial provides an excellent video walkthrough and Jupyter Notebook tutorials.
Note: AmpliGraph requires TensorFlow 1.14.0 or above.
ECAI_2020_KGE_Tutorial_Hands_on_Session.ipynb
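For reference, a minimal AmpliGraph 1.x training sketch in the spirit of the tutorial notebook (the hyperparameters below are illustrative, not the tutorial's exact settings):

```python
import numpy as np
from ampligraph.datasets import load_fb15k_237
from ampligraph.latent_features import TransE
from ampligraph.evaluation import evaluate_performance, mr_score, mrr_score, hits_at_n_score

X = load_fb15k_237()  # dict with 'train', 'valid', 'test' splits

model = TransE(batches_count=64, epochs=20, k=100, eta=20,
               loss='pairwise', loss_params={'margin': 1},
               optimizer='adam', optimizer_params={'lr': 1e-4},
               seed=0, verbose=True)
model.fit(X['train'])

# Filtered evaluation: known true triples are removed from the corruptions
filter_triples = np.concatenate([X['train'], X['valid'], X['test']])
ranks = evaluate_performance(X['test'], model=model, filter_triples=filter_triples)

print("MR:", mr_score(ranks))
print("MRR:", mrr_score(ranks))
print("Hits@10:", hits_at_n_score(ranks, n=10))
```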
Method 3: DGL-KE (command-line training)
DGL-KE is a KGE tool from Amazon built on the DGL framework. DGL supports the PyTorch, MXNet, and TensorFlow backends, but DGL-KE currently only works with PyTorch and can only be run from the command line on Ubuntu or macOS. It can also be trained on Colab; in the end, all you need from it is the embeddings.
Train KGE on Colab with DGL-KE
Create a data folder from the command line and prepare the custom files:

```python
# Refer to https://github.com/MaastrichtU-IDS/KGRulEm/blob/7a696485f9506ba6af886b6cc86658a5fa6c696b/embeddings/Train_embeddings.ipynb
!mkdir my_task
```
```python
# Handle custom files
import os
import numpy as np
import pandas as pd

triples_path = "./data/freebase-237-merged-and-remapped.csv"
df = pd.read_csv(triples_path, names=['sub', 'pred', 'obj'])
triples = df.values.tolist()
print(len(triples))

# Please make sure the output directory exists.
num_triples = len(triples)
seed = np.arange(num_triples)
np.random.seed(666)  # fixed random seed
np.random.shuffle(seed)

# 90% / 5% / 5% train / valid / test split
train_cnt = int(num_triples * 0.9)
valid_cnt = int(num_triples * 0.05)
train_set = seed[:train_cnt].tolist()
valid_set = seed[train_cnt:train_cnt + valid_cnt].tolist()
test_set = seed[train_cnt + valid_cnt:].tolist()

with open("./data/FB15K237_train.tsv", 'w+') as f:
    for idx in train_set:
        f.writelines("{}\t{}\t{}\n".format(triples[idx][0], triples[idx][1], triples[idx][2]))
with open("./data/FB15K237_valid.tsv", 'w+') as f:
    for idx in valid_set:
        f.writelines("{}\t{}\t{}\n".format(triples[idx][0], triples[idx][1], triples[idx][2]))
with open("./data/FB15K237_test.tsv", 'w+') as f:
    for idx in test_set:
        f.writelines("{}\t{}\t{}\n".format(triples[idx][0], triples[idx][1], triples[idx][2]))
```
Use the command line:

```python
!DGLBACKEND=pytorch dglke_train --dataset FB15K237 --data_path ./data \
    --data_files FB15K237_train.tsv FB15K237_valid.tsv FB15K237_test.tsv \
    --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 1000 \
    --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 500 \
    --log_interval 100 --batch_size_eval 16 -adv --regularization_coef 1.00e-09 \
    --test --num_thread 1 --num_proc 8
```
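After training, DGL-KE saves the learned embeddings as .npy files under its checkpoint directory. A loading sketch follows; the exact folder and file names below are assumptions, so check the end of the training log for the real save path:

```python
import numpy as np

# Hypothetical checkpoint path; dglke_train prints the real one when training ends
ckpt = 'ckpts/TransE_l2_FB15K237_0'
entity_emb = np.load('{}/FB15K237_TransE_l2_entity.npy'.format(ckpt))
relation_emb = np.load('{}/FB15K237_TransE_l2_relation.npy'.format(ckpt))
print(entity_emb.shape, relation_emb.shape)  # e.g. (num_entities, 400), (num_relations, 400)
```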
Additional code implementation resources
[2] TransE knowledge graph completion on the FB15k-237 dataset (Python implementation)
[3] Analysis of the basic code of knowledge graph embedding techniques
[4] KG notes
I am an NLP newbie with limited knowledge. If anything is wrong or incomplete, please criticize and correct me!