Paper references
Bandyopadhyay S, Lokesh N, Murty M N. Outlier Aware Network Embedding for Attributed Networks[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2019, 33: 12-19.
Bandyopadhyay S, Vivek S V, Murty M N. Outlier Resistant Unsupervised Deep Architectures for Attributed Network Embedding[C]//Proceedings of the 13th International Conference on Web Search and Data Mining (WSDM). 2020: 25-33.
1. Background
Node embedding on attributed graphs has made great progress, and deep methods such as the now-prevalent GCNs have achieved strong results. However, if the attributed graph itself contains outliers, they can seriously distort the final embeddings. This paper learns outlier-aware node representations on attributed graphs.
2. Outlier definition
First, the paper defines what an outlier is (in my view this step is the key to everything that follows: only after defining a reasonable notion of outlier can the rest of the work proceed).
In the figure, nodes of the same color belong to one community; normally a community is similar in both attributes and structure.
- Outlier A (structural outlier): belongs to the red community by attributes (normal), but is structurally connected to nodes of other communities.
- Outlier B (attribute outlier): belongs to the red community structurally (normal), but its attributes resemble those of other communities; looking at attributes alone, this node does not follow the red community's pattern.
- Outlier C (combined outlier): structurally it belongs to the green community while its attributes follow the blue community, i.e., it is anomalous in both attributes and structure.
3. Problem definition
G = (V, E, C), where V = {v1, v2, ..., vN}; A is the N×N adjacency matrix and C is the N×K content (attribute) matrix. As is common in graph problems, A is very sparse. The problem in this paper is to learn good low-dimensional node representations while simultaneously performing outlier detection (some outlier-detection methods cannot adapt their models to this sparsity). The setting is unsupervised: there is no outlier label information of any kind. Both the content and the structural outlier detection are based on reconstruction losses.
4. Model
4.1 Learning from the Link Structure
A is the adjacency matrix; G is the low-dimensional representation we ultimately want to learn; H is a transformation matrix. The motivation: the latent vectors are learned by minimizing a reconstruction loss, the squared term above. But such learning only memorizes the majority patterns, while outliers by definition "differ significantly from the others" and can therefore contribute heavily to the loss. The author's idea is to make outliers contribute very little to the final loss; in the extreme, if we erase these anomalies entirely, wouldn't that yield a good representation?
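A minimal sketch of that idea (my own numpy code, not the authors'; the factorization A ≈ G @ H and the per-node score name o_str are assumptions consistent with the description above):

```python
import numpy as np

# Hedged sketch of the outlier-weighted structural reconstruction loss.
# Assumptions: A ~ G @ H with G of shape (N, d), H of shape (d, N),
# and a per-node structural outlier score o_str in (0, 1].

def structural_loss(A, G, H, o_str):
    # log(1/o_i) is large for normal nodes (small o_i) and near zero for
    # outliers (o_i near 1), so outliers barely contribute to the loss.
    w = np.log(1.0 / o_str)                  # per-node weight, shape (N,)
    resid = A - G @ H                        # reconstruction residual, (N, N)
    return np.sum(w[:, None] * resid ** 2)   # weight each node's row
```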
4.2 Learning from the Attributes
The same idea applies to the attribute matrix C: reconstruct it with a low-dimensional factorization (via U and V) and down-weight each node's contribution by its attribute outlier score.
4.3 Connecting Structure and Attributes

The above treats structure and attributes separately; as I understand it, the two loss functions above handle outliers of types A and B respectively. The third type, the combined outlier, shows no anomaly when you look at the attributes alone and none when you look at the structure alone; the anomaly only appears when both are considered together. Therefore the structural and attribute features are constrained to agree, since they are two representations of the same node.
G and U might not be aligned, so the embeddings are transformed by a constrained orthogonal matrix for feature alignment; this is the Procrustes problem. My personal understanding is that leaving W unconstrained would also be acceptable, but then there is no closed-form solution.
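For reference, the orthogonal Procrustes problem min_W ||G − UW||_F with WᵀW = I has a standard closed-form solution via SVD; a small sketch (rotating U toward G, rather than the reverse, is my assumption):

```python
import numpy as np

# Hedged sketch: closed-form solution of the orthogonal Procrustes problem
# min_W ||G - U W||_F subject to W^T W = I.

def procrustes_align(G, U):
    # Standard result: with the SVD  U^T G = P diag(s) Q^T,
    # the optimum is  W = P Q^T.
    P, _, Qt = np.linalg.svd(U.T @ G)
    return P @ Qt  # orthogonal by construction
```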
4.4 Combined loss
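The combined loss is not spelled out in text here; a hedged reconstruction from the pieces described above (the coefficients α₁, α₂, α₃, the score names, and the exact transposes are my assumptions):

$$
\mathcal{L} \;=\; \alpha_1 \sum_{i,j} \log\frac{1}{o^{str}_i}\,\bigl(A_{ij} - (GH)_{ij}\bigr)^2
\;+\; \alpha_2 \sum_{i,k} \log\frac{1}{o^{attr}_i}\,\bigl(C_{ik} - (UV)_{ik}\bigr)^2
\;+\; \alpha_3 \sum_{i} \log\frac{1}{o^{comb}_i}\,\bigl\lVert G_{i\cdot} - (UW)_{i\cdot}\bigr\rVert_2^2
$$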
4.5 Updates

Fix all the other parameters and update one variable at a time (alternating optimization).

4.6 Updating G, H, U, V

For G:
Set the derivative to zero and solve; this is basic linear algebra. The updates for the others are similar.
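A minimal sketch of the resulting G update, assuming the structural loss A ≈ G @ H from earlier; note that the per-row weight log(1/o_i) cancels out of G's row-wise normal equations (the H, U, V updates are analogous, but there the weights remain as a weighted least-squares problem):

```python
import numpy as np

# Hedged sketch (my own code, not the authors'): closed-form update of G
# with H fixed, for the loss  sum_i log(1/o_i) * ||A_i - (G H)_i||^2.
# The weight log(1/o_i) is constant within row i, so it cancels.

def update_G(A, H):
    # Stationarity dL/dG = 0 gives  G = A H^T (H H^T)^{-1}.
    return np.linalg.solve(H @ H.T, H @ A.T).T
```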
4.7 Updating W
Updating W converts to the orthogonal Procrustes problem and has a direct closed-form solution.

4.8 Updating O

Updating the outlier scores O converts to a Lagrangian problem with an equality constraint, solvable with basic linear algebra (intuitively, each node's score comes out proportional to its reconstruction error).
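A minimal sketch, assuming per-node weights log(1/o_i), per-node errors e_i, and a budget constraint Σᵢ oᵢ = μ (the symbol μ and the exact error definition are my assumptions):

```python
import numpy as np

# Hedged sketch: minimize  sum_i log(1/o_i) * e_i  subject to  sum_i o_i = mu.
# Stationarity of the Lagrangian: -e_i/o_i + lam = 0  =>  o_i = e_i/lam;
# enforcing the equality constraint yields the normalized form below.

def update_outlier_scores(errors, mu=1.0):
    # errors: per-node reconstruction errors e_i (nonnegative array)
    return mu * errors / errors.sum()
```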
4.9 Algorithm flow: ONE
5. Experiments
Initialization: G and U are initialized via matrix factorization.

Datasets:
Method of injecting (seeding) outliers, with a sketch of the structural case after the task list below:
1. Compute the class distribution of the nodes.
2. Select a class.
3. Plant a node in that class such that (m + 10%) of its edges connect to the other classes, where m is the average degree of that class.
4. Keep the planted node's content semantically consistent with its own class: the structural outlier's content is sampled from keywords of nodes of the selected class.

Downstream tasks on the results: outlier detection, node classification, node clustering.
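A sketch of step 3, planting one structural outlier (my reading of the protocol; the function name, the m·1.10 interpretation of "m + 10%", and the shapes are all assumptions):

```python
import numpy as np

# Hedged sketch: plant one structural outlier in `target_class`.

def plant_structural_outlier(A, labels, target_class, seed=0):
    rng = np.random.default_rng(seed)
    members = np.where(labels == target_class)[0]
    m = int(A[members].sum(axis=1).mean())   # average degree of the class
    n_edges = int(m * 1.10)                  # "(m + 10%)", per the post
    others = np.where(labels != target_class)[0]
    nbrs = rng.choice(others, size=n_edges, replace=False)
    # Append one node to A and wire it only to nodes outside its class;
    # its content would then be sampled from the target class's keywords.
    N = A.shape[0]
    A2 = np.zeros((N + 1, N + 1), dtype=A.dtype)
    A2[:N, :N] = A
    A2[N, nbrs] = A2[nbrs, N] = 1
    return A2
```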
AANE is sometimes on par with ONE. SEANO is semi-supervised, yet still not as good as this paper; AANE, however, does better on the Citeseer dataset and is worth a closer look later.
6. Summary and contributions
1. The first outlier-aware network embedding method. 2. Uses the Procrustes problem to solve the alignment problem. 3. Strong experimental performance.
7. Follow-up: the authors continued with improved work at WSDM.
Two algorithms are proposed, DONE and AdONE. The contributions: 1. G and U are replaced by the hidden vectors of autoencoders (AEs). 2. The Procrustes problem is replaced with a discriminator, aligning the hidden vectors produced by the two AEs through the discriminator loss. Very clever!
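A minimal sketch of that adversarial alignment, under my own assumptions (the hidden size d, the tiny MLP discriminator, and the name discriminator_loss are all mine; this is one common way to set up such a module, not the authors' code):

```python
import torch
import torch.nn as nn

d = 64  # hidden size of both autoencoders (an assumption)

# A tiny MLP that guesses which AE a hidden vector came from.
disc = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
bce = nn.BCEWithLogitsLoss()

def discriminator_loss(h_str, h_attr):
    # h_str: (N, d) hidden vectors from the structure AE,
    # h_attr: (N, d) hidden vectors from the attribute AE.
    logits = torch.cat([disc(h_str), disc(h_attr)])
    labels = torch.cat([torch.ones(h_str.size(0), 1),
                        torch.zeros(h_attr.size(0), 1)])
    return bce(logits, labels)

# Training alternates: the discriminator minimizes this loss, while the two
# encoders are updated to fool it, pulling the two hidden spaces together
# and replacing ONE's Procrustes rotation.
```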
Experiments. Outlier detection task:
DOMINANT also performs very well; it is a semi-supervised method, worth reading later.
AdONE holds up well when outliers are present. Node classification:
Since AdONE adds the adversarial module on top of DONE, the comparison between the two verifies the effectiveness of the adversarial module!