Author: CHEONG
AI Machine Learning and Knowledge Graphs
Research interests: natural language processing and knowledge graphs
The previous chapter introduced the mutual independence assumption, the homogeneous Markov assumption, and the conditional independence assumption. The key conclusion was the conditional-independence property of probabilistic graphs: from a properly constructed graph, the conditional independence relations between nodes, i.e. between sets of random variables, can be read off directly. Formally, $X_A \perp X_B \mid X_C$, where $X_A, X_B, X_C$ are sets of nodes in the graph. This section introduces the joint probability distribution and the conditional independence properties of directed graphs.
1. Joint probability distribution of directed graphs
First, the joint probability distribution of the random variables in a directed graph can be written directly from the following factorization, where $x_{pa(i)}$ denotes the set of parent nodes of $x_i$:

$$p(x_1, x_2, \dots, x_N) = \prod_{i=1}^{N} p(x_i \mid x_{pa(i)})$$
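To make the factorization concrete, here is a minimal Python sketch. The three-node graph a → b, a → c and all CPT numbers are illustrative assumptions, not from the original lecture:

```python
from itertools import product

# Hypothetical DAG a -> b, a -> c with binary variables.
# Each node's CPT maps (tuple of parent values) -> {value: probability}.
cpts = {
    "a": {(): {0: 0.6, 1: 0.4}},                             # no parents
    "b": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},   # pa(b) = (a,)
    "c": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},   # pa(c) = (a,)
}
parents = {"a": (), "b": ("a",), "c": ("a",)}

def joint(assignment):
    """p(x_1..x_N) = product over nodes of p(x_i | x_pa(i))."""
    prob = 1.0
    for node, cpt in cpts.items():
        pa_vals = tuple(assignment[p] for p in parents[node])
        prob *= cpt[pa_vals][assignment[node]]
    return prob

# Sanity check: the factorized joint sums to 1 over all assignments.
total = sum(joint(dict(zip("abc", vals))) for vals in product((0, 1), repeat=3))
assert abs(total - 1.0) < 1e-12
```

Because each CPT row sums to 1, the product automatically normalizes, which is why the factorization defines a valid joint distribution.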
2. Conditional independence of directed graphs
Let us first use three example directed graphs to understand how conditional independence between random variables is derived from a directed graph.
Example 1: As shown in the figure below, node a is the parent and nodes b and c are its children; for convenience this structure is called the tail-to-tail pattern (the tails of both arrows meet at a).
We first state the conclusion: given node a, nodes b and c are independent of each other, i.e. $b \perp c \mid a$. The derivation below proves this.

First, from the figure above, the corresponding joint probability distribution is:

$$p(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid a)$$

And the chain rule always holds:

$$p(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid a,b)$$

Comparing the two formulas gives $p(c \mid a,b) = p(c \mid a)$, and therefore:

$$p(b,c \mid a) = \frac{p(a,b,c)}{p(a)} = p(b \mid a)\,p(c \mid a)$$

From $p(b,c \mid a) = p(b \mid a)\,p(c \mid a)$, the conditional-independence property $b \perp c \mid a$ follows directly: given a, nodes b and c are independent of each other. So the next time a tail-to-tail directed graph like the one above appears, we can conclude $b \perp c \mid a$ immediately.
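The tail-to-tail conclusion can also be checked numerically. A small sketch (binary variables with made-up CPT values, assumptions for illustration only) verifies $p(b,c \mid a) = p(b \mid a)\,p(c \mid a)$ for every assignment:

```python
# Tail-to-tail graph a -> b, a -> c: verify b is independent of c given a.
# CPT numbers are illustrative, not from the original post.
p_a = {0: 0.6, 1: 0.4}
p_b_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p(b|a)
p_c_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}   # p(c|a)

def joint(a, b, c):
    return p_a[a] * p_b_a[a][b] * p_c_a[a][c]

for a in (0, 1):
    pa = sum(joint(a, b, c) for b in (0, 1) for c in (0, 1))  # marginal p(a)
    for b in (0, 1):
        for c in (0, 1):
            lhs = joint(a, b, c) / pa                # p(b,c|a)
            rhs = p_b_a[a][b] * p_c_a[a][c]          # p(b|a) p(c|a)
            assert abs(lhs - rhs) < 1e-12
print("b and c are conditionally independent given a")
```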
Example 2: The following directed graph is the chain a → b → c, where the head of the edge into b meets the tail of the edge out of b; for convenience this is called the head-to-tail pattern:

We state the conclusion: given node b, nodes a and c are independent of each other, i.e. $a \perp c \mid b$. The proof proceeds exactly as in example 1. First, the corresponding joint probability distribution is:

$$p(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid b)$$

Combining this with the chain rule $p(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid a,b)$ gives $p(c \mid a,b) = p(c \mid b)$, and in the same way as example 1:

$$p(a,c \mid b) = p(a \mid b)\,p(c \mid b)$$

Therefore we can safely conclude that once b is observed, a and c are independent of each other, i.e. $a \perp c \mid b$.
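The head-to-tail case can be verified numerically as well. Again the chain a → b → c and its CPT values are illustrative assumptions:

```python
# Head-to-tail chain a -> b -> c: verify a is independent of c given b.
# CPT numbers are made up for illustration.
p_a = {0: 0.3, 1: 0.7}
p_b_a = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}    # p(b|a)
p_c_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p(c|b)

def joint(a, b, c):
    return p_a[a] * p_b_a[a][b] * p_c_b[b][c]

for b in (0, 1):
    pb = sum(joint(a, b, c) for a in (0, 1) for c in (0, 1))  # marginal p(b)
    for a in (0, 1):
        pa_b = sum(joint(a, b, c) for c in (0, 1)) / pb       # p(a|b)
        for c in (0, 1):
            pc_b = sum(joint(x, b, c) for x in (0, 1)) / pb   # p(c|b)
            lhs = joint(a, b, c) / pb                         # p(a,c|b)
            assert abs(lhs - pa_b * pc_b) < 1e-12
print("a and c are conditionally independent given b")
```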
Example 3: The third case is special. In the following directed graph, nodes a and b both point into node c; for convenience this is called the head-to-head pattern (the heads of both arrows meet at c):
We first state the conclusion: by default, nodes a and b are independent of each other, but once node c is observed, a and b become dependent. Intuitively: think of a and b as two parents and c as their child. Before the child c exists, a and b carry no information about each other and are independent; once the child c is observed, a and b are no longer independent. Let us now prove this by derivation.
First, the joint probability distribution corresponding to the figure above is:

$$p(a,b,c) = p(a)\,p(b)\,p(c \mid a,b)$$

Combining this with the chain rule $p(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid a,b)$, the two formulas give:

$$p(b \mid a) = p(b)$$

so by default (with c unobserved), a and b are indeed independent of each other.
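The head-to-head behavior ("explaining away") can be demonstrated with a concrete example. As an illustrative assumption, let a and b be fair coins and let c = a XOR b:

```python
# Head-to-head graph a -> c <- b: a, b fair coins, c = a XOR b (illustrative).
p_a = {0: 0.5, 1: 0.5}
p_b = {0: 0.5, 1: 0.5}

def p_c_ab(c, a, b):
    return 1.0 if c == (a ^ b) else 0.0   # deterministic CPT p(c|a,b)

def joint(a, b, c):
    return p_a[a] * p_b[b] * p_c_ab(c, a, b)

# Marginally, p(a,b) = sum_c p(a,b,c) = p(a) p(b): a and b are independent.
for a in (0, 1):
    for b in (0, 1):
        pab = sum(joint(a, b, c) for c in (0, 1))
        assert abs(pab - p_a[a] * p_b[b]) < 1e-12

# But given c = 1, a and b are perfectly anti-correlated:
pc1 = sum(joint(a, b, 1) for a in (0, 1) for b in (0, 1))   # p(c=1) = 0.5
p_ab_c1 = joint(1, 1, 1) / pc1           # p(a=1, b=1 | c=1) = 0
p_a_c1 = sum(joint(1, b, 1) for b in (0, 1)) / pc1          # p(a=1|c=1) = 0.5
p_b_c1 = sum(joint(a, 1, 1) for a in (0, 1)) / pc1          # p(b=1|c=1) = 0.5
assert p_ab_c1 != p_a_c1 * p_b_c1        # 0 != 0.25: dependent given c
```

Knowing c = 1 and a = 1 pins down b = 0 exactly, even though a alone says nothing about b.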
Extended knowledge: if node c or any of its descendant nodes is observed, nodes a and b likewise become dependent.
Next, the d-separation of directed graphs is introduced. The d-separation criterion identifies the sets of random variables that are conditionally independent of each other in a directed graph. The d-separation rules are given below.
3. D-separation of directed graphs
The two core rules of d-separation are also known as the global Markov property. In the directed graph below, for the set $X_A$ and the set $X_C$ to be independent of each other given that the set $X_B$ is observed, the following two conditions must be satisfied:
**Rule 1:** Node a belongs to set $X_A$ and node c belongs to set $X_C$, as shown in the figure below. If a node $b_1$ forms the head-to-tail pattern described above with a and c, then $b_1$ must be in the set $X_B$. Similarly, if a node $b_2$ forms the tail-to-tail pattern described above with a and c, then $b_2$ must be in the set $X_B$.
**Rule 2:** If a node $b_*$ forms the head-to-head pattern with nodes a and c, then $b_*$ must lie outside the set $X_B$, and all descendant nodes of $b_*$ must also lie outside the set $X_B$.
At the same time, by applying these two d-separation rules, we can also find the sets in a directed graph that satisfy conditional independence.
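The two rules can be turned into a small checker. The sketch below is a naive path-enumeration implementation over a hypothetical four-node DAG a → c ← b, c → d (libraries such as NetworkX provide more efficient d-separation utilities); it classifies each interior node of every undirected path as a collider or non-collider and applies the rules:

```python
# Naive d-separation check for a small hypothetical DAG (illustrative only).
edges = {("a", "c"), ("b", "c"), ("c", "d")}   # a -> c <- b, c -> d

def descendants(node):
    out, stack = set(), [node]
    while stack:
        n = stack.pop()
        for (u, v) in edges:
            if u == n and v not in out:
                out.add(v)
                stack.append(v)
    return out

def neighbors(node):
    return [v for (u, v) in edges if u == node] + \
           [u for (u, v) in edges if v == node]

def undirected_paths(src, dst, path):
    if src == dst:
        yield list(path)
        return
    for n in neighbors(src):
        if n not in path:
            yield from undirected_paths(n, dst, path + [n])

def blocked(path, observed):
    """Rule 1: a non-collider blocks when observed.
    Rule 2: a collider blocks when neither it nor a descendant is observed."""
    for i in range(1, len(path) - 1):
        prev, node, nxt = path[i - 1], path[i], path[i + 1]
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            if node not in observed and not (descendants(node) & observed):
                return True
        elif node in observed:
            return True
    return False

def d_separated(x, y, observed):
    return all(blocked(p, set(observed))
               for p in undirected_paths(x, y, [x]))

print(d_separated("a", "b", set()))    # True: collider c is unobserved
print(d_separated("a", "b", {"c"}))    # False: observing c unblocks the path
print(d_separated("a", "b", {"d"}))    # False: d is a descendant of c
```

The three printed cases mirror example 3: a and b are independent by default, but observing c, or its descendant d, makes them dependent.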
The next chapter introduces the conditional independence and factorization of undirected graphs in probabilistic graphical models.
Reference: Machine Learning [Whiteboard Derivation Series], author: Shuhuai008
Reference book: Pattern Recognition and Machine Learning, author: [Christopher Bishop](book.douban.com/search/Chri…)