1. Naive Bayes model
Naive Bayes is a classification method based on Bayes' theorem and the assumption that the features are conditionally independent given the class. At prediction time, the class $y$ with the maximum posterior probability given the input $x$ is returned as the prediction.
NB model:
- Input:
A prior probability distribution: $P(Y=c_k),\quad k=1,2,\cdots,K$, and a conditional probability distribution: $P(X=x \mid Y=c_k)=P(X^{(1)}=x^{(1)},\cdots,X^{(n)}=x^{(n)} \mid Y=c_k),\quad k=1,2,\cdots,K$, where the input data $x$ has dimension $n$.
- Output: Posterior probability of test data
According to posterior = likelihood × prior / normalizer:
$$P(Y=c_k \mid X=x)=\frac{P(X=x \mid Y=c_k)\,P(Y=c_k)}{\sum_{k} P(X=x \mid Y=c_k)\,P(Y=c_k)}$$
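As a concrete illustration of the formula above, here is a toy two-class computation; all probability values are made up for the example:

```python
# Toy two-class posterior computation via Bayes' rule:
# posterior = likelihood * prior / sum_k(likelihood_k * prior_k).
# All numbers are made-up illustrative values.
priors = {"c1": 0.6, "c2": 0.4}        # P(Y = c_k)
likelihoods = {"c1": 0.2, "c2": 0.5}   # P(X = x | Y = c_k) for one fixed x

# Normalizer (the denominator, i.e. the evidence P(X = x)).
evidence = sum(likelihoods[c] * priors[c] for c in priors)

# Posterior for each class; by construction these sum to 1.
posteriors = {c: likelihoods[c] * priors[c] / evidence for c in priors}
# posteriors == {"c1": 0.375, "c2": 0.625}
```

Even though the prior favors $c_1$, the larger likelihood of $c_2$ gives it the higher posterior.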
The NB classifier is:
$$y=f(x)=\arg\max_{c_k} \frac{P(Y=c_k)\prod_{j} P(X^{(j)}=x^{(j)} \mid Y=c_k)}{\sum_{k} P(Y=c_k)\prod_{j} P(X^{(j)}=x^{(j)} \mid Y=c_k)}$$
Here the denominator is a normalizing factor that is the same for every class, so it can be ignored when taking the argmax. Naive Bayes variants include Gaussian naive Bayes, multinomial naive Bayes, Bernoulli naive Bayes, and so on.
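The classifier above, with the shared denominator dropped, can be sketched for discrete features as follows (the function and variable names are my own, not from a library):

```python
def nb_predict(x, priors, cond_probs):
    """Return argmax over c_k of P(Y=c_k) * prod_j P(X^(j)=x^(j) | Y=c_k).

    priors: {class: P(Y=c_k)}
    cond_probs: {class: list over features j of {value: P(X^(j)=value | Y=c_k)}}
    The normalizing denominator is omitted: it is identical for every
    class and so does not change the argmax.
    """
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        score = prior
        for j, value in enumerate(x):
            # A feature value never seen for this class gets probability 0
            # (smoothing, covered later, would avoid this).
            score *= cond_probs[c][j].get(value, 0.0)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

For example, with `priors = {"spam": 0.5, "ham": 0.5}` and a single word feature whose table gives `P("offer" | spam) = 0.8` versus `P("offer" | ham) = 0.1`, `nb_predict(["offer"], ...)` returns `"spam"`.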
2. Parameter estimation of naive Bayes
Naive Bayes needs to estimate the prior probability $P(Y=c_k)$ and the conditional probability $P(X^{(j)}=x^{(j)} \mid Y=c_k)$. Only the discrete-attribute case is considered below.
2.1 Maximum Likelihood Estimation (MLE)
Maximum likelihood estimation is used to estimate the prior probability:
$$P(Y=c_k)=\frac{\sum_{i=1}^{N} I(y_i=c_k)}{N},\quad k=1,2,\cdots,K$$
and the conditional probability: