Preface
In the age of big data, everyone receives a large amount of spam every day. Spam is hard to identify with traditional rule-based judgment, but a Bayesian approach can identify it with good accuracy.
Bayes' formula
Derivation of Bayes' formula
Because
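The standard derivation starts from the definition of conditional probability. The sketch below uses generic events A and B with P(A) > 0 and P(B) > 0; these symbols are placeholders, and the spam example later in this post substitutes its own events for them.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow\quad P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align*}
```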
Bayesian inference
Posterior probability = prior probability × adjustment factor
This is the idea behind Bayesian inference. We first estimate a prior probability, then bring in the experimental result to see whether it strengthens or weakens that prior, and so obtain a posterior probability that is closer to the facts.
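As a rough illustration of the update above, here is a minimal Swift sketch. It reads the adjustment factor as P(B|A) / P(B), which is what Bayes' formula gives. The function names and all numbers are assumptions made up for this example, not taken from the project.

```swift
// Posterior = prior × adjustment factor, with factor = P(B|A) / P(B).
// Function names and numbers are hypothetical, for illustration only.
func adjustmentFactor(likelihood: Double, evidence: Double) -> Double {
    // likelihood is P(B|A); evidence is P(B) and must be greater than 0.
    return likelihood / evidence
}

func posterior(prior: Double, likelihood: Double, evidence: Double) -> Double {
    return prior * adjustmentFactor(likelihood: likelihood, evidence: evidence)
}

let prior = 0.5       // P(A): our initial estimate
let likelihood = 0.8  // P(B|A): how often B is observed when A holds
let evidence = 0.5    // P(B): how often B is observed overall

let factor = adjustmentFactor(likelihood: likelihood, evidence: evidence)
let updated = posterior(prior: prior, likelihood: likelihood, evidence: evidence)

// A factor above 1 strengthens the prior; a factor below 1 weakens it.
print("factor:", factor, "posterior:", updated)
```

With these made-up numbers the factor is greater than 1, so observing B raises the estimate above the prior.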
Spotting spam
Assume
Event A1: the email is spam
Event A2: the email is normal
Event B: the email contains the word "invoice"
Find
P(A1 | B): the probability that an email containing the word "invoice" is spam
P(A2 | B): the probability that an email containing the word "invoice" is normal
Conclusion
If P(A1 | B) - P(A2 | B) > 0, an email containing the word "invoice" is more likely to be spam than normal, so we judge it to be spam; otherwise we judge it to be normal.
By Bayes' formula, P(A1 | B) = P(B | A1) P(A1) / P(B) and P(A2 | B) = P(B | A2) P(A2) / P(B). Since P(B) > 0, the difference P(A1 | B) - P(A2 | B) has the same sign as P(B | A1) P(A1) - P(B | A2) P(A2), so the sign of P(B | A1) P(A1) - P(B | A2) P(A2) is enough to decide whether the current email is spam.
Where
P(A1): the probability that an email is spam
P(A2): the probability that an email is normal
P(B | A1): the probability that a spam email contains the word "invoice"
P(B | A2): the probability that a normal email contains the word "invoice"
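The decision rule above can be sketched in Swift as follows. This is only an illustration under stated assumptions: the function name `isSpam`, its parameter names, and the probabilities passed in are hypothetical and are not taken from the project's code.

```swift
// Classify by the sign of P(B|A1)·P(A1) - P(B|A2)·P(A2).
// The name `isSpam` and its parameters are hypothetical.
func isSpam(pSpam: Double,            // P(A1)
            pNormal: Double,          // P(A2)
            pWordGivenSpam: Double,   // P(B|A1)
            pWordGivenNormal: Double  // P(B|A2)
) -> Bool {
    let spamScore = pWordGivenSpam * pSpam
    let normalScore = pWordGivenNormal * pNormal
    // P(B) is a shared positive denominator, so it can be left out.
    return spamScore - normalScore > 0
}

// Made-up probabilities, for illustration only.
let verdict = isSpam(pSpam: 0.4, pNormal: 0.6,
                     pWordGivenSpam: 0.3, pWordGivenNormal: 0.05)
print(verdict ? "spam" : "normal")
```

Because only the sign matters, P(B) never has to be computed.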
Applying Bayes
To apply the rule, we need these four quantities:
P(A1): the probability that an email is spam
P(A2): the probability that an email is normal
P(B | A1): the probability that a spam email contains the word "invoice"
P(B | A2): the probability that a normal email contains the word "invoice"
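One way to obtain these four quantities is to count over a set of emails that have already been labeled. The Swift sketch below assumes such a labeled set exists; the `Email` and `Estimates` types, the `estimate` function, and the sample emails are all invented for illustration and are not the project's actual code or data.

```swift
import Foundation

// Hypothetical types and sample data, for illustration only.
struct Email {
    let text: String
    let isSpam: Bool
}

struct Estimates {
    let pSpam: Double             // P(A1)
    let pNormal: Double           // P(A2)
    let pWordGivenSpam: Double    // P(B|A1)
    let pWordGivenNormal: Double  // P(B|A2)
}

// Estimate the four probabilities by counting.
// Returns nil if either class is empty, since the ratios would be undefined.
func estimate(from emails: [Email], word: String) -> Estimates? {
    let spam = emails.filter { $0.isSpam }
    let normal = emails.filter { !$0.isSpam }
    guard !spam.isEmpty, !normal.isEmpty else { return nil }

    // Case-insensitive substring match; a real tokenizer would be stricter.
    let needle = word.lowercased()
    let spamHits = spam.filter { $0.text.lowercased().contains(needle) }.count
    let normalHits = normal.filter { $0.text.lowercased().contains(needle) }.count

    let total = Double(emails.count)
    return Estimates(
        pSpam: Double(spam.count) / total,
        pNormal: Double(normal.count) / total,
        pWordGivenSpam: Double(spamHits) / Double(spam.count),
        pWordGivenNormal: Double(normalHits) / Double(normal.count)
    )
}

let samples = [
    Email(text: "Cheap invoice service, contact us now", isSpam: true),
    Email(text: "Need an INVOICE? Call today", isSpam: true),
    Email(text: "Meeting moved to 3 pm", isSpam: false),
    Email(text: "Lunch tomorrow?", isSpam: false),
    Email(text: "The invoice for last month is attached", isSpam: false),
]

let est = estimate(from: samples, word: "invoice")
if let e = est {
    print(e.pSpam, e.pNormal, e.pWordGivenSpam, e.pWordGivenNormal)
}
```

With so few emails the estimates are crude; a real data set would need to be much larger, and a word that never appears in one class would need some form of smoothing.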
Why Bayes
P(A1 | B): the probability that an email containing the word "invoice" is spam (cannot be counted directly)
P(B | A1): the probability that a spam email contains the word "invoice" (can be counted)
With Bayes' formula, we can convert P(A1 | B), which cannot be counted directly, into P(B | A1), which can.
The results
Project introduction
Because I like the Swift programming language, all the code in this project is written in Swift.
Please give it a thumbs up if you like it! Thank you.
Project address