1. Definition
Markov chain: a stochastic process in which, given the state at time $t$, the state at time $t+1$ is independent of the states before time $t$: $P(X_{t+1} \mid X_t, X_{t-1}, \dots, X_1) = P(X_{t+1} \mid X_t)$. In other words, when the present is known, the future has nothing to do with the past.
2. Case
2.1 Model Preparation
Bob and Alice are good friends, and Bob’s mood is related to the weather. When the weather is sunny (marked S), Bob is usually in a happy state (marked H); when it rains (marked R), Bob is usually in a grumpy state (marked G). Alice, a careful and observant girl, recorded the weather and Bob’s mood over the past 15 days.
Weather on one day | Weather on the next day | Count | Proportion |
---|---|---|---|
sunny | sunny | 8 | 0.8 |
sunny | rain | 2 | 0.2 |
rain | sunny | 2 | 0.4 |
rain | rain | 3 | 0.6 |
The correspondence between the weather and Bob’s mood is counted in the same way:
Weather | Mood | Count | Proportion |
---|---|---|---|
sunny | happy | 8 | 0.8 |
sunny | grumpy | 2 | 0.2 |
rain | happy | 2 | 0.4 |
rain | grumpy | 3 | 0.6 |
From these statistics, the two diagrams below are drawn.
The probabilities in Figure 3-1 are called transition probabilities.
The probability values in Figure 3-2 are called emission probabilities.
2.2 Consider three questions:
Question 1: If Alice receives no information from Bob on a random day, will it be sunny or rainy?
In FIG. 2 (the correspondence between Bob’s mood and the weather), there are ten sunny days and five rainy days. With no information from Bob, the proportion of sunny days is $10/15 = 2/3$ and the proportion of rainy days is $5/15 = 1/3$. So the answer to the first question is: it is sunny with probability $2/3$ and rainy with probability $1/3$.
Question 2: If Bob is happy today, what is the probability that it will be sunny or rainy?
This is actually a Bayes problem: we are given $P(\text{Happy} \mid \text{Sunny}) = 0.8$, $P(\text{Happy} \mid \text{Rain}) = 0.4$, $P(\text{Sunny}) = 2/3$ and $P(\text{Rain}) = 1/3$, and we want $P(\text{Sunny} \mid \text{Happy})$:
$P(\text{Sunny} \mid \text{Happy}) = \dfrac{P(\text{Happy} \mid \text{Sunny})\,P(\text{Sunny})}{P(\text{Happy} \mid \text{Sunny})\,P(\text{Sunny}) + P(\text{Happy} \mid \text{Rain})\,P(\text{Rain})} = \dfrac{0.8 \times 2/3}{0.8 \times 2/3 + 0.4 \times 1/3} = 0.8$,
and correspondingly $P(\text{Rain} \mid \text{Happy}) = 0.2$.
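As a quick check, here is a minimal sketch of this Bayes computation in Python (using only the probabilities from the tables above):

```python
# Prior probabilities of the weather (from Question 1)
p_sunny, p_rain = 2 / 3, 1 / 3
# Emission probabilities of "happy" given the weather (from the mood table)
p_happy_given_sunny, p_happy_given_rain = 0.8, 0.4

# Bayes' rule: P(sunny | happy) = P(happy | sunny) P(sunny) / P(happy)
p_happy = p_happy_given_sunny * p_sunny + p_happy_given_rain * p_rain
print(p_happy_given_sunny * p_sunny / p_happy)  # ≈ 0.8
print(p_happy_given_rain * p_rain / p_happy)    # ≈ 0.2
```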
Question 3: If Bob’s mood over three consecutive days is Happy-Grumpy-Happy, what was the weather like on the last day?
The weather over the three days may be any of:
- Sunny – Sunny – Sunny
- Sunny – Sunny – Rain
- Sunny – Rain – Sunny
- Sunny – Rain – Rain
- Rain – Rain – Rain
- Rain – Rain – Sunny
- Rain – Sunny – Rain
- Rain – Sunny – Sunny
Take case 3 (Sunny – Rain – Sunny) as an example. Its probability is
$P = P(\text{Sunny})\,P(\text{Happy} \mid \text{Sunny})\,P(\text{Sunny} \to \text{Rain})\,P(\text{Grumpy} \mid \text{Rain})\,P(\text{Rain} \to \text{Sunny})\,P(\text{Happy} \mid \text{Sunny}) = \tfrac{2}{3} \times 0.8 \times 0.2 \times 0.6 \times 0.4 \times 0.8 \approx 0.02$.
Compute this probability for each of the 8 cases and pick the one with the highest probability; the weather on the last day of that sequence is the answer (a brute-force sketch follows below).
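A minimal brute-force sketch of this enumeration, using the probabilities defined above (variable names are illustrative):

```python
from itertools import product

init = {'sunny': 2 / 3, 'rain': 1 / 3}
trans = {'sunny': {'sunny': 0.8, 'rain': 0.2}, 'rain': {'sunny': 0.4, 'rain': 0.6}}
emit = {'sunny': {'happy': 0.8, 'grumpy': 0.2}, 'rain': {'happy': 0.4, 'grumpy': 0.6}}
moods = ['happy', 'grumpy', 'happy']

best_seq, best_p = None, 0.0
# Enumerate all 2**3 = 8 possible weather sequences
for seq in product(['sunny', 'rain'], repeat=len(moods)):
    p = init[seq[0]] * emit[seq[0]][moods[0]]
    for prev, cur, mood in zip(seq, seq[1:], moods[1:]):
        p *= trans[prev][cur] * emit[cur][mood]
    if p > best_p:
        best_seq, best_p = seq, p

print(best_seq, best_p)  # the most likely weather sequence and its probability
```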
2.3 Viterbi algorithm
If Bob’s mood over six days is Happy – Happy – Grumpy – Grumpy – Grumpy – Happy, what is the weather on the last day?
2.3.1 Idea
With six days there are $2^6 = 64$ possible weather sequences, and each additional day doubles the number of possibilities to consider, so we need the idea of dynamic programming.
- Step 1: The last day may be Sunny or Rain, so compute the probability of Bob being Happy on the last day under the Sunny condition and under the Rain condition, and compare the two.
- Step 2: For the last day being Sunny, compute the probability that the penultimate day was Sunny and made Bob Grumpy, and the probability that it was Rain and made Bob Grumpy, and keep the larger value. Do the same for the last day being Rain.
- Step 3: Repeat Steps 1 and 2 until the whole mood sequence has been covered.
Taking the three-day sequence Happy – Grumpy – Happy from Question 3 to illustrate the recurrence:

Day 1: Happy | Day 2: Grumpy | Day 3: Happy |
---|---|---|
Sunny: 2/3 × 0.8 ≈ 0.533 | Sunny: max(0.533 × 0.8, 0.133 × 0.4) × 0.2 ≈ 0.085 | Sunny: max(0.085 × 0.8, 0.064 × 0.4) × 0.8 ≈ 0.055 |
Rain: 1/3 × 0.4 ≈ 0.133 | Rain: max(0.533 × 0.2, 0.133 × 0.6) × 0.6 = 0.064 | Rain: max(0.085 × 0.2, 0.064 × 0.6) × 0.4 ≈ 0.015 |
2.3.2 Case Code (Viterbi Algorithm)
```python
# Transition probabilities between weather states
Weather2Weather_Prob = {
    'sunny': {'sunny': 0.8, 'rain': 0.2},
    'rain':  {'sunny': 0.4, 'rain': 0.6},
}
# Emission probabilities: probability of each mood given the weather
Mood2Weather_Prob = {
    'happy':  {'sunny': 0.8, 'rain': 0.4},
    'grumpy': {'sunny': 0.2, 'rain': 0.6},
}

weather_state = ['sunny', 'rain']
mood_state = ['happy', 'grumpy']

# Initial probabilities of the two weather states (from Question 1)
sunny_init_prob = 2 / 3
rain_init_prob = 1 / 3


def predictMood(BobMood):
    weather = {}
    weather_prob = {}
    # Initialize the probability table (one entry per day, plus one extra slot)
    for iday in range(len(BobMood) + 1):
        weather_prob[iday] = {'sunny': 0, 'rain': 0}
    weather_prob[0] = {
        'sunny': sunny_init_prob,
        'rain': rain_init_prob,
    }
    # Compute the probabilities day by day from Bob's moods
    for iday, iday_mood in enumerate(BobMood):
        for icurrent_weather_state in weather_state:
            # Multiply in the emission probability of today's mood
            weather_prob[iday][icurrent_weather_state] *= Mood2Weather_Prob[iday_mood][icurrent_weather_state]
            # Propagate the best probability to each possible next-day weather
            for ifuture_weather_state in weather_state:
                weather_prob[iday + 1][ifuture_weather_state] = max(
                    weather_prob[iday][icurrent_weather_state] * Weather2Weather_Prob[icurrent_weather_state][ifuture_weather_state],
                    weather_prob[iday + 1][ifuture_weather_state],
                )
        # The most likely weather for this day is the state with the larger probability
        weather[iday] = 'sunny' if weather_prob[iday]['sunny'] > weather_prob[iday]['rain'] else 'rain'
    return weather, weather_prob


if __name__ == '__main__':
    BobMood = ['happy', 'happy', 'grumpy', 'grumpy', 'grumpy', 'happy']
    weather, weather_prob = predictMood(BobMood)
    print('weather is:')
    print(weather)
    print('probabilities are:')
    print(weather_prob)
```
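If the reconstruction above runs as intended, the per-day most likely weather comes out as sunny, sunny, sunny, rain, rain, sunny, so the answer for the last day is sunny. Note that this sketch only keeps the best probability reaching each state and reports a per-day argmax; a full Viterbi implementation would also store back-pointers and backtrack to recover the single most likely weather sequence.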
3. Formal derivation of the hidden Markov model
3.1 Basic Concepts
- State sequence $I = (i_1, i_2, \dots, i_T)$: the sequence of hidden states (the weather in the figure above).
- Observation sequence $O = (o_1, o_2, \dots, o_T)$: the sequence of observations (Bob’s moods in the figure above).
- Moment $t$: each position in the sequence corresponds to one time step.
In the question “If Bob’s mood is Happy-Happy-Grumpy-Grumpy-Grumpy-Happy, what is the weather on the last day?”, the observation sequence is Bob’s moods and the state sequence is the weather.
- Formal definition of the hidden Markov model $\lambda = (A, B, \pi)$:
  - $Q$: the set of possible states, i.e. the weather in the example above; $V$: the set of possible observations, i.e. the moods.
  - $I$: the state sequence, the 15-day weather series; $O$: the observation sequence, Bob’s moods over those 15 days.
  - State transition matrix $A$ (can be listed according to Figure 3).
  - Observation probability matrix $B$ (can be listed according to Figure 4).
  - Initial state probability vector $\pi$ (Question 1): $\pi = (2/3, 1/3)$ for (sunny, rain).
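For concreteness, a minimal sketch of $\lambda = (A, B, \pi)$ for the weather/mood example as plain Python dictionaries (the same numbers as in the tables above):

```python
# lambda = (A, B, pi) for the weather/mood example
A = {  # state transition matrix: P(next weather | current weather)
    'sunny': {'sunny': 0.8, 'rain': 0.2},
    'rain':  {'sunny': 0.4, 'rain': 0.6},
}
B = {  # observation probability matrix: P(mood | weather)
    'sunny': {'happy': 0.8, 'grumpy': 0.2},
    'rain':  {'happy': 0.4, 'grumpy': 0.6},
}
pi = {'sunny': 2 / 3, 'rain': 1 / 3}  # initial state probabilities
```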
3.2 The premises of the hidden Markov model

- Homogeneous Markov assumption: the hidden state at time $t$ depends only on the state at time $t-1$, i.e. $P(i_t \mid i_{t-1}, o_{t-1}, \dots, i_1, o_1) = P(i_t \mid i_{t-1})$.
- Observation independence assumption: the observation at time $t$ depends only on the state at time $t$, i.e. $P(o_t \mid i_T, o_T, \dots, i_1, o_1) = P(o_t \mid i_t)$.
4. Applications of the hidden Markov model
Hidden Markov models involve three basic problems: probability calculation, learning, and prediction. The example of Alice predicting the weather from Bob’s mood is the third one, the prediction problem.
4.1 Probability calculation problem
We illustrate it with a simple example: there are three boxes, each containing red and white balls in different proportions. At each time step a box is chosen (the hidden state), a ball is drawn from it, and its colour is recorded (the observation). The observed sequence here is red, white, red, and we want $P(O \mid \lambda)$, the probability of this observation sequence under the model.
4.1.1 Forward algorithm
It is divided into the following steps:
1. Using the initial state probabilities and the observation probability matrix, compute the forward probability of the first observation for each state: $\alpha_1(i) = \pi_i\, b_i(o_1)$.

At time 1, the probability that the red ball came from each box is:

Box | Forward probability |
---|---|
1 | $\alpha_1(1) = \pi_1\, b_1(\text{red})$ (box 1 was chosen and produced the red ball) |
2 | $\alpha_1(2) = \pi_2\, b_2(\text{red})$ (box 2 was chosen and produced the red ball) |
3 | $\alpha_1(3) = \pi_3\, b_3(\text{red})$ (box 3 was chosen and produced the red ball) |
2. Using the transition probability matrix, compute the probability of reaching each state at the next time step, and then, using the observation probability matrix, the probability of generating the next observation: $\alpha_{t+1}(i) = \Big[\sum_j \alpha_t(j)\, a_{ji}\Big]\, b_i(o_{t+1})$.

At time 2, the probability of moving from the three boxes to each box, and the probability of then producing the white ball, are shown in the two tables below.

Box | Transition probability |
---|---|
1 | $\sum_j \alpha_1(j)\, a_{j1}$ (moving from any of the three boxes to box 1) |
2 | $\sum_j \alpha_1(j)\, a_{j2}$ (moving from any of the three boxes to box 2) |
3 | $\sum_j \alpha_1(j)\, a_{j3}$ (moving from any of the three boxes to box 3) |

Box | Forward probability |
---|---|
1 | $\alpha_2(1) = \big[\sum_j \alpha_1(j)\, a_{j1}\big]\, b_1(\text{white})$ (box 1 produces the white ball) |
2 | $\alpha_2(2) = \big[\sum_j \alpha_1(j)\, a_{j2}\big]\, b_2(\text{white})$ (box 2 produces the white ball) |
3 | $\alpha_2(3) = \big[\sum_j \alpha_1(j)\, a_{j3}\big]\, b_3(\text{white})$ (box 3 produces the white ball) |
3. Repeat step 2 until the end of the observation sequence.
4. Sum the final forward probabilities over all states: $P(O \mid \lambda) = \sum_i \alpha_T(i)$.
The complete process is as follows:
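The article does not list the model parameters for this example here; the sketch below assumes the classic three-box values from Statistical Learning Methods ($\pi$, $A$ and $B$ as written in the code) purely for illustration, with the observation sequence red, white, red:

```python
import numpy as np

# Assumed three-box model parameters (classic box-and-ball example; an assumption, not values from this article)
pi = np.array([0.2, 0.4, 0.4])            # initial probability of choosing each box
A = np.array([[0.5, 0.2, 0.3],            # transition probabilities between boxes
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5],                 # emission probabilities, columns = [red, white]
              [0.4, 0.6],
              [0.7, 0.3]])
obs = [0, 1, 0]                           # observation sequence: red, white, red

def forward(pi, A, B, obs):
    alpha = pi * B[:, obs[0]]             # step 1: initial forward probabilities
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # step 2: transition, then emission
    return alpha.sum()                    # step 4: sum over the final states

print(forward(pi, A, B, obs))             # P(O | lambda) ≈ 0.130218 with these assumed numbers
```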
4.1.2 Backward algorithm
Think backwards:
$\beta_t(i)$ denotes: given that the state at time $t$ is box $i$, the probability of generating the remaining observations $o_{t+1}, \dots, o_T$. For example, $\beta_1(1)$ is the probability that, given the red ball at time 1 came from box 1, the rest of the observation sequence is white, red (the second red ball).
1. The last observation is already determined, so the backward probability of all three states at the final time is initialized to 1: $\beta_T(i) = 1$.

At time $T = 3$:

Box | Backward probability |
---|---|
1 | $\beta_3(1) = 1$ (the remaining observation sequence is empty, given that box 3’s red ball came from box 1) |
2 | $\beta_3(2) = 1$ (the remaining observation sequence is empty, given box 2) |
3 | $\beta_3(3) = 1$ (the remaining observation sequence is empty, given box 3) |
Using the observation probability matrix, fold in the probability of the designated observation at time 3 (another way of reading the table above):

Box | Emission probability |
---|---|
1 | $b_1(\text{red})\,\beta_3(1)$ (a ball is drawn from box 1 and the red ball is observed) |
2 | $b_2(\text{red})\,\beta_3(2)$ (a ball is drawn from box 2 and the red ball is observed) |
3 | $b_3(\text{red})\,\beta_3(3)$ (a ball is drawn from box 3 and the red ball is observed) |
2. Then compute the backward probability at the previous time by summing over the transitions to every state: $\beta_t(i) = \sum_j a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)$.

At time 2:

Box | Backward probability |
---|---|
1 | $\beta_2(1) = \sum_j a_{1j}\, b_j(\text{red})\, \beta_3(j)$ (at time 2, box 1 may move to boxes 1, 2, 3) |
2 | $\beta_2(2) = \sum_j a_{2j}\, b_j(\text{red})\, \beta_3(j)$ (at time 2, box 2 may move to boxes 1, 2, 3) |
3 | $\beta_2(3) = \sum_j a_{3j}\, b_j(\text{red})\, \beta_3(j)$ (at time 2, box 3 may move to boxes 1, 2, 3) |
The meaning of each of these probabilities: given that the white ball at time 2 came from that box, the probability that the remaining observation sequence is [red].
3. Repeat step 2 back to the first time step.
4. In the last step, replace the transition probability with the initial state probability and sum over states: $P(O \mid \lambda) = \sum_i \pi_i\, b_i(o_1)\, \beta_1(i)$.
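A matching backward-algorithm sketch, using the same assumed three-box parameters as above (again an assumption for illustration, not values stated in this article):

```python
import numpy as np

# Same assumed three-box parameters as in the forward sketch above
pi = np.array([0.2, 0.4, 0.4])
A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5],
              [0.4, 0.6],
              [0.7, 0.3]])
obs = [0, 1, 0]  # red, white, red

def backward(pi, A, B, obs):
    beta = np.ones(len(pi))                   # step 1: beta_T(i) = 1
    for o in reversed(obs[1:]):
        beta = A @ (B[:, o] * beta)           # step 2: transition * emission * beta
    return (pi * B[:, obs[0]] * beta).sum()   # step 4: fold in the initial probabilities

print(backward(pi, A, B, obs))                # same P(O | lambda) as the forward algorithm
```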
4.1.3 Relationship between the forward and backward probabilities
Given the model parameters $\lambda$ and the observation sequence $O$, the probability of being in state $q_i$ at time $t$ is denoted $\gamma_t(i)$, and it can be expressed through the forward and backward probabilities: $\gamma_t(i) = \dfrac{\alpha_t(i)\,\beta_t(i)}{\sum_j \alpha_t(j)\,\beta_t(j)}$.
4.2 Learning Algorithm
The learning problem is divided into supervised learning and unsupervised learning.
4.2.1 Supervised learning
The basic concept
When the training set includes both the observation sequences and the corresponding state sequences, the parameters can be estimated by supervised learning, i.e. by counting frequencies and normalizing.
- Estimation of the state transition matrix: $\hat{a}_{ij} = \dfrac{A_{ij}}{\sum_{j=1}^{N} A_{ij}}$, where $A_{ij}$ is the number of transitions from state $i$ to state $j$ in the training data.
- Estimation of the observation probability matrix: $\hat{b}_j(k) = \dfrac{B_{jk}}{\sum_{k=1}^{M} B_{jk}}$, where $B_{jk}$ is the number of times state $j$ is paired with observation $k$.
- Estimation of the initial state probability: $\hat{\pi}_i$ is the relative frequency of state $i$ appearing as the initial state among the training sequences.
Concrete example
The Bob-and-Alice example formalized in Section 3.1 is exactly this: the transition and emission tables in Section 2.1 are obtained by counting weather transitions and weather-mood pairs and normalizing each row, as sketched below.
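A minimal sketch of this counting-and-normalizing estimation, using the raw counts from Alice’s tables in Section 2.1 (the helper name `normalize` is illustrative):

```python
# Raw counts from Alice's observations (Section 2.1)
transition_counts = {
    'sunny': {'sunny': 8, 'rain': 2},
    'rain':  {'sunny': 2, 'rain': 3},
}
emission_counts = {
    'sunny': {'happy': 8, 'grumpy': 2},
    'rain':  {'happy': 2, 'grumpy': 3},
}

def normalize(counts):
    """Turn each row of counts into a row of probabilities."""
    return {
        state: {k: v / sum(row.values()) for k, v in row.items()}
        for state, row in counts.items()
    }

A_hat = normalize(transition_counts)  # estimated state transition matrix
B_hat = normalize(emission_counts)    # estimated observation probability matrix
print(A_hat)  # {'sunny': {'sunny': 0.8, 'rain': 0.2}, 'rain': {'sunny': 0.4, 'rain': 0.6}}
print(B_hat)
```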
4.2.2 Unsupervised learning
When the training set contains only observation sequences, the goal is still to estimate the HMM parameters (state transition matrix, observation probability matrix and initial probabilities). In this case the observation sequences are treated as observed data and the state sequences as latent variables, and the parameters can be estimated with the EM algorithm (Baum-Welch).
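For illustration only, a minimal Baum-Welch (EM) sketch for a discrete-observation HMM. It handles a single short sequence and omits numerical scaling, so it is a teaching sketch rather than a production implementation; the toy mood sequence at the end is made up for the demo:

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    """Estimate (A, B, pi) from a single observation sequence with EM."""
    rng = np.random.default_rng(seed)
    A = rng.random((n_states, n_states))
    A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols))
    B /= B.sum(axis=1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)
    obs = np.asarray(obs)
    T = len(obs)
    for _ in range(n_iter):
        # E-step: forward (alpha) and backward (beta) probabilities
        alpha = np.zeros((T, n_states))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta = np.ones((T, n_states))
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((T - 1, n_states, n_states))
        for t in range(T - 1):
            x = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
            xi[t] = x / x.sum()
        # M-step: re-estimate the parameters from the expected counts
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return A, B, pi

# Example: learn a 2-state model from a toy mood sequence (0 = happy, 1 = grumpy)
A, B, pi = baum_welch([0, 0, 1, 1, 1, 0], n_states=2, n_symbols=2)
```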
4.3 Prediction Algorithm
4.3.1 Viterbi algorithm
See Section 2.3 for a worked example of the Viterbi algorithm.
5. References
- Statistical Learning Methods