First, the main training loop
The main training loop follows the standard deep RL (DQN) algorithm structure; this project's main loop is as follows:
def train(self, max_episodes):
    print('start training, max_episodes =', max_episodes)
    step_counter = 0
    for episode in range(max_episodes):
        # The first action (one-hot [1, 0], "do nothing") just starts the episode
        action0 = np.array([1, 0])
        x_RGB, reward, done = self.env.frame_step(action0)
        state = self.get_init_state(x_RGB)  # preprocess the initial state
        while not done:
            action = self.agent.choose_action(state)
            x_RGB, reward, done = self.env.frame_step(action)
            state_ = self.get_next_state(state, x_RGB)  # preprocess the next state
            # Store this step's experience (s, a, r, s_, done) in replay memory
            self.agent.store_in_memory(state, action, reward, state_, done)
            if step_counter > OBSERVE:  # start learning only after enough experience is collected
                self.agent.learn()
            state = state_
            step_counter += 1
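The get_init_state and get_next_state helpers are not shown in the excerpt above. In the usual DQN-on-Flappy-Bird setup they convert each RGB frame to grayscale and stack the most recent four frames into one state tensor; here is a minimal sketch of that idea, assuming an 80x80 input and OpenCV preprocessing (the sizes and helper names are assumptions, not this project's exact code):

import cv2
import numpy as np

def preprocess(x_RGB):
    # Assumed preprocessing: resize to 80x80, grayscale, then binarize
    gray = cv2.cvtColor(cv2.resize(x_RGB, (80, 80)), cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 1, 255, cv2.THRESH_BINARY)
    return binary

def get_init_state(x_RGB):
    # The initial state repeats the first frame four times: shape (80, 80, 4)
    frame = preprocess(x_RGB)
    return np.stack([frame] * 4, axis=2)

def get_next_state(state, x_RGB):
    # Slide the frame window: drop the oldest frame, append the newest
    frame = preprocess(x_RGB).reshape(80, 80, 1)
    return np.append(state[:, :, 1:], frame, axis=2)

Stacking several consecutive frames is what lets a feed-forward Q-network infer motion (the bird's velocity) from otherwise static images.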
Second, the startup code
The trainer's startup code is as follows:
if __name__ == '__main__':
    env = Flappy_Bird_Env()  # initialize the environment
    agent = DQN(n_actions=2,
                output_graph=False,
                save_model=True,
                read_saved_model=True,
                e_greedy_increment=0.0001)  # let epsilon change over time; set to None to keep it fixed
    trainer = FlappyBird_Trainer(env=env, agent=agent)
    trainer.train(max_episodes=1)
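The e_greedy_increment=0.0001 argument controls how quickly the agent shifts from exploration to exploitation. Below is a minimal sketch of how such an increment is typically applied inside choose_action; the class layout, e_greedy_max, and the _best_action stub are illustrative assumptions, not this project's exact implementation:

import numpy as np

class DQN:
    def __init__(self, n_actions, e_greedy_max=0.9, e_greedy_increment=None):
        self.n_actions = n_actions
        self.e_greedy_max = e_greedy_max
        self.e_greedy_increment = e_greedy_increment
        # Start fully random if epsilon will grow; otherwise stay at the fixed maximum
        self.epsilon = 0.0 if e_greedy_increment is not None else e_greedy_max

    def _best_action(self, state):
        # Placeholder for the Q-network forward pass; a real agent
        # would return argmax_a Q(state, a) here
        return 0

    def choose_action(self, state):
        if np.random.uniform() < self.epsilon:
            action_index = self._best_action(state)  # exploit: greedy action
        else:
            action_index = np.random.randint(self.n_actions)  # explore: random action
        # Gradually raise epsilon toward its maximum, reducing exploration over time
        if self.e_greedy_increment is not None:
            self.epsilon = min(self.epsilon + self.e_greedy_increment,
                               self.e_greedy_max)
        action = np.zeros(self.n_actions)
        action[action_index] = 1  # one-hot action, matching frame_step's input format
        return action

With an increment of 0.0001 per step, epsilon climbs from 0 to 0.9 over roughly 9,000 steps, so early training is dominated by random exploration while later training mostly follows the learned policy.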