In this section, we introduce the main algorithms and networks we used, to help you better understand how we trained the model. This project follows an approach similar to DeepLearningFlappyBird, a Python reinforcement learning (RL) implementation of Flappy Bird. The core RL method is Q-learning, with a Convolutional Neural Network (CNN) used to estimate the Q-values. At each game step, we store the current state of the bird, the action the agent took, and the next state of the bird. These stored transitions are the training data for the CNN. The input to the network is four consecutive frames of the game: we stack these four images to form an “observation” of the bird. An observation here therefore means time-series data represented by a sequence of images.
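To make the data flow concrete, here is a minimal sketch of the frame stacking and transition storage described above. It is not the project's actual code: the class names `FrameStack` and `ReplayMemory`, the capacity value, the repetition of the first frame to fill the initial stack, and the random minibatch sampling are all assumptions made for illustration. Frames are shown as placeholder strings; in practice they would be image arrays.

```python
import random
from collections import deque

STACK_SIZE = 4  # four frames per observation, as described in the text


class FrameStack:
    """Keeps the most recent STACK_SIZE frames as one observation (hypothetical helper)."""

    def __init__(self, first_frame):
        # Assumption: the first frame is repeated to fill the stack,
        # because no earlier frames exist at the start of an episode.
        self.frames = deque([first_frame] * STACK_SIZE, maxlen=STACK_SIZE)

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Oldest frame first, newest frame last.
        return tuple(self.frames)


class ReplayMemory:
    """Stores (state, action, next_state) transitions (hypothetical helper)."""

    def __init__(self, capacity=50000):  # capacity is an illustrative value
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, next_state):
        self.buffer.append((state, action, next_state))

    def sample(self, batch_size):
        # Assumption: training draws random minibatches from the stored transitions.
        items = list(self.buffer)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self.buffer)


# One game step: observe, act, observe again, then store the transition.
stack = FrameStack("frame0")
state = stack.observation()
action = 1  # placeholder action index chosen by the agent
next_state = stack.push("frame1")

memory = ReplayMemory()
memory.store(state, action, next_state)
```

Using a bounded `deque` for both structures keeps the sketch short: the frame stack always holds exactly four frames, and the memory discards its oldest transitions once it is full.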