Reinforcement Learning Digest Part 3: SARSA & Q-learning
In the last article I explained the generalized policy iteration process and described our first reinforcement learning algorithm: Monte Carlo. In this article we will discuss the drawbacks of Monte Carlo and explore two other algorithms that help the agent overcome them.
The Monte Carlo algorithm learns from complete episodes. This has the following drawbacks:
- Monte Carlo cannot be used for continuing tasks, which have no terminal state and therefore never produce a complete episode.
- Monte Carlo can be very slow for environments…
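
To make the "complete episodes" requirement concrete, here is a minimal sketch of how the return for each step might be computed. The function name `discounted_returns`, the reward list, and the discount factor `gamma` are illustrative assumptions, not taken from the previous article. The point is that the return of the very first step depends on the last reward, so nothing can be updated until the episode has finished.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute the return G_t for every step t of one finished episode.

    `rewards[t]` is assumed to be the reward received after step t.
    """
    returns = [0.0] * len(rewards)
    g = 0.0
    # Walk backwards from the terminal step: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


# Hypothetical three-step episode where only the final step is rewarded.
episode_rewards = [0.0, 0.0, 1.0]
print(discounted_returns(episode_rewards, gamma=0.5))  # [0.25, 0.5, 1.0]
```

Because the loop starts at the terminal step, a task that never terminates gives it nowhere to start, and a very long episode delays every update until its end.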