Reinforcement Learning Digest Part 3: SARSA & Q-learning

Ahmed El-Khouly
Nov 22, 2020

In the last article I explained the generalized policy iteration process and described our first reinforcement learning algorithm: Monte Carlo. In this article we will discuss the drawbacks of Monte Carlo and explore two other algorithms that help the agent overcome them.

The Monte Carlo algorithm learns only from complete episodes: value estimates are updated after an episode ends and its return is known. This has the following drawbacks:

  • Monte Carlo cannot be used for continuing tasks, which never terminate and so never produce a complete episode.
  • Monte Carlo can be very slow for environments…