Reinforcement Learning Digest Part 2: Bellman Equations, Generalized Policy Iteration and Monte Carlo

In the last article, I have introduced Reinforcement learning Markov Decision Process (MDP) framework, discounted expected rewards and value and policy functions definitions. In this article, we will continue the definition of the MDP framework explaining Bellman and Bellman optimality equations. Additionally we will have describe our first reinforcement learning algrithm: Monte Carlo. So let us start…

Bellman equations



Technical lead of IBM Cognos recommenders system

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store