Reinforcement Learning Digest Part 1: Introduction & Finite Markov Decision Process framework

Ahmed El-Khouly
6 min read · Nov 21, 2020


Reinforcement learning is an important type of machine learning used in a vast range of applications and fields, including robotics, genetics, financial applications and recommendation systems, to mention a few. In this series of articles, I aim to take the reader on a journey through this topic. The goal is to build knowledge of reinforcement learning, starting from basic principles and gradually moving to more advanced aspects. The articles will balance theory with practical demos, which help to practice the theory learnt and cement understanding. So let us start the journey…


Reinforcement learning can be defined as follows:

“Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize some notion of cumulative reward.”

- Wikipedia

From this definition, we see that a software agent interacts with an environment by taking actions, each of which results in an immediate reward. The goal is for the agent to learn how to maximize the cumulative reward obtained from a sequence of such actions. One should note that choosing the action with the highest immediate reward at every step will not necessarily result in the optimal overall reward. Therefore, the goal of reinforcement learning is to learn how to maximize the overall reward rather than the immediate one.
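To make the difference between immediate and cumulative reward concrete, here is a minimal sketch in Python. The environment, its state and action names, and the reward values are all made up for illustration, and the "far-sighted" policy simply searches every possible future by brute force, which is not a reinforcement learning algorithm, only a way to show what the best achievable total looks like.

```python
# Hypothetical two-step environment (an assumption for illustration only):
# (state, action) -> (immediate reward, next state).
# A next state of None marks the end of the episode.
TRANSITIONS = {
    ("start", "left"): (1.0, "dead_end"),
    ("start", "right"): (0.0, "treasure"),
    ("dead_end", "stay"): (0.0, None),
    ("treasure", "stay"): (10.0, None),
}


def actions(state):
    """Return the actions available in a given state."""
    return [a for (s, a) in TRANSITIONS if s == state]


def best_return(state):
    """Highest cumulative reward reachable from `state`, by exhaustive search."""
    if state is None:
        return 0.0
    return max(
        TRANSITIONS[(state, a)][0] + best_return(TRANSITIONS[(state, a)][1])
        for a in actions(state)
    )


def greedy_policy(state):
    """Pick the action with the highest immediate reward."""
    return max(actions(state), key=lambda a: TRANSITIONS[(state, a)][0])


def farsighted_policy(state):
    """Pick the action with the highest immediate-plus-future reward."""
    def total(a):
        reward, next_state = TRANSITIONS[(state, a)]
        return reward + best_return(next_state)
    return max(actions(state), key=total)


def run_episode(policy):
    """Agent-environment loop: act, collect the reward, move to the next state."""
    state, cumulative_reward = "start", 0.0
    while state is not None:
        reward, state = TRANSITIONS[(state, policy(state))]
        cumulative_reward += reward
    return cumulative_reward


greedy_total = run_episode(greedy_policy)
farsighted_total = run_episode(farsighted_policy)
print(greedy_total, farsighted_total)  # prints: 1.0 10.0
```

The greedy agent grabs the reward of 1 on the first step and ends up in a state with nothing left to collect, while the far-sighted agent accepts a reward of 0 first and collects 10 afterwards. In a real reinforcement learning problem the agent does not know the transition table in advance and has to learn this trade-off from experience.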

How is reinforcement learning different from other types of machine learning?

Supervised machine learning algorithms receive labeled samples. The label can be a class for classification tasks or a numeric value for regression tasks. The aim is to learn to provide labels for examples the algorithm has not seen before. The input samples are independent of each other, and during training they are sampled with equal probability. For unsupervised learning, the input is unlabeled samples and the aim is to identify clusters or associations within the sample population.

Reinforcement learning is different from both types of ML in the following ways:

· Input: the input to a reinforcement learning algorithm is a representation of the state of the environment.

Ahmed El-Khouly

Technical lead of IBM Cognos recommenders system