“Lessons in life are repeated until learned.” – Unknown
The quote above captures a fundamental capability that comes naturally to humans: learning through experience, through trial and error. That is how babies learn, and though we may not always recognize its importance, it is a foundational skill.
Those who try to teach this skill to machines (algorithms) can attest to its importance. What comes naturally to us has to be trained extensively into machines.
Reinforcement learning gets its name because the algorithm’s behavior is reinforced through a reward system. When a specific action leads to a positive outcome (a reward), the algorithm becomes more likely to repeat that action, much like Pavlov’s dog.
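To make the reward idea concrete, here is a minimal sketch of reward-driven learning, assuming a made-up setup: three hypothetical dock doors, each with an invented probability that a trailer is unloaded on time. The function name `train`, the probabilities, and the parameters are illustrative assumptions, not a production method. The algorithm tries doors, scores 1 for an on-time unload and 0 otherwise, and gradually favors the door that pays off most often.

```python
import random

# Hypothetical on-time unload probability per dock door.
# These numbers are invented purely for illustration.
ON_TIME_PROB = [0.2, 0.5, 0.8]


def train(steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning: mostly repeat what has paid off, sometimes explore."""
    rng = random.Random(seed)
    values = [0.0] * len(ON_TIME_PROB)  # running average reward per door
    counts = [0] * len(ON_TIME_PROB)    # how often each door was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a door at random.
            action = rng.randrange(len(values))
        else:
            # Exploit: pick the door with the best average reward so far.
            action = max(range(len(values)), key=lambda a: values[a])

        # Reward is 1 for an on-time unload, 0 otherwise.
        reward = 1.0 if rng.random() < ON_TIME_PROB[action] else 0.0

        # Reinforce: move the door's average toward the observed reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    values, counts = train()
    print("Estimated reward per door:", [round(v, 2) for v in values])
    print("Times each door was chosen:", counts)
```

The design choice worth noticing is the small exploration rate: without occasionally trying a door that looks worse, the algorithm could settle on the first door that ever produced a reward.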
Examples of potential applications range from financial trading to tactical supply chain decisions, such as assigning inbound trailers to dock doors. The criteria listed below cover a vast majority of day-to-day business processes.
Even though these algorithms can mimic one of the foundational human behaviors, their use outside academia, gaming, and robotics has been limited, particularly in industry. In my opinion, the possibilities there are vast.
To begin with, let us develop a generic set of criteria for activities within business processes where reinforcement learning can be leveraged. The criteria are:
- Decisions need to be made in sequence
- The impact of each decision needs to be evaluated and may affect the next decision
- There are one or more critical objectives that need to be kept in mind when making the decisions
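The three criteria map onto the standard ingredients of a reinforcement learning problem. The sketch below is a toy illustration under invented assumptions, not a business-ready model: a five-position corridor stands in for a process, each step is one decision in a sequence, the new position is the impact that shapes the next decision, and the reward at the last position is the objective. All names (`step`, `q_learning`, `greedy_policy`) and parameter values are hypothetical, and the learning rule used is tabular Q-learning.

```python
import random

N_STATES = 5        # positions 0..4; position 4 is the goal (the objective)
ACTIONS = [-1, +1]  # index 0 = step left, index 1 = step right


def step(state, action):
    """Apply one decision and report its impact: next state, reward, done flag."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn the value of each action in each state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]

    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])

            next_state, reward, done = step(state, ACTIONS[a])

            # The value of a decision includes what it makes possible next.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state

    return q


def greedy_policy(q):
    """Best learned action for every non-goal position."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The `future` term is what separates this from a one-off prediction: a decision is scored not only by its immediate result but also by the position it leaves you in for the next decision.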
Why has adoption of reinforcement learning been slow in industry, even though it is already a mature technology in many areas? There are a few reasons:
- It is an algorithm built around continuous, repeated actions. Unlike the widely used supervised and unsupervised algorithms, which help with one-time insights or strategic decisions (will this customer churn? what drives house prices the most?), reinforcement learning focuses on repeated tasks, in all probability the daily tasks performed by humans. Trusting an algorithm to make critical repeated decisions lies outside the comfort zone of many executives.
- Unlike commonly used ML algorithms, these algorithms may take much longer to train, and longer still to reach a state where they can be deployed in production. It is also harder to understand how they learn and decide than it is with, say, a linear regression model.
However, if you are an executive open to tackling challenges beyond the vanilla, reinforcement learning can help you become a digital transformation rockstar. Some pointers:
- Identify candidate processes that fit the criteria above. If you are a warehousing executive, opportunities abound. If you are a life sciences research leader, reinforcement learning may help expedite molecule development for new drugs.
- Be patient and ditch the conventional data science pilot approach. Reinforcement learning algorithms need longer training and more business validation than other data science initiatives, such as deploying a churn detection algorithm.

