Create Multi-Armed Bandit Algorithms In Python

This course is designed for those who want to learn how to create multi-armed bandit algorithms, model real business problems, and implement various algorithmic strategies.

Edward

Course Skill Level

Advanced

Time Estimate

3h 45m

About This Course

Who this course is for:

Anyone with basic Python skills who wants to get started with Reinforcement Learning
Experienced AI Engineers, ML Engineers, Data Scientists, and Software Engineers who want to apply Reinforcement Learning to real business problems
Business professionals who want to learn how Reinforcement Learning can help automate adaptive decision-making processes

What you’ll learn:

Understand and be able to identify Multi-Armed Bandit (MAB) problems
Model real business problems as MAB and implement digital AI agents to automate them
Understand the challenge of Reinforcement Learning regarding the exploration-exploitation dilemma
Practical implementation of the various algorithmic strategies for balancing between exploration and exploitation
Python implementation of the Epsilon-greedy strategy
Python implementation of the Softmax Exploration strategy
Python implementation of the Optimistic Initialization strategy
Python implementation of the Upper Confidence Bounds (UCB) strategy
Understand the challenges of Reinforcement Learning in terms of the design of reward functions and sample efficiency
Estimation of action values through incremental sampling
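
As a rough illustration of two items from the list above (estimating action values through incremental sampling, and the Epsilon-greedy strategy), here is a minimal sketch. It is not the course's own code: the function names, the Gaussian reward model, and the default parameters are assumptions chosen for the example, and it uses only the standard library.

```python
import random


def update_estimate(old_estimate, reward, n):
    """Incremental sample average: Q_n = Q_{n-1} + (reward - Q_{n-1}) / n."""
    return old_estimate + (reward - old_estimate) / n


def run_epsilon_greedy(true_means, epsilon=0.1, steps=2000, seed=0):
    """Simulate an epsilon-greedy agent on a bandit with Gaussian rewards.

    true_means is hidden from the agent; it only sees sampled rewards.
    (Hypothetical helper, assumed unit-variance Gaussian noise.)
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    k = len(true_means)
    estimates = [0.0] * k              # current value estimate per arm
    counts = [0] * k                   # number of pulls per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)     # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(k), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] = update_estimate(estimates[arm], reward, counts[arm])
    return estimates, counts


estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.9])
```

The incremental update avoids storing every past reward: only the current estimate and the pull count per arm are kept.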

Requirements:

Be able to understand basic OOP programs in Python
Have basic NumPy and Matplotlib knowledge
Basic algebra skills

Software version used in the course:

Python 3.9.5.

With concise explanations, this course teaches you how to translate seemingly intimidating mathematical formulas into Python code. Not everyone is comfortable with mathematics, so the course avoids it unless it is necessary. Where maths is needed, it is presented so that anyone with basic algebra skills can follow it, translate it into code, and build useful intuitions in the process.

The algorithmic strategies taught in this course include Epsilon Greedy, Softmax Exploration, Optimistic Initialization, Upper Confidence Bounds, and Thompson Sampling. With these tools, you will be equipped to build and deploy AI agents that handle critical business operations under uncertainty.
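
To give a sense of how such formulas map to code, the sketch below shows one common way to write action selection for two of the strategies named above, Softmax Exploration and Upper Confidence Bounds. The function names, the temperature parameter, and the exploration constant `c` are assumptions made for this example, not the course's own implementation.

```python
import math
import random


def softmax_probabilities(estimates, temperature=1.0):
    """Turn value estimates into selection probabilities.

    Subtracting the maximum before exponentiating keeps exp() from
    overflowing without changing the resulting probabilities.
    """
    highest = max(estimates)
    exps = [math.exp((q - highest) / temperature) for q in estimates]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_select(estimates, temperature, rng):
    """Sample an arm index according to the softmax probabilities."""
    probs = softmax_probabilities(estimates, temperature)
    return rng.choices(range(len(estimates)), weights=probs, k=1)[0]


def ucb_select(estimates, counts, t, c=2.0):
    """Pick the arm with the highest upper confidence bound.

    t is the total number of pulls so far. Arms that have never been
    pulled are tried first, since their bound is undefined.
    """
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        q + c * math.sqrt(math.log(t) / n)
        for q, n in zip(estimates, counts)
    ]
    return max(range(len(scores)), key=lambda a: scores[a])


rng = random.Random(0)
chosen = softmax_select([0.2, 0.5, 0.9], temperature=0.5, rng=rng)
```

A lower temperature makes softmax selection greedier, while a larger `c` makes the UCB agent favour rarely pulled arms for longer.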

Our Promise to You

By the end of this course, you will have learned to create multi-armed bandit algorithms.

10-Day Money-Back Guarantee. If you are unsatisfied for any reason, simply contact us and we’ll give you a full refund. No questions asked.

Get started today and learn more about Python programming.

Course Curriculum

Section 1 - Introduction And Course Lessons
	Introduction To Reinforcement Learning And Multi-Armed Bandit Problems
	Implementing Simulated MAB Environments In Python
	Estimating Action Values Through Sampling
	Implementing Incremental Average In Code
	Implementing Incremental Average For Non-Stationary Bandits
	Building A Baseline Agent That Behaves Randomly
	Why Are The Results Not Repeatable?
	Implementing And Analysing A Greedy Agent
	Balancing Exploration And Exploitation With Epsilon Greedy Agents
	Controlling Exploration With A Decay
	Exploring Intelligently With Softmax Exploration
	Being Optimistic Under Uncertainties
	Realistic Optimism Under Uncertainties
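
Two of the lesson topics above, incremental averages for non-stationary bandits and controlling exploration with a decay, can be sketched briefly. This is an illustrative example under stated assumptions rather than the course material: the constant step size `alpha`, the exponential decay schedule, and all function names are hypothetical choices.

```python
def update_sample_average(old_estimate, reward, n):
    """Plain incremental average: every past reward is weighted equally."""
    return old_estimate + (reward - old_estimate) / n


def update_constant_step(old_estimate, reward, alpha=0.1):
    """Constant step-size update: recent rewards are weighted more heavily,
    which lets the estimate follow a reward distribution that drifts."""
    return old_estimate + alpha * (reward - old_estimate)


def decayed_epsilon(initial_epsilon, decay_rate, step):
    """Exponential decay of the exploration rate (one assumed schedule)."""
    return initial_epsilon * (decay_rate ** step)


# A toy non-stationary arm: it pays 1.0 for 50 pulls, then 5.0 for 50 pulls.
rewards = [1.0] * 50 + [5.0] * 50

average_estimate = 0.0
tracking_estimate = 0.0
for n, reward in enumerate(rewards, start=1):
    average_estimate = update_sample_average(average_estimate, reward, n)
    tracking_estimate = update_constant_step(tracking_estimate, reward)
```

After the loop, the sample average sits midway between the old and new payouts, while the constant step-size estimate ends close to the arm's current payout of 5.0.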