Implementing Deepmind's MuZero Algorithm with Python
Deepmind has achieved a huge milestone by publishing its latest Reinforcement Learning paper in Nature on 23 December 2020. The paper describes how they were able to train a Reinforcement Learning algorithm that masters Go, Chess, Shogi and Atari without needing to be told the rules.
Setting up Stable Baselines with Python 3.7 and WSL 2
I've been playing around a bit with Stable Baselines lately for an upcoming project. While doing so, I ran into some issues, since Stable Baselines currently requires Tensorflow <= 1.14, which is only supported on Python < 3.8. Due to those difficulties, I thought it would be interesting to share how I set up my personal development environment from scratch to play around with Stable Baselines.
Autonomously Landing a Lunar Lander with an Xbox Controller Robotic Arm - Part 2
In Part 1 I explained how you can create a Robot Arm that can be mounted on an Xbox controller. In the OpenAI Lunar Lander post I then looked at how we can train an agent to land the lunar lander all by itself with a continuous action space.
Training the Continuous Lunar Lander with Reinforcement Learning, RLLib and PPO
For an upcoming blog post, I would like to have a robotic arm land a Lunar Lander autonomously. In Part 1 I explained how we can build such a robotic arm, but now we need to go deeper into how we can train an agent in a simulation environment (before deploying it on a physical device).
Autonomously Landing a Lunar Lander with an Xbox Controller Robotic Arm - Part 1
Creating an Xbox Robot Arm is something that I've been wanting to do ever since I saw the post by Kevin Drouglazet, who was able to operate an Xbox Controller with an arm he created and was kind enough to publish the design files (I am definitely not a hardware designer 😅) - thanks for that Kevin!
Reinforcement Learning with the Bonsai Platform
The Bonsai Machine Teaching platform has been released! It promises an easy-to-use environment for end-to-end Reinforcement Learning projects, from simulator selection and integration to algorithm configuration and training.
Roadwork RL - A Multi-Language Reinforcement Learning Environment
After explaining the end-to-end concept of creating a Digital Twin Reinforcement Learning environment I wanted to go into a deeper explanation of how the first part of this can be done.
A Multi-Language Reinforcement Learning Digital Twin Environment
One of the ideas I have been playing around with over the last couple of months is the combination of Digital Twins and Reinforcement Learning. This is an experimental idea that will be refined over the coming months, and I would love to hear your opinions about it (feel free to comment below, send me an email or reach out to me on Social Media such as Twitter / LinkedIn).
Facebook ReAgent - An End-to-End Use Case
Facebook decided to release their end-to-end applied reinforcement learning platform called ReAgent. After reading their vision on this, I have to say that I am completely hooked! They are providing an excellent view of Reinforcement Learning and the future adoption of it. But why is this and how can we get started with it?
Facebook's Open-Source Reinforcement Learning Platform - A Deep Dive
Facebook decided to open-source the platform that they created to solve end-to-end Reinforcement Learning problems at the scale they are working on. So of course I just had to try this ;) Let's go through together how to install it and what you should do to get it working yourself.
Writing a C# SDK for the OpenAI Gym using .NET Core
When we take a look at the OpenAI Gym on Github (https://github.com/openai/gym-http-api), we see that it does not have bindings available for C#. Since I am a firm believer in .NET Core and what it brings to the developer ecosystem, I decided to write one myself (https://github.com/Xaviergeerinck/dotnetcore-sdk-openai). Using what I learned in my previous blog post How to write a SDK in dotnet Core, I created one that looks like this for the main method:
OpenAI Gym Problems - Solving the CartPole Gym
Dividing numbers into equal buckets or bins through Bucketization
A common practice in Reinforcement Learning is to go from a continuous space to a discrete space. What does this mean? Take for example the range of numbers $]-3, 3[$. If we want to represent every number that fits in this range as a state, then we would have $\infty$ numbers being represented, or $\infty$ states.
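As a minimal sketch of the idea (not the code from the post itself), the continuous range can be cut into a fixed number of equal-width buckets so that every number maps to one of a handful of states. The function name `bucketize` and its parameters are my own assumptions for illustration, and values outside the range are simply clamped into the first or last bucket.

```python
def bucketize(value, low, high, n_buckets):
    """Map a continuous value to a bucket index in 0..n_buckets-1.

    Hypothetical helper: the name and signature are assumptions
    made for this illustration only.
    """
    # Clamp so out-of-range values land in the first or last bucket.
    clamped = min(max(value, low), high)
    # Every bucket covers an equally wide slice of the range.
    width = (high - low) / n_buckets
    index = int((clamped - low) / width)
    # The upper edge itself would give n_buckets, so cap it.
    return min(index, n_buckets - 1)


# Six buckets over the range from -3 to 3: each bucket is 1.0 wide.
states = [bucketize(x, -3, 3, 6) for x in (-2.5, 0.0, 2.9)]
```

With six buckets, the infinitely many numbers in the range collapse into just six discrete states that a tabular algorithm can work with.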
The Markov Property, Chain, Reward Process and Decision Process
As seen in the previous article, we now know the general concept of Reinforcement Learning. But how do we actually get towards solving our third challenge: "Temporal Credit Assignment"?
Installing OpenAI Gym in a Windows Environment
Reinforcement learning not only requires a lot of knowledge about the subject to get started, it also requires a lot of tools to help you test your ideas. Since this process is quite lengthy and hard, OpenAI helped us with this. By creating something called the OpenAI Gym, they allow you to get started developing and comparing reinforcement learning algorithms in an easy-to-use way.
Multi-armed bandit framework
To start solving the problem of exploration, we are going to introduce the multi-armed bandit framework. But what exactly does this solve? Imagine that you are running a clinical trial with 4 pills. You know that each pill has a survival rate, but you don't know what that survival rate is. Your goal: find the pill with the highest chance of survival in X trials.
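To make the clinical trial a bit more concrete, here is a small sketch using epsilon-greedy, which is just one simple strategy that I picked for illustration and not necessarily the one the post uses. The function name, the `epsilon` value and the survival rates are all made-up assumptions.

```python
import random


def run_epsilon_greedy(true_rates, n_trials, epsilon=0.1, seed=0):
    """Simulate a pill trial with an epsilon-greedy strategy.

    Hypothetical helper: true_rates are the hidden survival rates,
    which the strategy never reads directly except to simulate outcomes.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms      # how often each pill was given
    successes = [0] * n_arms  # how often the patient survived

    for _ in range(n_trials):
        untried = [a for a in range(n_arms) if pulls[a] == 0]
        if untried:
            # Give every pill at least once before comparing them.
            arm = untried[0]
        elif rng.random() < epsilon:
            # Explore: pick a random pill.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the pill with the best observed survival rate.
            arm = max(range(n_arms), key=lambda a: successes[a] / pulls[a])

        survived = rng.random() < true_rates[arm]
        pulls[arm] += 1
        successes[arm] += int(survived)

    estimates = [successes[a] / pulls[a] if pulls[a] else 0.0
                 for a in range(n_arms)]
    return pulls, estimates


# Assumed survival rates for the 4 pills, X = 1000 trials.
pulls, estimates = run_epsilon_greedy([0.2, 0.5, 0.7, 0.4], n_trials=1000)
best_pill = max(range(len(estimates)), key=lambda a: estimates[a])
```

The `epsilon` parameter is the trade-off knob: a higher value spends more of the X trials exploring all pills, a lower value commits earlier to the pill that currently looks best.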
An introduction to Reinforcement Learning (RL)
As we learned in the intro to Machine Learning, Reinforcement Learning is a technique where an agent takes specific actions on an environment to try to reach an optimal state. But how can we illustrate this? Take a look at the following picture.