MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework, addressing real-world decision problems. Our vision is to cover the complete development life cycle of RL applications, ranging from simulation engineering to agent development, training, and deployment.
This is a preliminary, non-stable release of Maze. It is not yet complete, and not all of our interfaces have settled. Hence, there might be some breaking changes on our way towards the first stable release.
Below we list a few selected Maze features.
You can try Maze without prior installation! We provide a series of Getting Started notebooks to help you get familiar with Maze. These notebooks can be viewed and executed in Google Colab: just pick any of the included notebooks and click the accompanying Colab badge.
If you want to install Maze locally, make sure PyTorch is installed and then get the latest released version of Maze as follows:
```bash
pip install -U maze-rl

# optional: install RLlib if you want to use it in combination with Maze
pip install "ray[rllib]" tensorflow
```
Read more about other installation options, such as installing the latest development version.
⚡ We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D cannot easily be installed on newer Python versions. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually.
Alternatively, you can work with Maze in a container with a pre-installed JupyterLab. Run

```bash
docker run -p 8888:8888 enliteai/maze:playground
```

and open localhost:8888 in your browser to access JupyterLab.
To see Maze in action, check out a first example. Training and deploying your agent is as simple as can be:
```python
from maze.api.run_context import RunContext
from maze.core.wrappers.maze_gym_env_wrapper import GymMazeEnv

rc = RunContext(env=lambda: GymMazeEnv('CartPole-v0'), algorithm="ppo")
rc.train(n_epochs=50)

# Run trained policy.
env = GymMazeEnv('CartPole-v0')
obs = env.reset()
done = False
while not done:
    action = rc.compute_action(obs)
    obs, reward, done, info = env.step(action)
```
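In this example, `GymMazeEnv` wraps a standard Gym environment into a Maze environment, and `RunContext` bundles training and inference behind a single object: the same `rc` that ran the training also serves the trained policy via `compute_action`.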
Step-by-Step Tutorial
The documentation is the starting point to learn more about the underlying concepts and, most importantly, provides code snippets and minimum working examples to get you started quickly.
The Workflow section guides you through typical tasks in an RL project.
Policy and Value Networks introduces the Perception Module, shows how to customize action spaces and the underlying action probability distributions, and covers two styles of policy and value network construction.
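To give a flavor of the dictionary-style observations and actions such networks work with, here is a minimal sketch of a policy network in plain PyTorch. This is an illustrative stand-in, not Maze's actual model interface; the class name, observation key, action key, and layer sizes are assumptions chosen for CartPole.

```python
import torch
import torch.nn as nn

class CartPolePolicyNet(nn.Module):
    """Illustrative policy network mapping a dict of observations
    to a dict of action logits (all names here are hypothetical)."""

    def __init__(self, obs_dim: int = 4, hidden_dim: int = 32, n_actions: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, obs: dict) -> dict:
        # The logits are later turned into an action probability distribution.
        return {"action": self.net(obs["observation"])}

net = CartPolePolicyNet()
logits = net({"observation": torch.randn(1, 4)})["action"]  # shape (1, 2)
```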
Learn more about core concepts and structures, such as the Maze environment hierarchy and the Maze event system, which provides a convenient way to collect statistics and KPIs, enables flexible reward formulation, and supports offline analysis.
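The snippet below is a rough, framework-agnostic sketch of the publish/subscribe idea behind such an event system, not Maze's actual event API; all names (`EventBus`, `piece_discarded`, the payload) are hypothetical. The point is that the environment fires events while statistics collection and reward computation consume them independently.

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe hub (illustrative only, not Maze's API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, **payload):
        for callback in self._subscribers[topic]:
            callback(**payload)

bus = EventBus()
stats = defaultdict(int)
reward_terms = []

def count_discarded(size):
    stats["pieces_discarded"] += 1  # KPI / statistics aggregation

def penalize_discarded(size):
    reward_terms.append(-size)      # reward shaping from the same event

bus.subscribe("piece_discarded", count_discarded)
bus.subscribe("piece_discarded", penalize_discarded)

# Fired from within env.step(); the env knows nothing about its consumers.
bus.publish("piece_discarded", size=3)
print(stats["pieces_discarded"], reward_terms)  # 1 [-3]
```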
Structured Environments and Action Masking introduces a general concept that can greatly improve the performance of trained agents in practical RL problems.
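The core mechanism behind action masking is framework-agnostic: the logits of invalid actions are pushed to minus infinity before the probability distribution is formed, so masked actions receive exactly zero probability. Below is a minimal sketch in plain PyTorch (not Maze's masking API; the function name is hypothetical).

```python
import torch

def masked_action_distribution(logits: torch.Tensor, mask: torch.Tensor):
    """Exclude invalid actions by pushing their logits to -inf.

    logits: (batch, n_actions) raw policy outputs
    mask:   (batch, n_actions) boolean, True where the action is valid
    """
    masked_logits = logits.masked_fill(~mask, float("-inf"))
    return torch.distributions.Categorical(logits=masked_logits)

logits = torch.tensor([[1.0, 2.0, 0.5]])
mask = torch.tensor([[True, False, True]])  # action 1 is currently invalid
dist = masked_action_distribution(logits, mask)
print(dist.probs)  # probability of the masked action is exactly 0
```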
We believe in Open Source principles and aim to transition Maze into a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.