imitation

Clean PyTorch implementations of imitation learning algorithms

Showing:

Popularity

Downloads/wk

0

GitHub Stars

313

Maintenance

Last Commit

2d ago

Contributors

10

Package

Dependencies

0

License

MIT

Categories

Readme

CircleCI Documentation Status codecov PyPI version

Imitation Learning Baseline Implementations

This project aims to provide clean implementations of imitation learning algorithms. Currently we have implementations of Behavioral Cloning, DAgger (with synthetic examples), Adversarial Inverse Reinforcement Learning, and Generative Adversarial Imitation Learning.

Installation:

Installing PyPI release

pip install imitation

Install latest commit

git clone http://github.com/HumanCompatibleAI/imitation
cd imitation
pip install -e .

Optional Mujoco Dependency:

Follow instructions to install mujoco_py v1.5 here.

CLI Quickstart:

We provide several CLI scripts as a front-end to the algorithms implemented in imitation. These use Sacred for configuration and replicability.

From examples/quickstart.sh:

# Train PPO agent on pendulum and collect expert demonstrations. Tensorboard logs saved in `quickstart/rl/`
python -m imitation.scripts.expert_demos with pendulum fast log_dir=quickstart/rl/

# Train GAIL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial with gail pendulum fast rollout_path=quickstart/rl/rollouts/final.pkl

# Train AIRL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial with airl pendulum fast rollout_path=quickstart/rl/rollouts/final.pkl

Tips:

  • Remove the "fast" option from the commands above to allow training run to completion.
  • python -m imitation.scripts.expert_demos print_config will list Sacred script options. These configuration options are documented in each script's docstrings.

For more information on how to configure Sacred CLI options, see the Sacred docs.

Python Interface Quickstart:

See examples/quickstart.py for an example script that loads CartPole-v1 demonstrations and trains BC, GAIL, and AIRL models on that data.

Density reward baseline

We also implement a density-based reward baseline. You can find an example notebook here.

Citations (BibTeX)

@misc{wang2020imitation,
  author = {Wang, Steven and Toyer, Sam and Gleave, Adam and Emmons, Scott},
  title = {The {\tt imitation} Library for Imitation Learning and Inverse Reinforcement Learning},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/HumanCompatibleAI/imitation}},
}

Contributing

See CONTRIBUTING.md.

Rate & Review

Great Documentation0
Easy to Use0
Performant0
Highly Customizable0
Bleeding Edge0
Responsive Maintainers0
Poor Documentation0
Hard to Use0
Slow0
Buggy0
Abandoned0
Unwelcoming Community0
100