PPO Agent playing BipedalWalker-v3

This is a trained model of a PPO agent playing BipedalWalker-v3 using the stable-baselines3 library.

Usage (with Stable-baselines3)

TODO: Add your code

from stable_baselines3 import ...
from huggingface_sb3 import load_from_hub

...
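The stub above leaves the loading code unfinished. As a minimal sketch, assuming a recent stable-baselines3 release built on the Gymnasium API (where `reset` returns `(obs, info)` and `step` returns five values) and a Box2D-enabled install, loading and running the agent could look like the following. `REPO_ID` and `FILENAME` are hypothetical placeholders, since this card does not state them, and the helper names `run_episode` and `load_and_run` are illustrative only.

```python
from typing import Any

# Hypothetical placeholders: this card does not state the repository id
# or the checkpoint file name, so substitute the real values.
REPO_ID = "<user>/<repo-name>"
FILENAME = "<checkpoint>.zip"


def run_episode(model: Any, env: Any, deterministic: bool = True) -> float:
    """Roll out one episode and return its undiscounted total reward."""
    obs, _info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        # predict() returns the action and an (unused here) recurrent state.
        action, _state = model.predict(obs, deterministic=deterministic)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        done = terminated or truncated
    return total_reward


def load_and_run(repo_id: str = REPO_ID, filename: str = FILENAME) -> float:
    """Download the checkpoint, load it as PPO, and play one episode."""
    # Imported lazily so run_episode stays usable without the RL stack.
    import gymnasium as gym
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    model = PPO.load(checkpoint_path)
    env = gym.make("BipedalWalker-v3")
    try:
        return run_episode(model, env)
    finally:
        env.close()
```

Calling `load_and_run()` with the real repository id and file name should return the total reward of one episode. A checkpoint saved with an older stable-baselines3 version may need extra care when loaded with a newer one.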

This one was trained with tuned hyperparameters for 100 million timesteps, and it still falls in the holes, suggesting that we have reached the limits of basic PPO for this challenge.

Evaluation results

mean_reward on BipedalWalker-v3 (self-reported): 13.49 +/- 51.00
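The card does not say how many episodes were evaluated or how the figure was computed. Assuming the common convention of a mean and population standard deviation over per-episode returns (the values that stable-baselines3's `evaluate_policy` reports), a summary in this format could be produced roughly as follows. The helper name `summarize_returns` is illustrative, not part of any library.

```python
from statistics import fmean, pstdev
from typing import Sequence


def summarize_returns(episode_returns: Sequence[float]) -> str:
    """Format per-episode returns as 'mean +/- std' with two decimals."""
    mean = fmean(episode_returns)
    # Population standard deviation, matching NumPy's default std().
    std = pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"
```

With made-up returns, `summarize_returns([10.0, 20.0, 30.0])` gives `20.00 +/- 8.16`. In the reported figure the standard deviation is several times the mean, so individual episodes vary widely around the average.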
