metadata
library_name: stable-baselines3
tags:
- BipedalWalker-v3
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: BipedalWalker-v3
      type: BipedalWalker-v3
    metrics:
    - type: mean_reward
      value: 13.49 +/- 51.00
      name: mean_reward
      verified: false
PPO Agent playing BipedalWalker-v3
This is a trained model of a PPO agent playing BipedalWalker-v3 using the stable-baselines3 library.
Usage (with Stable-baselines3)
TODO: Add your code
from stable_baselines3 import ...
from huggingface_sb3 import load_from_hub
...
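A minimal sketch of how a checkpoint like this one could be loaded and evaluated, given as an illustration rather than the author's exact procedure. The `repo_id` and `filename` arguments are placeholders that you must supply yourself; this card does not state them. The helpers `load_and_evaluate` and `format_result` are hypothetical names introduced for this example. The sketch assumes a recent stable-baselines3 release that uses `gymnasium`, with the Box2D extra installed for BipedalWalker-v3.

```python
def load_and_evaluate(repo_id, filename, n_eval_episodes=10):
    """Download a PPO checkpoint from the Hub and evaluate it.

    repo_id and filename are placeholders supplied by the caller;
    they are not given in this model card.
    """
    # Imported lazily so the formatting helper below works even when
    # the RL dependencies are not installed.
    import gymnasium as gym  # assumption: SB3 >= 2.0, gymnasium[box2d]
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    # Fetch the zipped checkpoint and get its local path.
    checkpoint = load_from_hub(repo_id=repo_id, filename=filename)
    model = PPO.load(checkpoint)

    env = gym.make("BipedalWalker-v3")
    try:
        # Returns the mean and standard deviation of episode rewards.
        mean_reward, std_reward = evaluate_policy(
            model, env, n_eval_episodes=n_eval_episodes, deterministic=True
        )
    finally:
        env.close()
    return mean_reward, std_reward


def format_result(mean_reward, std_reward):
    """Format a result the way the metadata above reports mean_reward."""
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"
```

Calling `load_and_evaluate(...)` with your own repository id and checkpoint filename, then passing the two returned values to `format_result`, gives a string in the same form as the `mean_reward` value in the metadata. Results will vary between runs because the evaluation episodes are not seeded here.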
This model was trained with tuned hyperparameters for 100 million timesteps, and it still falls into the holes, which suggests that basic PPO has reached its limits on this challenge.