I want to play with the OpenAI gyms in a notebook, with the gym being rendered inline.
Here\'s a basic example:
import matplotlib.pyplot as plt
impor
I made a working example here that you can fork: https://kyso.io/eoin/openai-gym-jupyter with two examples of rendering in Jupyter - one as an mp4, and another as a realtime gif.
The .mp4 example is quite simple.
import gym
from gym import wrappers
env = gym.make('SpaceInvaders-v0')
env = wrappers.Monitor(env, "./gym-results", force=True)
env.reset()
for _ in range(1000):
action = env.action_space.sample()
observation, reward, done, info = env.step(action)
if done: break
env.close()
Then in a new cell
import io
import base64
from IPython.display import HTML
video = io.open('./gym-results/openaigym.video.%s.video000000.mp4' % env.file_infix, 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''
<video width="360" height="auto" alt="test" controls><source src="data:video/mp4;base64,{0}" type="video/mp4" /></video>'''
.format(encoded.decode('ascii')))
This worked for me in Ubuntu 18.04 LTS, to render gym locally. But, I believe it will work even in remote Jupyter Notebook servers.
First, run the following installations in Terminal:
pip install gym
python -m pip install pyvirtualdisplay
pip3 install box2d
sudo apt-get install xvfb
That's just it. Use the following snippet to configure how your matplotlib should render :
import matplotlib.pyplot as plt
from pyvirtualdisplay import Display
display = Display(visible=0, size=(1400, 900))
display.start()
is_ipython = 'inline' in plt.get_backend()
if is_ipython:
from IPython import display
plt.ion()
# Load the gym environment
import gym
import matplotlib.pyplot as plt
%matplotlib inline
env = gym.make('LunarLander-v2')
env.seed(23)
# Let's watch how an untrained agent moves around
state = env.reset()
img = plt.imshow(env.render(mode='rgb_array'))
for j in range(200):
# action = agent.act(state)
action = random.choice(range(4))
img.set_data(env.render(mode='rgb_array'))
plt.axis('off')
display.display(plt.gcf())
display.clear_output(wait=True)
state, reward, done, _ = env.step(action)
if done:
break
env.close()