How to run OpenAI Gym .render() over a server


Got a simple solution working:

If on a Linux server, open Jupyter with
$ xvfb-run -s "-screen 0 1400x900x24" jupyter notebook 
In Jupyter
import matplotlib.pyplot as plt
%matplotlib inline
from IPython import display
After each step
def show_state(env, step=0, info=""):
    plt.figure(3)
    plt.clf()
    plt.imshow(env.render(mode='rgb_array'))
    plt.title("%s | Step: %d %s" % (env._spec.id, step, info))
    plt.axis('off')

    display.clear_output(wait=True)
    display.display(plt.gcf())

Note: if your environment is not unwrapped, pass env.env to show_state.
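For example, a hypothetical usage sketch (the environment name is arbitrary, and this assumes show_state as defined above):

# Hypothetical usage sketch for show_state; CartPole-v0 is just an example.
import gym

env = gym.make('CartPole-v0')  # gym.make returns a wrapped env
env.reset()
for step in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    show_state(env.env, step)  # env is wrapped here, so pass env.env
    if done:
        env.reset()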

This GitHub issue gave an answer that worked great for me. It's nice because it doesn't require any additional dependencies (I assume you already have matplotlib) or configuration of the server.

Just run, e.g.:

import gym
import matplotlib.pyplot as plt
%matplotlib inline

env = gym.make('Breakout-v0')  # insert your favorite environment
render = lambda: plt.imshow(env.render(mode='rgb_array'))
env.reset()
render()

Using mode='rgb_array' gives you back a numpy.ndarray with the RGB values for each position, and matplotlib's imshow (or other methods) displays these nicely.
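For instance, a quick sanity check (using the env from the snippet above; the exact shape depends on the environment):

frame = env.render(mode='rgb_array')
print(frame.shape, frame.dtype)  # e.g. (210, 160, 3) uint8 for Atari envs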

Note that if you're rendering multiple times in the same cell, this solution will plot a separate image each time. This is probably not what you want. I'll try to update this if I figure out a good workaround for that.

Update to render multiple times in one cell

Based on this StackOverflow answer, here's a working snippet (note that there may be more efficient ways to do this with an interactive plot; this way seems a little laggy on my machine):

import gym
from IPython import display
import matplotlib.pyplot as plt
%matplotlib inline

env = gym.make('Breakout-v0')
env.reset()
for _ in range(100):
    plt.imshow(env.render(mode='rgb_array'))
    display.display(plt.gcf())
    display.clear_output(wait=True)
    action = env.action_space.sample()
    env.step(action)

Update to increase efficiency

On my machine, this was about 3x faster. The difference is that instead of calling imshow each time we render, we just change the RGB data on the original plot.

import gym
from IPython import display
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

env = gym.make('Breakout-v0')
env.reset()
img = plt.imshow(env.render(mode='rgb_array'))  # only call this once
for _ in range(100):
    img.set_data(env.render(mode='rgb_array'))  # just update the data
    display.display(plt.gcf())
    display.clear_output(wait=True)
    action = env.action_space.sample()
    env.step(action)

I managed to run and render openai/gym (even with mujoco) remotely on a headless server.

# Install and configure X window with a virtual screen
sudo apt-get install xserver-xorg libglu1-mesa-dev freeglut3-dev mesa-common-dev libxmu-dev libxi-dev

# Configure the nvidia-x
sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024

# Run the virtual screen in the background (:0)
# We only need to set up the virtual screen once
sudo /usr/bin/X :0 &

# Run the program with the virtual screen
DISPLAY=:0 <program>

# If you don't want to type `DISPLAY=:0` every time
export DISPLAY=:0

Usage:

DISPLAY=:0 ipython2 

Example:

import gym

env = gym.make('Ant-v1')
arr = env.render(mode='rgb_array')
print(arr.shape)
# plot or save wherever you want
# plt.imshow(arr) or scipy.misc.imsave('sample.png', arr)
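One caveat: scipy.misc.imsave was deprecated and later removed from SciPy (in 1.2), so on newer setups a sketch of the same save using imageio (assuming it is installed) would be:

import imageio  # pip install imageio
imageio.imwrite('sample.png', arr)  # stands in for the removed scipy.misc.imsave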

I think we should just capture renders as video by using OpenAI Gym's wrappers.Monitor and then display it within the notebook.

Example:

Dependencies

!apt install python-opengl
!apt install ffmpeg
!apt install xvfb
!pip3 install pyvirtualdisplay

# Virtual display
from pyvirtualdisplay import Display

virtual_display = Display(visible=0, size=(1400, 900))
virtual_display.start()

Capture as video

import gym
from gym import wrappers

env = gym.make("SpaceInvaders-v0")
env = wrappers.Monitor(env, "/tmp/SpaceInvaders-v0")

for episode in range(2):
    observation = env.reset()
    step = 0
    total_reward = 0

    while True:
        step += 1
        env.render()
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            print("Episode: {0},\tSteps: {1},\tscore: {2}"
                  .format(episode, step, total_reward))
            break
env.close()

Display within Notebook

import os
import io
import base64
from IPython.display import display, HTML


def ipython_show_video(path):
    """Show a video at `path` within IPython Notebook."""
    if not os.path.isfile(path):
        raise NameError("Cannot access: {}".format(path))

    video = io.open(path, 'r+b').read()
    encoded = base64.b64encode(video)

    display(HTML(
        data="""
        <video alt="test" controls>
        <source src="data:video/mp4;base64,{0}" type="video/mp4" />
        </video>
        """.format(encoded.decode('ascii'))
    ))


ipython_show_video("/tmp/SpaceInvaders-v0/openaigym.video.4.10822.video000000.mp4")

I hope it helps. ;)

There's also this solution using pyvirtualdisplay (an Xvfb wrapper). One thing I like about this solution is that you can launch it from inside your script, instead of having to wrap it at launch:

from pyvirtualdisplay import Display

display = Display(visible=0, size=(1400, 900))
display.start()
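For example, a minimal end-to-end sketch (the environment name is arbitrary):

import gym
from pyvirtualdisplay import Display

display = Display(visible=0, size=(1400, 900))
display.start()  # starts Xvfb and points $DISPLAY at it

env = gym.make('CartPole-v0')
env.reset()
frame = env.render(mode='rgb_array')  # now works without a real X server
print(frame.shape)

display.stop()  # clean up the virtual display when done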

I ran into this myself. Using xvfb as the X server somehow clashes with the Nvidia drivers, but this post finally pointed me in the right direction: Xvfb works without any problems if you install the Nvidia driver with the --no-opengl-files option and CUDA with the --no-opengl-libs option. Once you know this, it should work. It took me quite some time to figure out, and it seems I'm not the only one running into problems with xvfb and the Nvidia drivers.

I wrote down all necessary steps to set everything up on an AWS EC2 instance with Ubuntu 16.04 LTS here.

Referencing my other answer here: Display OpenAI gym in Jupyter notebook only

I made a quick working example here which you could fork: https://kyso.io/eoin/openai-gym-jupyter with two examples of rendering in Jupyter, one as an mp4 and another as a realtime gif (a rough sketch of the gif approach appears at the end of this answer).

The .mp4 example is quite simple.

import gym
from gym import wrappers

env = gym.make('SpaceInvaders-v0')
env = wrappers.Monitor(env, "./gym-results", force=True)
env.reset()
for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    if done:
        break
env.close()

Then play it back in a new Jupyter cell, or download the video from the server to somewhere you can view it.

import io
import base64
from IPython.display import HTML

video = io.open('./gym-results/openaigym.video.%s.video000000.mp4'
                % env.file_infix, 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''
    <video width="360" height="auto" alt="test" controls>
        <source src="data:video/mp4;base64,{0}" type="video/mp4" />
    </video>'''.format(encoded.decode('ascii')))

If you're on a server with public access, you could run python -m http.server in the gym-results folder and just watch the videos there.
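As for the realtime gif example mentioned above, it isn't reproduced here, but a rough sketch of the idea (my own minimal version, not the linked notebook's code) is to collect rgb_array frames and write them out with imageio:

import gym
import imageio  # pip install imageio

env = gym.make('SpaceInvaders-v0')
env.reset()
frames = []
for _ in range(200):
    frames.append(env.render(mode='rgb_array'))
    observation, reward, done, info = env.step(env.action_space.sample())
    if done:
        break
env.close()
imageio.mimsave('./episode.gif', frames, fps=30)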

I had the same problem, and I_like_foxes' solution of reinstalling the Nvidia drivers without OpenGL fixed things. Here are the commands I used for Ubuntu 16.04 and a GTX 1080 Ti: https://gist.github.com/8enmann/931ec2a9dc45fde871d2139a7d1f2d78

I avoided the issues with matplotlib by simply using PIL, the Python Imaging Library:

import gym
import PIL.Image

env = gym.make('SpaceInvaders-v0')
array = env.reset()
PIL.Image.fromarray(env.render(mode='rgb_array'))

I found that I didn't need to set up the X virtual frame buffer (Xvfb).

I was looking for a solution that works in Colaboratory and ended up with this

from IPython import display
import numpy as np
import time
import io

import gym
import PIL.Image

env = gym.make('SpaceInvaders-v0')
env.reset()


def showarray(a, fmt='png'):
    # Encode an RGB array as PNG bytes for IPython.display.Image.
    a = np.uint8(a)
    f = io.BytesIO()
    PIL.Image.fromarray(a).save(f, fmt)
    return f.getvalue()


imagehandle = display.display(
    display.Image(data=showarray(env.render(mode='rgb_array')), width=450),
    display_id='gymscr')

while True:
    time.sleep(0.01)
    env.step(env.action_space.sample())  # take a random action
    display.update_display(
        display.Image(data=showarray(env.render(mode='rgb_array')), width=450),
        display_id='gymscr')

EDIT 1:

You could use xvfbwrapper for the Cartpole environment.

from IPython import display
from xvfbwrapper import Xvfb
import numpy as np
import time
import pyglet
import gym
import PIL.Image
import io

vdisplay = Xvfb(width=1280, height=740)
vdisplay.start()

env = gym.make('CartPole-v0')
env.reset()


def showarray(a, fmt='png'):
    # Encode an RGB array as PNG bytes for IPython.display.Image.
    a = np.uint8(a)
    f = io.BytesIO()
    PIL.Image.fromarray(a).save(f, fmt)
    return f.getvalue()


imagehandle = display.display(
    display.Image(data=showarray(env.render(mode='rgb_array')), width=450),
    display_id='gymscr')

for _ in range(1000):
    time.sleep(0.01)
    observation, reward, done, info = env.step(env.action_space.sample())  # take a random action
    display.update_display(
        display.Image(data=showarray(env.render(mode='rgb_array')), width=450),
        display_id='gymscr')

vdisplay.stop()

If you're working with standard Jupyter, there's a better solution though. You can use the CommManager to send messages with updated Data URLs to your HTML output.

IPython Inline Screen Example
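For reference, here is a rough sketch of that idea. This is my own minimal version, not the linked example's code; the comm target name 'gymscr' and the img element id are made up. Register the JS comm target in one cell, then open the comm and push frames from another:

import base64
import io

import PIL.Image
from IPython.display import HTML, Javascript, display
from ipykernel.comm import Comm

# Cell 1: an <img> placeholder plus a JS comm target that swaps in new data URLs.
display(HTML('<img id="gymscr_img" width="450"/>'))
display(Javascript("""
Jupyter.notebook.kernel.comm_manager.register_target('gymscr', function (comm) {
    comm.on_msg(function (msg) {
        document.getElementById('gymscr_img').src = msg.content.data.src;
    });
});
"""))

# Cell 2 (run after the cell above has rendered): open the comm and push frames.
comm = Comm(target_name='gymscr')

def push_frame(frame):
    # Encode an RGB array as a PNG data URL and send it to the frontend.
    buf = io.BytesIO()
    PIL.Image.fromarray(frame).save(buf, 'png')
    src = 'data:image/png;base64,' + base64.b64encode(buf.getvalue()).decode('ascii')
    comm.send({'src': src})

You can then call push_frame(env.render(mode='rgb_array')) inside your training loop to update the image in place.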

In Colab the CommManager is not available. The more restrictive output module has a method called eval_js() which seems to be kind of slow.

This might be something of a workaround, but I used a Docker image with a desktop environment, and it works great. The Docker image is at https://hub.docker.com/r/dorowu/ubuntu-desktop-lxde-vnc/

The command to run is

docker run -p 6080:80 dorowu/ubuntu-desktop-lxde-vnc 

Then browse http://127.0.0.1:6080/ to access the Ubuntu desktop.

Below is a GIF showing the Mario Bros gym environment running and being rendered. As you can see, it is fairly responsive and smooth.

In my IPython environment, Andrew Schreiber's solution can't plot images smoothly. The following is my solution:

If on a Linux server, open Jupyter with

$ xvfb-run -s "-screen 0 1400x900x24" jupyter notebook 

In Jupyter

import matplotlib.pyplot as plt
%matplotlib inline
%matplotlib notebook
from IPython import display

Display iteration:

import time

done = False
obs = env.reset()

fig = plt.figure()
ax = fig.add_subplot(111)
plt.ion()

fig.show()
fig.canvas.draw()

while not done:
    # action = pi.act(True, obs)[0]  # pi is a policy that produces an action, if you have one
    # obs, reward, done, info = env.step(action)  # apply the action, if you have one
    env_rnd = env.render(mode='rgb_array')
    ax.clear()
    ax.imshow(env_rnd)
    fig.canvas.draw()
    time.sleep(0.01)