No outside network access for Jupyter Notebook container spawned by JupyterHub

Submitted by 。_饼干妹妹 on 2020-02-02 16:08:17

Question


So, here is what I am trying to achieve:

  • A Jupyterhub server
  • When it is accessed by a user who is not logged in, it redirects to another web server (custom-coded in Django)
  • That web server uses OAuth to authenticate a user
  • And a notebook container is spawned.
  • This notebook container must be pre-populated with a token that is used by a custom library baked into the notebook Docker image to authenticate against a service.
  • The notebook container needs to be able to communicate with the web server for further interactions, such as retrieving results.

I have more or less achieved this except for the last part. I am getting a notebook server started but it has no access to the outside world. It can only access the Jupyter Hub (that's why it works!) and nothing else.

Here is my JupyterHub config relevant to the DockerSpawner (I'm leaving out the OAuth settings since these work as expected):

# Tell JupyterHub that we want Docker Spawner to be used.
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'

# And what image should be used by the Docker Spawner
c.DockerSpawner.image = 'jupyter/scipy-notebook:7a0c7325e470'

# The Hub must listen on all interfaces.
c.JupyterHub.hub_ip = '0.0.0.0'

# And this should be the address of the Hub API
c.JupyterHub.hub_connect_ip = 'jupyterhub'

# Ask containers to connect to this network so that they can
# communicate with the Hub.
c.DockerSpawner.network_name = 'djangodockerjupyterdemo_default'

# And let's not make a mess, remove user containers when done.
c.DockerSpawner.remove = True

# We need to set the Notebook Directory
notebook_dir = '/home/jovyan/work'
c.DockerSpawner.notebook_dir = notebook_dir

# Need to tell where to mount the volumes.
c.DockerSpawner.volumes = { 'jupyterhub-user-{username}': notebook_dir }

Please note that the djangodockerjupyterdemo_default network is created by docker-compose, which derives the name from the project directory. (I know this is not the best thing to do, but right now I'm just hoping to get a bare minimum example working.)

Here is my docker-compose:

version: "2"

services:
  database:
    image: "mysql:5.6"
    volumes:
      - ./data:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=test123
      - MYSQL_DATABASE=oauthserver
      - MYSQL_USER=oauthadmin
      - MYSQL_PASSWORD=test123
  webapp:
    image: auth_server:latest
    volumes:
      - ./:/app
    links:
      - database:database
    environment:
      - PYTHONUNBUFFERED=1
      - ENV=DEV
      - DATABASE_HOST=database
      - DATABASE_USER=oauthadmin
      - DATABASE_DBNAME=oauthserver
      - DATABASE_PASSWORD=test123
    hostname: oauthserver.ddi.in
  jupyterhub:
    image: "jupyterhub:test"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:rw"
      - "./jupyterhub:/srv/jupyterhub"
    environment:
      - OAUTH2_AUTHORIZE_URL=http://oauthserver.ddi.in:8000/o/authorize
      - OAUTH2_TOKEN_URL=http://oauthserver.ddi.in:8000/o/token/
    hostname: jhtest.ddi.in
    links:
      - webapp:oauthserver.ddi.in

I use https://hub.docker.com/r/defreitas/dns-proxy-server so that I can access the JupyterHub server at "http://jhtest.ddi.in:8000".

Now, once the containers are up, here is what I can confirm:

  • docker execing into webapp or jupyterhub containers and then wgeting a file from some place on the Internet works.
  • docker execing into the spawned Jupyter notebook container and doing the same doesn't. Same goes with trying to use requests.get() from inside the notebook.
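
The two checks above can be narrowed down a little further from inside any of the containers. The following is a minimal sketch, using only the Python standard library, that separates "the name does not resolve" from "the name resolves but the TCP connection fails". The function name check_endpoint is hypothetical and not part of JupyterHub or DockerSpawner.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Return (dns_ok, tcp_ok) for host:port.

    dns_ok is False when the name cannot be resolved at all.
    tcp_ok is True when at least one resolved address accepts a TCP connection.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Name resolution failed, so no connection attempt is possible.
        return (False, False)

    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return (True, True)
        except OSError:
            # Refused, unreachable, or timed out: try the next address.
            continue

    return (True, False)
```

Running something like check_endpoint("example.com", 443) in the spawned notebook container and again in the jupyterhub container would show whether the notebook fails at DNS or at routing. The host and port here are only placeholders.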

How can I make the spawned notebook access the outside world? It's critical for my use case (and I'm sure a reasonable expectation).

PS: I notice there are hardly any examples out there covering a JupyterHub OAuth setup with a custom Django application. I hope to publish my example publicly so that it can serve as a resource for the JupyterHub docs.


Answer 1:


You may want to make a very brief post at the Jupyter Discourse Forum, under the 'JupyterHub' category, highlighting this post to get more expert eyes on it.




Answer 2:


So I was able to find the solution. I summarize it below.

In docker-compose.yml, I added network_mode: bridge to every service. This puts the containers on Docker's default bridge network, which gives them access to the outside world. The cost is that the containers can no longer reach each other automatically by service name, but this is easily solved using links.

The next adjustment was to configure the DockerSpawner to create containers that use the default bridge network instead of some other network. The settings that help with this include:

c.DockerSpawner.network_name = 'bridge'
c.DockerSpawner.use_internal_ip = True
c.DockerSpawner.extra_host_config = {'network_mode': 'bridge'}

Also, since the notebook can no longer discover the main JupyterHub by service name, I set c.JupyterHub.hub_connect_ip to the hostname of the JupyterHub service. Note that the dns-proxy-server mentioned in my question helps resolve that hostname to the container IP.
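
Put together, the spawner and hub settings described in this answer might look like the sketch below. It wraps them in a small helper so that it can be tried outside JupyterHub. The helper name apply_bridge_settings is hypothetical. The example hostname jhtest.ddi.in is taken from the compose file in the question, and treating it as the value used here is an assumption.

```python
def apply_bridge_settings(c, hub_hostname):
    """Apply the bridge-network settings from this answer to a config object.

    `c` is expected to be the config object that JupyterHub provides inside
    jupyterhub_config.py; `hub_hostname` is the hostname under which spawned
    notebook containers can reach the Hub.
    """
    # Spawn user containers on Docker's default bridge network.
    c.DockerSpawner.network_name = "bridge"
    c.DockerSpawner.use_internal_ip = True
    c.DockerSpawner.extra_host_config = {"network_mode": "bridge"}

    # Service names no longer resolve on the default bridge, so point
    # the notebooks at the Hub by hostname instead.
    c.JupyterHub.hub_connect_ip = hub_hostname
    return c
```

Inside jupyterhub_config.py this would be called as apply_bridge_settings(c, "jhtest.ddi.in"), keeping the other settings from the question (image, volumes, notebook_dir) unchanged.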

Hope this helps someone out there. I will be posting the whole Django-OAuth-JupyterHub example soon on my blog.



Source: https://stackoverflow.com/questions/59871431/no-outside-network-access-for-jupyter-notebook-container-spawned-by-jupyterhub
