ImportError: No module named numpy - Google Cloud Dataproc when using Jupyter Notebook

Submitted by ぃ、小莉子 on 2019-12-05 19:12:30

I found a solution.

    import os
    import sys

    # Make the system-wide Python 2.7 packages visible to the notebook kernel.
    sys.path.append('/usr/lib/python2.7/dist-packages')

    # Install the packages with apt; -y answers the confirmation prompt.
    os.system("sudo apt-get install python-pandas -y")
    os.system("sudo apt-get install python-numpy -y")
    os.system("sudo apt-get install python-scipy -y")
    os.system("sudo apt-get install python-sklearn -y")

    import pandas
    import numpy
    import scipy
    import sklearn

If anyone has a more elegant solution, please let me know.
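
Before installing anything, it can help to check which interpreter the notebook kernel is running and which of these packages it can already import. The following is only a minimal sketch and not part of the original answer; the helper name find_missing is hypothetical, and the code is written to run on both Python 2.7 and Python 3.

```python
import sys


def find_missing(modules):
    """Return the names in `modules` that this interpreter cannot import."""
    missing = []
    for name in modules:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # sys.executable is the Python binary the kernel is actually using.
    print("Kernel interpreter: %s" % sys.executable)
    print("Missing: %s" % find_missing(["pandas", "numpy", "scipy", "sklearn"]))
```

If the interpreter printed here is not the system Python under /usr, packages installed with apt-get will not be visible without the sys.path change shown above.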

Try conda install numpy, since Google's Jupyter init script uses conda. Personally, I prefer to maintain my own init scripts so that I have more control.
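
If you run the conda install from inside the notebook, the package has to land in the same environment the kernel imports from. A rough sketch of one way to do that is below. It assumes the conda executable is on PATH for the kernel process, which may not hold on every cluster, and the function names are hypothetical.

```python
import subprocess
import sys


def conda_install_command(package, conda="conda"):
    """Build a conda command that targets the kernel's own environment."""
    # sys.prefix is the root of the environment this interpreter runs from;
    # --yes skips the interactive confirmation prompt.
    return [conda, "install", "--yes", "--prefix", sys.prefix, package]


def conda_install(package):
    """Run the install; raises CalledProcessError if conda exits non-zero."""
    subprocess.check_call(conda_install_command(package))
```

Calling conda_install("numpy") in a notebook cell and then restarting the kernel should make the import available, provided the assumption about PATH holds.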

Did you try pip install ipython[numpy]?

    #!/usr/bin/env bash
    set -e

    ROLE=$(curl -f -s -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
    INIT_ACTIONS_REPO=$(curl -f -s -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/INIT_ACTIONS_REPO || true)
    INIT_ACTIONS_REPO="${INIT_ACTIONS_REPO:-https://github.com/GoogleCloudPlatform/dataproc-initialization-actions.git}"
    INIT_ACTIONS_BRANCH=$(curl -f -s -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/INIT_ACTIONS_BRANCH || true)
    INIT_ACTIONS_BRANCH="${INIT_ACTIONS_BRANCH:-master}"
    DATAPROC_BUCKET=$(curl -f -s -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-bucket)

    echo "Cloning fresh dataproc-initialization-actions from repo $INIT_ACTIONS_REPO and branch $INIT_ACTIONS_BRANCH..."
    git clone -b "$INIT_ACTIONS_BRANCH" --single-branch "$INIT_ACTIONS_REPO"
    # Ensure we have conda installed.
    ./dataproc-initialization-actions/conda/bootstrap-conda.sh
    #./dataproc-initialization-actions/conda/install-conda-env.sh

    source /etc/profile.d/conda_config.sh

    if [[ "${ROLE}" == 'Master' ]]; then
        conda install jupyter

        if gsutil -q stat "gs://$DATAPROC_BUCKET/notebooks/**"; then
            echo "Pulling notebooks directory to cluster master node..."
            gsutil -m cp -r "gs://$DATAPROC_BUCKET/notebooks" /root/
        fi

        ./dataproc-initialization-actions/jupyter/internal/setup-jupyter-kernel.sh
        ./dataproc-initialization-actions/jupyter/internal/launch-jupyter-kernel.sh
    fi

    if gsutil -q stat "gs://$DATAPROC_BUCKET/scripts/**"; then
        echo "Pulling scripts directory to cluster master and worker nodes..."
        gsutil -m cp -r "gs://$DATAPROC_BUCKET/scripts/*" /usr/local/bin/miniconda/lib/python2.7
    fi

    if gsutil -q stat "gs://$DATAPROC_BUCKET/modules/**"; then
        echo "Pulling modules directory to cluster master and worker nodes..."
        gsutil -m cp -r "gs://$DATAPROC_BUCKET/modules/*" /usr/local/bin/miniconda/lib/python2.7
    fi

    echo "Completed installing Jupyter!"

    # Install Jupyter extensions (if desired)
    # TODO: document this in readme
    # -v tests whether the named variable is set, so it takes the name without "$".
    if [[ ! -v INSTALL_JUPYTER_EXT ]]; then
        INSTALL_JUPYTER_EXT=false
    fi
    if [[ "$INSTALL_JUPYTER_EXT" = true ]]; then
        echo "Installing Jupyter Notebook extensions..."
        ./dataproc-initialization-actions/jupyter/internal/bootstrap-jupyter-ext.sh
        echo "Jupyter Notebook extensions installed!"
    fi
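
The script above reads cluster settings such as dataproc-role from the instance metadata server with curl. For anyone who wants to do the same lookup from a notebook cell, here is a hedged Python sketch of that request. The URL and header are copied from the script; the function names and the 5-second timeout are my own choices, and it only works on a machine that can reach the metadata host.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2

METADATA_URL = "http://metadata/computeMetadata/v1/instance/attributes/"


def metadata_request(attribute):
    """Build the request the script issues with curl for one attribute."""
    return Request(METADATA_URL + attribute,
                   headers={"Metadata-Flavor": "Google"})


def read_attribute(attribute, default=None):
    """Return the attribute value, or `default` if the lookup fails.

    The fallback mirrors the script's `|| true` plus "${VAR:-default}" pattern.
    """
    try:
        response = urlopen(metadata_request(attribute), timeout=5)
        try:
            return response.read().decode("utf-8")
        finally:
            response.close()
    except IOError:  # covers URLError, HTTPError and socket timeouts
        return default
```

For example, read_attribute("INIT_ACTIONS_BRANCH", "master") follows the same default the script uses when the attribute is not set.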