Watson Studio ImportError: No module named 'pydotplus'

Question

Environment: Watson Studio, Python 3.5 with Spark. Python notebook: https://gist.github.com/anonymous/ea77f500b4fd80feb69fadb470fca235

This part of the notebook raises the error:

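from IPython.display import Image
from sklearn import tree  # provides tree.export_graphviz
import pydotplus  # the ImportError is raised on this line

# regr and X_train are defined earlier in the notebook
dot_data = tree.export_graphviz(regr, out_file=None, feature_names=X_train.columns.values, filled=True)
graph = pydotplus.graph_from_dot_data(dot_data)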

It fails with: ImportError: No module named 'pydotplus'

Is there another environment that actually has this module installed? Or is there a way to install or add this Python module to the existing runtime?

Answer 1:

Found the answer in the IBM Cloud documentation.

https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/importing-libraries.html

Installing custom libraries and packages on Apache Spark

Last updated: March 1, 2019

When you associate Apache Spark with a notebook in Watson Studio, many preinstalled libraries are included. Before you install a library, check the list of preinstalled libraries. Run the appropriate command from a notebook cell:

Python: !pip list --isolated
R: installed.packages()
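
As a complement to scanning the pip list output, a single module can also be checked from Python. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and is not part of the IBM documentation.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current kernel.

    find_spec only locates the module; it does not import or run it.
    """
    return importlib.util.find_spec(module_name) is not None


# Check the module from the question before deciding whether to install it.
print("pydotplus available:", is_installed("pydotplus"))
```

If this prints False, the module is not visible to the kernel's Python and needs to be installed as described in the Python section of this answer.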

If the library that you want is not listed, or you want to use a Scala library in a notebook, use the steps in the following sections to install it. The format for library packages depends on the programming language.

To use a Scala library

Libraries for Scala notebooks are typically packaged as Java™ archive (JAR) files.

To cache a library temporarily

The libraries for a Scala notebook are not installed to the Spark service. Instead they are cached when they are downloaded and are only available for the time that the notebook runs.

To use a single library without dependencies, from a public web server:
    Locate the publicly available URL to the library that you want to install. If you create a custom library, you can post it to any publicly available repository, such as GitHub.

    Download the library you want to use in your notebook by running the following command in a code cell:

     %AddJar URL_to_jar_file  

To use a library with dependencies, from a public Maven repository:

    Add and import a library with all its dependencies by running the following command. You need the groupId, artifactId, and version of the dependency. For example:

     %AddDeps org.apache.spark spark-streaming-kafka_2.10 1.1.0 --transitive

To install a library permanently

You can install a library permanently to ~/data/libs/ if you want to make the files available to spark-submit jobs and Scala kernels, or want to access the files through Java bridges from other kernels, for example, to use JDBC drivers from Python or R.

The install path under ~/data/libs/ varies depending on the Scala version that the library requires:

Use ~/data/libs/ for libraries that work with any Scala version.
Use ~/data/libs/scala-2.11/ for libraries that require Scala 2.11. The Scala kernel for Spark 2.1 uses Scala 2.11.

To install a library:

Locate the publicly available URL to the library that you want to install.

Download the library you want to install permanently into ~/data/libs/ by running the following command in a Python notebook:

 !(cd ~/data/libs/ ; wget URL_to_jar_file)
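
The same download can be sketched in plain Python instead of a shell escape. This is only an illustration under stated assumptions: the helpers libs_dir and download_jar are hypothetical names, and the directory layout is taken from the two paths listed above.

```python
import os
import urllib.request


def libs_dir(scala_version=None):
    """Return ~/data/libs, or its scala-<version> subdirectory if given."""
    base = os.path.join(os.path.expanduser("~"), "data", "libs")
    if scala_version is None:
        return base
    return os.path.join(base, "scala-" + scala_version)


def download_jar(url, scala_version=None):
    """Download a JAR from a public URL into the matching libs directory."""
    target_dir = libs_dir(scala_version)
    os.makedirs(target_dir, exist_ok=True)
    # Use the last path segment of the URL as the file name.
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    target = os.path.join(target_dir, filename)
    urllib.request.urlretrieve(url, target)
    return target
```

For a library that requires Scala 2.11, download_jar(URL_to_jar_file, "2.11") would place the file under ~/data/libs/scala-2.11/, where URL_to_jar_file stands for the same placeholder used in the command above.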

To install a Python library

Use the Python pip package installer command to install Python libraries to your notebook. For example, run the following command in a code cell to install the prettyplotlib library:

 !pip install --user prettyplotlib

The --user flag installs the library for personal usage rather than the global default. The installed packages can be used by all notebooks that use the same Python version in the Spark service.

Use the Python import command to import the library components. For example, run the following command in a code cell:

 import prettyplotlib as ppl

Restart the kernel.
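
Applied to the question, the same steps would install pydotplus instead of prettyplotlib. The sketch below does this from Python rather than with the !pip shell escape; the function names are hypothetical, and it assumes the kernel's interpreter has pip available and allows --user installs.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", "--user", package]


def install(package):
    """Run the pip command; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package))


# Show the command that would be run for the module from the question.
print(" ".join(pip_install_command("pydotplus")))
```

Running install("pydotplus") in a cell and then restarting the kernel should make import pydotplus succeed. Note that rendering the graph to an image also relies on the Graphviz dot executable being present in the runtime, which pip does not install.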

To load an R package

Use the R install.packages() function to install new R packages. For example, run the following command in a code cell to install the ggplot2 package for plotting functions:

 install.packages("ggplot2")

The imported package can be used by all R notebooks running in the Spark service.

Use the R library() function to load the installed package. For example, run the following command in a code cell:

 library("ggplot2")

You can now call plotting functions from the ggplot2 package in your notebook.

Source: https://stackoverflow.com/questions/55524334/watson-studio-importerror-no-module-named-pydotplus

Tags

ibm-watson

watson-studio