This is very close to this question, but I have added a few details specific to my question:
Matplotlib Plotting using AWS-EMR jupyter notebook
I would like
Try below code. FYI we have matplotlib 3.1.1 installed in Python3.6 on emr-5.26.0 and i used PySpark Kernel. Make sure that "%matplotlib inline" is first line in cell
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()
Import matplotlib as
import matplotlib.pyplot as plt
and use the magic command %matplot plt
instead as shown in the tutorial here: https://aws.amazon.com/de/blogs/big-data/install-python-libraries-on-a-running-cluster-with-emr-notebooks/
The answer by @00schneider actually works.
import matplotlib.pyplot as plt
# plot data here
plt.show()
after
plt.show()
re-run the magic cell that contains the below, and you will see a plot on your AWS EMR Jupyter PySpark notebook
%matplot plt
As you mentioned, matplotlib
is not installed on the EMR cluster, therefore such error will occur:
However, it is actually available in the managed Jupyter notebook instance (the docker container). Using the %%local
magic will allow you to run the cell locally:
The following should work:
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
Run the entire script in one cell