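The following steps only need to be performed on the server where Zeppelin is deployed.
1. Download Anaconda from https://www.continuum.io/downloads
2. Install it. I downloaded the Linux build with Python 2.7; installation is straightforward. I installed it under /usr/local/anaconda2.
3. Replace the system Python. You can check which Python version you have under /usr/bin/; mine was Python 2.6, so: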
mv /usr/bin/python /usr/bin/python2.6
ln -s /usr/local/anaconda2/bin/python2.7 /usr/bin/python
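After relinking, it can help to confirm where /usr/bin/python now points and which interpreter is actually running. The following is only a minimal sketch, assuming the paths from step 3; the helper name describe_python is hypothetical and not part of Zeppelin or Anaconda.

```python
import os
import sys


def describe_python(link_path="/usr/bin/python"):
    """Report where link_path points and which interpreter runs this script."""
    return {
        # realpath follows symlinks; a missing path is returned as-is
        "link_target": os.path.realpath(link_path),
        "running_executable": sys.executable,
        "running_version": "%d.%d" % sys.version_info[:2],
    }


if __name__ == "__main__":
    for key, value in sorted(describe_python().items()):
        print("%s: %s" % (key, value))
```

The same function body can also be pasted into a %pyspark paragraph after step 6 to see which Python the interpreter picked up.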
4. Add the following to /conf/zeppelin-env.sh:
export PYSPARK_PYTHON=/home/spark-1.6.0-bin-hadoop2.6/python
export PYTHONPATH=/home/spark-1.6.0-bin-hadoop2.6/python:/home/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip
5. Edit /etc/profile and add:
export PYTHONPATH=/home/spark-1.6.0-bin-hadoop2.6/python:/home/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip
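The PYTHONPATH value used in steps 4 and 5 is two entries separated by a single colon, with no spaces: Spark's python directory and the py4j zip inside python/lib. As a rough sketch, the value can be assembled from the Spark home directory; build_pythonpath is a hypothetical helper, and the py4j file name is assumed to match the one shipped with this Spark build, so check python/lib if your version differs.

```python
import posixpath


def build_pythonpath(spark_home, py4j_zip="py4j-0.9-src.zip"):
    """Join the two entries PySpark needs into one PYTHONPATH value."""
    python_dir = posixpath.join(spark_home, "python")
    py4j_path = posixpath.join(python_dir, "lib", py4j_zip)
    # Entries are separated by a single colon, with no spaces
    return ":".join([python_dir, py4j_path])


print("export PYTHONPATH=" + build_pythonpath("/home/spark-1.6.0-bin-hadoop2.6"))
```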
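6. Start Zeppelin and open its Interpreter page. In the spark interpreter settings, add spark.home
and modify zeppelin.pyspark.python (if you skipped step 3, entering the Anaconda installation directory here directly does not work).
7. Run conda install matplotlib (Anaconda includes matplotlib by default, so this can be skipped; packages that are missing can be installed the same way).
8. Create a new notebook: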
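To decide which packages still need conda install, one option is to try importing them first. This is only a sketch; missing_packages is a hypothetical helper, and the package list is simply the one the notebook in step 8 relies on.

```python
def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # numpy and matplotlib are the packages used in the notebook in step 8
    print(missing_packages(["numpy", "matplotlib"]))
```

Any name that is printed can then be passed to conda install.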
%pyspark
# These two lines must come first, before pyplot is imported
import matplotlib
matplotlib.use('Agg')
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import StringIO
# This helper is required; otherwise the chart is not displayed
def show(p):
    img = StringIO.StringIO()
    p.savefig(img, format='svg')
    # getvalue() returns everything written to the buffer
    print "%html <div style='width:600px'>" + img.getvalue() + "</div>"
mu = 100
sigma = 15
x = mu + sigma * np.random.randn(10000)
num_bins = 50
n, bins, patches = plt.hist(x, num_bins, normed=1, facecolor='green', alpha=0.5)
y = mlab.normpdf(bins, mu, sigma)
plt.plot(bins, y, 'r--')
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title(r'Histogram of IQ: $\mu=100$, $\sigma=15$')
plt.subplots_adjust(left=0.15)
show(plt)
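The show helper works because Zeppelin renders paragraph output that begins with %html as HTML, so the inline SVG appears as an image. Below is a minimal sketch of just that wrapping step, without matplotlib; wrap_svg is a hypothetical name and the sample SVG string is made up for illustration.

```python
def wrap_svg(svg_text, width_px=600):
    """Wrap SVG markup so that Zeppelin renders it as HTML."""
    # The output must begin with the %html directive
    return "%%html <div style='width:%dpx'>%s</div>" % (width_px, svg_text)


sample_svg = "<svg xmlns='http://www.w3.org/2000/svg'><circle r='5'/></svg>"
print(wrap_svg(sample_svg))
```

Changing width_px is one way to resize the chart without touching the figure itself.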
Source: oschina
Link: https://my.oschina.net/u/560841/blog/657531