Unable to import SparkContext


Question


I'm working on CentOS. I've set up $SPARK_HOME and added Spark's bin directory to $PATH.

I can run pyspark from anywhere.

But when I create a Python file that uses this statement:

from pyspark import SparkConf, SparkContext

it throws the following error:

python pysparktask.py
    Traceback (most recent call last):
    File "pysparktask.py", line 1, in <module>
      from pyspark import SparkConf, SparkContext
    ModuleNotFoundError: No module named 'pyspark'

I tried to install it again using pip.

pip install pyspark

and it gives this error:

Could not find a version that satisfies the requirement pyspark (from versions: ) No matching distribution found for pyspark

EDIT

Based on the answer below, I updated the code.

The error now is:

Traceback (most recent call last):
  File "pysparktask.py", line 6, in <module>
    from pyspark import SparkConf, SparkContext
  File "/opt/mapr/spark/spark-2.0.1/python/pyspark/__init__.py", line 44, in <module>
    from pyspark.context import SparkContext
  File "/opt/mapr/spark/spark-2.0.1/python/pyspark/context.py", line 33, in <module>
    from pyspark.java_gateway import launch_gateway
  File "/opt/mapr/spark/spark-2.0.1/python/pyspark/java_gateway.py", line 31, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ModuleNotFoundError: No module named 'py4j'

Answer 1:


Set the SPARK_HOME environment variable and append Spark's Python library path to sys.path:

import os
import sys

os.environ['SPARK_HOME'] = "/usr/lib/spark/"  # point SPARK_HOME at your Spark installation
sys.path.append("/usr/lib/spark/python/")     # make the bundled pyspark package importable

from pyspark import SparkConf, SparkContext  # the import should now succeed
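If the import then fails on py4j (as in the EDIT above), the py4j sources bundled with Spark also need to be on sys.path. A minimal sketch, assuming py4j ships as a zip under $SPARK_HOME/python/lib (the exact file name and version suffix depend on your Spark distribution); alternatively, pip install py4j:

import glob
import os
import sys

# Use SPARK_HOME if it is set; the fallback path is only an example.
spark_home = os.environ.get('SPARK_HOME', "/usr/lib/spark/")
sys.path.append(os.path.join(spark_home, "python"))

# Spark bundles py4j as a zip (e.g. py4j-0.10.3-src.zip); add whatever version is present.
for py4j_zip in glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")):
    sys.path.append(py4j_zip)

from pyspark import SparkConf, SparkContext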



Answer 2:


pip install -e /spark-directory/python/.

This editable install should solve your problem. You must also add the following to your ~/.bash_profile:

export SPARK_HOME="/spark-directory"
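A quick way to confirm the setup afterwards (a minimal local-mode check, not part of the original answer): run a short script that builds a SparkContext and prints the Spark version.

from pyspark import SparkConf, SparkContext

# Local-mode smoke test: if this prints the Spark version, pyspark and py4j resolve correctly.
conf = SparkConf().setAppName("import-check").setMaster("local[*]")
sc = SparkContext(conf=conf)
print(sc.version)
sc.stop()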


Source: https://stackoverflow.com/questions/43126547/unable-to-import-sparkcontext
