setting SparkContext for pyspark

悲&欢浪女 2021-02-05 11:17

I am a newbie with Spark and PySpark. I would appreciate it if somebody could explain what exactly the SparkContext parameters do, and how I can set them.

3 answers
  •  太阳男子
    2021-02-05 12:03

    The SparkContext object is the entry point of your driver program. It coordinates the processes across the cluster on which your application runs.

    When you run the PySpark shell, a default SparkContext is created automatically and exposed as the variable sc.

    If you create a standalone application, you need to initialize the SparkContext yourself in your script, like below:

    from pyspark import SparkContext

    sc = SparkContext("local", "My App")
    

    Here the first parameter is the master URL of the cluster ("local" runs Spark locally on one thread) and the second parameter is the name of your app.

    I have written an article that goes through the basics of PySpark and Apache Spark, which you may find useful: https://programmathics.com/big-data/apache-spark/apache-installation-and-building-stand-alone-applications/

    DISCLAIMER: I am the creator of that website.
