setting SparkContext for pyspark


悲&欢浪女 2021-02-05 11:17

I am a newbie with Spark and PySpark. I would appreciate it if somebody could explain what exactly the SparkContext parameter does. And how could I set

3 answers

星月不相逢 (original poster)

2021-02-05 11:46
See here: the SparkContext represents your interface to a running Spark cluster manager. In other words, you will have already defined one or more running environments for Spark (see the installation/initialization docs), detailing the nodes to run on, etc. You start a SparkContext object with a configuration that tells it which environment to use and, for example, the application name. All further interaction, such as loading data, happens through methods of the context object.

For simple examples and testing, you can run the Spark cluster "locally" and skip much of the detail above, e.g.,
```
./bin/pyspark --master local[4]
```
will start an interpreter with a context already set to use four threads on your own CPU.

In a standalone app, to be run with spark-submit:
```
from pyspark import SparkContext
sc = SparkContext("local", "Simple App")
```