So, I'm running this simple program on a 16-core multicore system. I run it by issuing the following:
spark-submit --master local[*] pi.py
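For context, pi.py here is just the usual Monte Carlo pi estimate; a minimal sketch along those lines (the sample count and exact structure are illustrative, not my actual file) would be something like this. Note that it does not hard-code a master, so the local[*] passed to spark-submit is what gets used:

import random
from pyspark import SparkContext

# No master set here, so the value from spark-submit (--master local[*]) applies.
sc = SparkContext(appName="Pi")

n = 1000000  # number of random samples (illustrative value)

def inside(_):
    x, y = random.random(), random.random()
    return 1 if x * x + y * y <= 1 else 0

count = sc.parallelize(range(n)).map(inside).reduce(lambda a, b: a + b)
print("Pi is roughly %f" % (4.0 * count / n))
sc.stop()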
As none of the above really worked for me (maybe because I didn't really understand them), here is my two cents.
I was starting my job with spark-submit program.py, and inside the file I had sc = SparkContext("local", "Test"). I tried to verify the number of cores Spark sees with sc.defaultParallelism, and it turned out to be 1. When I changed the context initialization to sc = SparkContext("local[*]", "Test"), it became 16 (the number of cores on my system) and my program was using all the cores.
I am quite new to Spark, but my understanding is that local by default means a single core, and since the master is set inside the program it overrides the other settings (in my case it certainly overrode the values from the configuration files and environment variables).
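So if you want spark-submit or the configuration files to stay in control, one option (a sketch, assuming you are fine not hard-coding the master in code) is to build the context from a SparkConf without calling setMaster:

from pyspark import SparkConf, SparkContext

# No setMaster() call here: the master comes from spark-submit
# (e.g. --master local[*]) or from spark-defaults.conf instead.
conf = SparkConf().setAppName("Test")
sc = SparkContext(conf=conf)

print(sc.defaultParallelism)
sc.stop()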