Spark: PySpark + Cassandra query performance

你的背包  2021-01-18 19:31

I have set up Spark 2.0 and Cassandra 3.0 on a local machine (8 cores, 16 GB RAM) for testing purposes and edited spark-defaults.conf.
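The original configuration snippet is missing from the post. As a minimal sketch, a spark-defaults.conf for this kind of local Spark-plus-Cassandra test box might look like the following; the memory size, connector version, and Cassandra host are illustrative assumptions, not the asker's actual values:

    # Use as many worker threads as logical cores (local testing)
    spark.master                     local[*]
    # Illustrative driver memory for a 16 GB machine; in local mode the
    # driver JVM does all the work
    spark.driver.memory              4g
    # Cassandra connector settings (hypothetical host and version)
    spark.cassandra.connection.host  127.0.0.1
    spark.jars.packages              com.datastax.spark:spark-cassandra-connector_2.11:2.0.10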

2 Answers
  •  抹茶落季
    2021-01-18 19:58

    I see that this is a very old question, but maybe someone still needs it. When running Spark on a local machine, it is very important to set the master in SparkConf to "local[*]", which according to the documentation runs Spark with as many worker threads as there are logical cores on your machine.

    On my machine this doubled the performance of a count() operation compared to master "local".
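    As a concrete illustration, here is a minimal PySpark sketch of that setting; the app name, keyspace, and table name are hypothetical, and the Cassandra read assumes the spark-cassandra-connector package is on the classpath:

        from pyspark import SparkConf
        from pyspark.sql import SparkSession

        # "local[*]" starts one worker thread per logical core;
        # plain "local" would run everything on a single thread.
        conf = SparkConf().setMaster("local[*]").setAppName("cassandra-count")
        spark = SparkSession.builder.config(conf=conf).getOrCreate()

        # Read via the Cassandra connector (keyspace/table are hypothetical)
        df = (spark.read
              .format("org.apache.spark.sql.cassandra")
              .options(keyspace="test_ks", table="test_table")
              .load())

        # The count now fans out across all local worker threads
        print(df.count())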
