As titled, how do I find out which version of Spark is installed on CentOS?
The system currently has CDH 5.1.0 (cdh5.1.0) installed.
If you want to print the version programmatically, use:
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local").getOrCreate()
print(spark.sparkContext.version)
From within the Scala shell (spark-shell):
scala> spark.version
res9: String = 2.4.4
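If opening a shell is inconvenient, the same number can usually be read from the banner that `spark-submit --version` prints. The sketch below is only an illustration: the helper names `parse_spark_version` and `spark_submit_version` are made up for this answer, and it assumes `spark-submit` is on the PATH and that the banner contains the word "version" followed by the Spark version before any other version string (such as the Scala one).

```python
import re
import subprocess


def parse_spark_version(banner):
    """Return the first 'version X.Y.Z' found in the banner text, or None."""
    # Assumption: the Spark version is the first "version <number>" in the
    # output; the Scala version is printed on a later line.
    match = re.search(r"version\s+(\d+(?:\.\d+)+\S*)", banner)
    return match.group(1) if match else None


def spark_submit_version():
    """Run `spark-submit --version` and parse its banner (hypothetical helper)."""
    try:
        result = subprocess.run(
            ["spark-submit", "--version"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # spark-submit is not on the PATH.
        return None
    # The banner may be written to stderr rather than stdout, so check both.
    return parse_spark_version(result.stdout + result.stderr)
```

Calling `spark_submit_version()` returns a string such as the one shown by the shell above, or `None` when the command is missing or the banner does not match the assumed layout.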
If you want to check it programmatically from a Python script, you can use this script.py:
from pyspark import SparkConf, SparkContext
# Create a context from the default configuration, then print its Spark version.
sc_conf = SparkConf()
sc = SparkContext(conf=sc_conf)
print(sc.version)
Run it with python script.py or python3 script.py.
The above script also works in the Python shell.
Using print(sc.version) on its own in a Python script won't work, because sc is only predefined in the interactive pyspark shell. If you run it directly, you will get this error: NameError: name 'sc' is not defined.
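If you only need the version of the PySpark package and don't want to start a context at all, a lighter check is possible. This is a minimal sketch: `installed_pyspark_version` is a name invented here, and it assumes the installed pyspark exposes a `__version__` attribute (very old releases may not, in which case it returns None). Note that this reports the Python package version, which can differ from the Spark version running on the cluster.

```python
def installed_pyspark_version():
    """Return the pyspark package version string, or None if unavailable."""
    try:
        import pyspark
    except ImportError:
        # pyspark is not importable from this Python interpreter.
        return None
    # Assumption: the package defines __version__; fall back to None if not.
    return getattr(pyspark, "__version__", None)


print(installed_pyspark_version())
```

Because no SparkContext is created, this avoids the NameError above and doesn't need a JVM to be available.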