from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("Test").set("spark.driver.memory", "1g")
sc = SparkContext(conf=conf)   # build the context from the config above
sqlContext = SQLContext(sc)    # SQL entry point used by sqlContext.sql() below
SQLContext.sql expects a valid SQL query, not a path to a file. Read the file first and pass its contents in:
with open("/home/ubuntu/workload/queryXX.sql") as fr:
    query = fr.read()

results = sqlContext.sql(query)
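If you are on Spark 2.x or later, the same approach works with SparkSession instead of SQLContext; here is a minimal sketch (the query path is the same placeholder as above):

from pyspark.sql import SparkSession

# SparkSession is the Spark 2.x+ entry point that replaces SQLContext
spark = SparkSession.builder.appName("Test").getOrCreate()

with open("/home/ubuntu/workload/queryXX.sql") as fr:
    query = fr.read()

results = spark.sql(query)
results.show()  # print a sample of the result to the console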
I'm not sure whether this answers your question, but if you intend to run a query against an existing table you can use:

spark-sql -i <filename-with-absolute-path>.sql

One more thing: if you have a PySpark script, you can run it with spark-submit (details here).
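A minimal spark-submit invocation might look like this (run_query.py is a hypothetical script name):

# Run a PySpark script on a local Spark instance
spark-submit --master local[*] run_query.py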
Running spark-sql --help will give you:
CLI options:
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)
So you can execute your SQL script like this:
spark-sql -f <your-script>.sql
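The other flags combine the same way; for example, -e runs an inline query and --hivevar substitutes a variable into your script (the database, table, and variable names below are made up):

# Run an inline query against a specific database
spark-sql --database mydb -e "SELECT COUNT(*) FROM events"

# Pass a variable into the script; refer to it as ${hivevar:day} inside the .sql file
spark-sql --hivevar day=2016-01-01 -f <your-script>.sql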