I have followed "Use the BigQuery connector with Spark" and successfully read data from a publicly available dataset. I now need to access a BigQuery dataset that is owned by on
The issue seems to be here:
Warning: Ignoring non-spark config property: mapred.bq.auth.service.account.json.keyfile=/tmp/keyfile.json
To fix this, set Hadoop properties with the spark.hadoop prefix: Spark ignores --properties keys that do not start with spark., but it strips a spark.hadoop. prefix and copies the rest into the Hadoop configuration, which is where the BigQuery connector reads its mapred.bq.* keys:
gcloud dataproc jobs submit pyspark ./bq_pyspark.py \
--cluster $CLUSTER --region $REGION \
--properties=spark.hadoop.mapred.bq.auth.service.account.json.keyfile=/tmp/keyfile.json
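For reference, here is a minimal sketch of what bq_pyspark.py could look like, following the newAPIHadoopRDD pattern from the tutorial you linked. The project/dataset/table IDs are placeholders for the dataset you need to reach, and the keyfile property is shown being set in-script as an alternative to the --properties flag. Note that --properties only sets the configuration value; /tmp/keyfile.json must actually exist at that path on the cluster nodes.

# Minimal sketch of bq_pyspark.py; YOUR_PROJECT / your_dataset / your_table
# are placeholders, and /tmp/keyfile.json must be present on the cluster.
import pyspark

sc = pyspark.SparkContext()

bucket = sc._jsc.hadoopConfiguration().get('fs.gs.system.bucket')
project = sc._jsc.hadoopConfiguration().get('fs.gs.project.id')

conf = {
    # Service-account auth; equivalent to passing
    # spark.hadoop.mapred.bq.auth.service.account.json.keyfile via --properties.
    'mapred.bq.auth.service.account.enable': 'true',
    'mapred.bq.auth.service.account.json.keyfile': '/tmp/keyfile.json',
    # GCS staging location the connector uses for BigQuery exports.
    'mapred.bq.project.id': project,
    'mapred.bq.gcs.bucket': bucket,
    'mapred.bq.temp.gcs.path': 'gs://{}/hadoop/tmp/bigquery/pyspark_input'.format(bucket),
    # The table to read (placeholders for the externally owned dataset).
    'mapred.bq.input.project.id': 'YOUR_PROJECT',
    'mapred.bq.input.dataset.id': 'your_dataset',
    'mapred.bq.input.table.id': 'your_table',
}

# Each record arrives as a (row number, JSON string of the row) pair.
table_data = sc.newAPIHadoopRDD(
    'com.google.cloud.hadoop.io.bigquery.JsonTextBigQueryInputFormat',
    'org.apache.hadoop.io.LongWritable',
    'com.google.gson.JsonObject',
    conf=conf)

print(table_data.take(5))

Because the spark.hadoop.* values land in the same Hadoop configuration, you can omit the two auth entries from conf above when you submit with the --properties flag shown earlier; setting them in both places is harmless but redundant.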