I am using Windows locally and trying to load an XML
file with the following Python code, but I am getting this error. Does anyone know how to resolve it?
I'd made a similar but slightly different mistake: I forgot the "s3://" prefix on the file path. After adding this prefix to form "s3://path/to/object", the following code works:
my_data = spark.read.format("com.databricks.spark.csv")\
.option("header", "true")\
.option("inferSchema", "true")\
.option("delimiter", ",")\
.load("s3://path/to/object")
The error message says it all: you cannot use the DataFrame reader and load to access files on the web (http or https). I suggest you first download the file locally.
See the pyspark.sql.DataFrameReader docs for more on the available sources (in general, local file system, HDFS, and databases via JDBC).
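A minimal sketch of the download step, assuming urllib and placeholder names (the URL and local path below are not from the original question):

import urllib.request

url = "https://example.com/data.xml"    # placeholder URL, not from the original question
local_path = "C:/tmp/data.xml"          # placeholder local path on Windows
# Fetch the remote file to local disk so Spark can read it from the file system.
urllib.request.urlretrieve(url, local_path)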
Unrelated to the error, notice that you seem to be using the format
part of the command incorrectly: assuming that you use the XML Data Source for Apache Spark package (spark-xml), the correct usage should be format('com.databricks.spark.xml')
(see the example).
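Putting the two points together, a rough sketch of reading the downloaded file with the spark-xml package (the rowTag value and the local path are placeholders, not taken from the question) would look like:

my_data = spark.read.format("com.databricks.spark.xml")\
    .option("rowTag", "record")\
    .load("C:/tmp/data.xml")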
Somehow PySpark is unable to load from http or https directly. One of my colleagues found the answer, so here is the solution.
Before creating the Spark context and SQL context, we need to run these two lines of code:
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.databricks:spark-xml_2.11:0.4.1 pyspark-shell'
After creating the SparkContext and SQLContext with sc = pyspark.SparkContext.getOrCreate()
and sqlContext = SQLContext(sc),
add the http or https URL to the context with sc.addFile(url), as in the sketch below.
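A minimal sketch of that step, with a placeholder URL (the real URL is not shown in the original post):

import pyspark
from pyspark.sql import SQLContext

sc = pyspark.SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

url = "https://example.com/somefile_public.xml"   # placeholder URL, not in the original post
# addFile fetches the remote file onto the cluster; SparkFiles.get() later resolves its local path.
sc.addFile(url)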
Data_XMLFile = sqlContext.read.format("xml")\
    .options(rowTag="anytaghere")\
    .load(pyspark.SparkFiles.get("*_public.xml"))\
    .coalesce(10)\
    .cache()
This solution worked for me.