spark-shell error: No FileSystem for scheme: wasb

無奈伤痛 2021-01-02 23:16

We have an HDInsight cluster running in Azure, but it doesn't allow spinning up an edge/gateway node at the time of cluster creation. So I was creating this edge/gateway node by i

2 Answers
  • 2021-01-03 00:01

    Hai Ning from Microsoft has written an excellent blog post on how to set up WASB on an Apache Hadoop installation.

    Here is the summary:

    1. Add hadoop-azure-*.jar and azure-storage-*.jar to the Hadoop classpath

      1.1 Find the jars in your local installation. On an HDInsight cluster they are in the /usr/hdp/current/hadoop-client folder.

      1.2 Update the HADOOP_CLASSPATH variable in hadoop-env.sh. Use the exact jar names, since the Java classpath doesn't support partial wildcards.
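      For example, hadoop-env.sh could end up with a line like the following (a sketch only; the version numbers and the lib/ path below are placeholders, use the exact file names and locations you found in step 1.1):

      
      export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/hdp/current/hadoop-client/hadoop-azure-2.7.3.jar:/usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar
      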

    2. Update core-site.xml

      <property>
              <name>fs.AbstractFileSystem.wasb.impl</name>
              <value>org.apache.hadoop.fs.azure.Wasb</value>
      </property>

      <property>
              <name>fs.azure.account.key.my_blob_account_name.blob.core.windows.net</name>
              <value>my_blob_account_key</value>
      </property>

      <!-- optionally set the default file system to a container -->
      <property>
              <name>fs.defaultFS</name>
              <value>wasb://my_container_name@my_blob_account_name.blob.core.windows.net</value>
      </property>
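
      Once the jars are on the classpath and core-site.xml is updated, a quick way to verify that the wasb scheme resolves (using the same placeholder container and account names as in the XML above) is:

      
      $ hadoop fs -ls wasb://my_container_name@my_blob_account_name.blob.core.windows.net/
      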
      

    See exact steps here: https://github.com/hning86/articles/blob/master/hadoopAndWasb.md

  • 2021-01-03 00:18

    Another way to set up Azure Storage (wasb and wasbs schemes) in spark-shell:

    1. Copy the azure-storage and hadoop-azure jars into the ./jars directory of the Spark installation.
    2. Run spark-shell with the --jars parameter (a comma-separated list of paths to those jars). Example:

      
      $ bin/spark-shell --master "local[*]" --jars jars/hadoop-azure-2.7.0.jar,jars/azure-storage-2.0.0.jar
      
    3. Add the following settings to the Spark context's Hadoop configuration:

      
      sc.hadoopConfiguration.set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
      sc.hadoopConfiguration.set("fs.azure.account.key.my_account.blob.core.windows.net", "my_key")
      
    4. Run a simple query:

      
      sc.textFile("wasb://my_container@my_account_host/myfile.txt").count()
      
    5. Enjoy :)

    With these settings you can easily set up a Spark application, passing the same parameters to the hadoopConfiguration of the current Spark context, as sketched below.
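
    For example, here is a minimal sketch of a standalone Spark application doing the same thing (assuming Spark 2.x or later, where SparkSession is available; the account, container, key, and file names are placeholders):

      
      import org.apache.spark.sql.SparkSession
      
      object WasbExample {
        def main(args: Array[String]): Unit = {
          // Keys prefixed with "spark.hadoop." are copied into the Hadoop
          // configuration of the SparkContext at startup.
          val spark = SparkSession.builder()
            .appName("WasbExample")
            .master("local[*]")
            .config("spark.hadoop.fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
            .config("spark.hadoop.fs.azure.account.key.my_account.blob.core.windows.net", "my_key")
            .getOrCreate()
      
          // Same query as in step 4, now from a compiled application.
          val count = spark.sparkContext
            .textFile("wasb://my_container@my_account.blob.core.windows.net/myfile.txt")
            .count()
          println(s"Line count: $count")
      
          spark.stop()
        }
      }
      
    Remember to submit it with the hadoop-azure and azure-storage jars on the classpath, e.g. via spark-submit --jars, just like in step 2.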
