From my machine, I've configured the Hadoop core-site.xml to recognize the gs:// scheme and added gcs-connector-1.2.8.jar as a Hadoop lib. I can run <
As to your first question, whether that behavior is "expected" is debatable, but I think I can at least explain it. FileSystem.get(conf) returns the default FileSystem (the one configured as fs.defaultFS), which on a Hadoop cluster is normally HDFS. My guess is that the HDFS client (DistributedFileSystem) has code that automatically prepends the scheme and authority to the paths of all files in that filesystem.
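To make "prepend scheme + authority" concrete, here is a minimal sketch using only java.net.URI. It is not the HDFS client's code, just an illustration of the idea; the class name, method name, and the namenode:8020 authority are all hypothetical.

```java
import java.net.URI;

// Illustration only: NOT DistributedFileSystem code. All names are hypothetical.
class QualifySketch {
    // Qualify a path against a default filesystem URI. A path with no scheme
    // inherits the default's scheme and authority; a path that already carries
    // a scheme (e.g. gs://) is returned unchanged.
    static String qualify(URI defaultUri, String path) {
        return defaultUri.resolve(path).toString();
    }

    public static void main(String[] args) {
        URI defaultUri = URI.create("hdfs://namenode:8020/"); // assumed default FS
        System.out.println(qualify(defaultUri, "/user/data"));
        System.out.println(qualify(defaultUri, "gs://mybucket/file.txt"));
    }
}
```

Under these assumptions, a bare path such as /user/data comes back fully qualified with the default scheme and authority, while a gs:// path keeps its own.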
Instead of using FileSystem.get(conf), try:
FileSystem gcsFs = new Path("gs://mybucket/").getFileSystem(conf);
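The difference between the two lookups can be sketched with a toy model: asking for the default filesystem consults only the default URI, while asking via a path uses that path's scheme. This is not Hadoop's implementation; the class, the methods, and the scheme-to-name map are hypothetical stand-ins for what core-site.xml configures.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Hadoop's implementation. All names are hypothetical.
class SchemeLookupSketch {
    // Stand-in for the scheme -> implementation mapping from core-site.xml.
    static final Map<String, String> IMPL_BY_SCHEME = new HashMap<>();
    static {
        IMPL_BY_SCHEME.put("hdfs", "DistributedFileSystem");
        IMPL_BY_SCHEME.put("gs", "GoogleHadoopFileSystem");
    }

    // Models a default lookup: only the default URI's scheme is consulted.
    static String defaultFileSystem(URI defaultUri) {
        return IMPL_BY_SCHEME.get(defaultUri.getScheme());
    }

    // Models a per-path lookup: the path's own scheme wins, and the default
    // is used only when the path has no scheme at all.
    static String fileSystemForPath(String path, URI defaultUri) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? defaultFileSystem(defaultUri) : IMPL_BY_SCHEME.get(scheme);
    }

    public static void main(String[] args) {
        URI defaultUri = URI.create("hdfs://namenode:8020/"); // assumed default FS
        System.out.println(defaultFileSystem(defaultUri));
        System.out.println(fileSystemForPath("gs://mybucket/", defaultUri));
        System.out.println(fileSystemForPath("/user/data", defaultUri));
    }
}
```

In this model the default lookup never sees the gs:// scheme, which is why going through the path is the safer way to get the GCS filesystem.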
As for disadvantages, I could argue that if you end up needing to access the object store directly, you'll be writing code against the storage APIs anyway. Some things also do not translate well to the Hadoop FS API, e.g., object composition and complex object write preconditions beyond simple overwrite protection.
I am admittedly biased (I work on the team), but if you intend to use GCS from Hadoop MapReduce, Spark, etc., the GCS connector for Hadoop should be a fairly safe bet.