Question
I'm trying to run a Google Dataflow application, but it throws this exception:
java.lang.IllegalArgumentException: No filesystem found for scheme gs
    at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:459)
    at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:529)
    at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:213)
    at org.apache.beam.sdk.io.TextIO$TypedWrite.to(TextIO.java:700)
    at org.apache.beam.sdk.io.TextIO$Write.to(TextIO.java:1028)
    at br.com.sulamerica.mecsas.ExportacaoDadosPipeline.main(ExportacaoDadosPipeline.java:52)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
    at java.lang.Thread.run(Thread.java:748)
This is an excerpt of my pipeline code:
Pipeline.create()
    .apply(PubsubIO.readStrings().fromSubscription(subscription))
    .apply(new KeyExportacaoDadosToEntityTransform())
    .apply(new ListKeyEmpresaSelecionadasTransform())
    .apply(ParDo.of(new DoFn<List<Entity>, String>() {
        @ProcessElement
        public void processElement(ProcessContext c) {
            c.output(
                c.element().stream()
                    .map(e -> e.getString("dscRazaoSocial"))
                    .collect(Collectors.joining("\r\n"))
            );
        }
    }))
    .apply(TextIO.write().to("gs://<my bucket>"))
    .getPipeline()
    .run();
And this is the command I use to run the pipeline:
mvn -Pdataflow-runner compile exec:java \
    -Dexec.mainClass=br.com.xpto.foo.ExportacaoDadosPipeline \
    -Dexec.args="--project=<projectID> \
    --stagingLocation=gs://dataflow-xpto/exportacao/staging \
    --output=gs://dataflow-xpto/exportacao/output \
    --runner=DataflowRunner"
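Answer 1:
I was grappling with the same issue. If you are using Maven to build the executable jar, the shade plugin needs the ServicesResourceTransformer so that the META-INF/services files of all dependencies are merged instead of overwriting each other. Your shade plugin configuration should look like this: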
<configuration>
  <transformers>
    <!-- merge META-INF/services files from all dependencies -->
    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    <!-- add Main-Class to manifest file -->
    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
      <mainClass>com.main.Application</mainClass>
    </transformer>
  </transformers>
</configuration>
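To check whether a bundled jar still carries the service entries, something like the following sketch may help. It relies on the fact that Beam looks up file systems through java.util.ServiceLoader, using the service interface org.apache.beam.sdk.io.FileSystemRegistrar. The class ServiceFileCheck and its providers method are hypothetical names made up for this illustration; they are not part of Beam. The code uses only the JDK and simply prints every provider class listed in the matching META-INF/services files on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical diagnostic helper (not part of Beam). */
public class ServiceFileCheck {

    /** Returns every provider class named in META-INF/services/<serviceName> on the classpath. */
    public static List<String> providers(String serviceName) {
        List<String> result = new ArrayList<>();
        try {
            ClassLoader loader = ServiceFileCheck.class.getClassLoader();
            Enumeration<URL> files = loader.getResources("META-INF/services/" + serviceName);
            while (files.hasMoreElements()) {
                URL file = files.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files allow '#' comments; strip them before trimming.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            result.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Default to Beam's file system service interface; another name can be passed as an argument.
        String service = args.length > 0 ? args[0] : "org.apache.beam.sdk.io.FileSystemRegistrar";
        for (String provider : providers(service)) {
            System.out.println(provider);
        }
    }
}
```

Run it with the bundled jar on the classpath. If the output lists only a local file system registrar and nothing related to GCS, the service files were most likely overwritten during packaging rather than merged.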
Answer 2:
I recently ran into this issue while working on an Apache Beam Java pipeline built with Gradle.
Applying the Gradle Shadow plugin, 'com.github.johnrengelman.shadow', resolved it.
Here is my build.gradle file for future reference:
buildscript {
    repositories {
        maven {
            url "https://plugins.gradle.org/m2/"
        }
        jcenter()
    }
    dependencies {
        classpath 'com.github.jengelman.gradle.plugins:shadow:5.1.0'
    }
}
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '5.1.0'
}
sourceCompatibility = 1.8
apply plugin: 'java'
apply plugin: 'com.github.johnrengelman.shadow'
repositories {
    mavenLocal()
    mavenCentral()
    jcenter()
    ivy {
        url 'http://dl.bintray.com/content/johnrengelman/gradle-plugins'
    }
}
dependencies {
    // your dependencies here
}
jar {
    manifest {
        attributes "Main-Class": "your_main_class_with_package"
    }
    from {
        configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
    }
}
shadowJar {
    // Not in the original build file: merges META-INF/services entries
    // from all dependencies so the registrar for gs:// is not overwritten.
    mergeServiceFiles()
}
You should then see the shadowJar task under the shadow group in IntelliJ's Gradle tool window. Enjoy!
Source: https://stackoverflow.com/questions/53761142/google-dataflow-no-filesystem-found-for-scheme-gs