Google Dataflow “No filesystem found for scheme gs”

走远了吗. Submitted on 2020-02-04 05:54:05

Question


I'm trying to run a Google Dataflow application, but it throws this exception:

java.lang.IllegalArgumentException: No filesystem found for scheme gs
    at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:459)
    at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:529)
    at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:213)
    at org.apache.beam.sdk.io.TextIO$TypedWrite.to(TextIO.java:700)
    at org.apache.beam.sdk.io.TextIO$Write.to(TextIO.java:1028)
    at br.com.sulamerica.mecsas.ExportacaoDadosPipeline.main(ExportacaoDadosPipeline.java:52)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
    at java.lang.Thread.run(Thread.java:748)

This is an excerpt of my pipeline code:

Pipeline.create()
        .apply(PubsubIO.readStrings().fromSubscription(subscription))
        .apply(new KeyExportacaoDadosToEntityTransform())
        .apply(new ListKeyEmpresaSelecionadasTransform())
        .apply(ParDo.of(new DoFn<List<Entity>, String>() {
            @ProcessElement
            public void processElement(ProcessContext c){
                c.output(
                    c.element().stream()
                        .map(e-> e.getString("dscRazaoSocial"))
                        .collect(Collectors.joining("\r\n"))
                );
            }
        }))
        .apply(TextIO.write().to("gs://<my bucket>"))
        .getPipeline()
    .run();

And this is the command I use to run the pipeline:

mvn -Pdataflow-runner compile exec:java \
  -Dexec.mainClass=br.com.xpto.foo.ExportacaoDadosPipeline \
  -Dexec.args="--project=<projectID> \
  --stagingLocation=gs://dataflow-xpto/exportacao/staging \
  --output=gs://dataflow-xpto/exportacao/output \
  --runner=DataflowRunner"  

Answer 1:


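I was grappling with the same issue. Beam discovers its file systems, including the one for gs, through META-INF/services files, and when several dependencies ship a service file with the same name, only one copy ends up in the shaded jar unless they are merged. So if you are using Maven to build the executable jar, your shade plugin configuration should include the ServicesResourceTransformer, like this: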

<configuration>
    <transformers>
        <!-- merge META-INF/services files from all dependencies instead of keeping only one -->
        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        <!-- add Main-Class to manifest file -->
        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.main.Application</mainClass>
        </transformer>
    </transformers>
</configuration>
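
To illustrate why the merge matters, here is a toy model in plain Java. It is only a sketch, not the shade plugin's actual implementation: each dependency jar is reduced to a map from entry path to provider lines, and the class name ServiceFileMergeSketch and its helper methods are made up for this example. The two registrar class names are used for illustration and may differ between Beam versions. Which copy survives a naive build depends on the tool and the jar order; in this model the later one wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ServiceFileMergeSketch {

    // Service file that java.util.ServiceLoader reads to find Beam's FileSystemRegistrar providers.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar";

    /** A dependency jar reduced to one entry: path -> provider class names listed in it. */
    static Map<String, List<String>> jar(String path, String... providers) {
        Map<String, List<String>> entries = new LinkedHashMap<>();
        entries.put(path, Arrays.asList(providers));
        return entries;
    }

    /** Naive fat jar: an entry with the same path replaces the one copied earlier. */
    static Map<String, List<String>> overwrite(List<Map<String, List<String>>> jars) {
        Map<String, List<String>> fat = new LinkedHashMap<>();
        for (Map<String, List<String>> j : jars) {
            fat.putAll(j);
        }
        return fat;
    }

    /** Merging fat jar: provider lines of same-named service files are combined. */
    static Map<String, List<String>> merge(List<Map<String, List<String>>> jars) {
        Map<String, Set<String>> combined = new LinkedHashMap<>();
        for (Map<String, List<String>> j : jars) {
            j.forEach((path, providers) ->
                    combined.computeIfAbsent(path, k -> new LinkedHashSet<>()).addAll(providers));
        }
        Map<String, List<String>> fat = new LinkedHashMap<>();
        combined.forEach((path, providers) -> fat.put(path, new ArrayList<>(providers)));
        return fat;
    }

    public static void main(String[] args) {
        // Illustrative provider names; check your own dependencies for the real entries.
        List<Map<String, List<String>>> jars = Arrays.asList(
                jar(SERVICE_FILE, "org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar"),
                jar(SERVICE_FILE, "org.apache.beam.sdk.io.LocalFileSystemRegistrar"));

        // In the naive jar the gs registrar is gone, so ServiceLoader can never find it.
        System.out.println("overwrite: " + overwrite(jars).get(SERVICE_FILE));
        System.out.println("merge:     " + merge(jars).get(SERVICE_FILE));
    }
}
```

In the overwrite case the gs registrar line is lost, and that is the situation in which Beam reports that it has no filesystem for the scheme.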



Answer 2:


I recently ran into this issue while working on an Apache Beam Java pipeline built with Gradle.

Applying the Gradle Shadow plugin 'com.github.johnrengelman.shadow' resolved it.

Here is my build.gradle file for future reference:

buildscript {
    repositories {
        maven {
           url "https://plugins.gradle.org/m2/"
        }
        jcenter()
    }
    dependencies {
        classpath 'com.github.jengelman.gradle.plugins:shadow:5.1.0'
    }
}


plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '5.1.0'
}


sourceCompatibility = 1.8


apply plugin: 'java'
apply plugin: 'com.github.johnrengelman.shadow'

repositories {
    mavenLocal()
    mavenCentral()
    jcenter()
    ivy {
        url 'http://dl.bintray.com/content/johnrengelman/gradle-plugins'
    }
}

dependencies {
// your dependencies here
}

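jar {
    manifest {
        attributes "Main-Class": "your_main_class_with_package"
    }

    from {
        configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
    }
}

shadowJar {
    // merge META-INF/services files so the gs FileSystemRegistrar entry is kept
    mergeServiceFiles()
}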

You should now see the shadowJar task under the shadow group in IntelliJ's build view. Enjoy!
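
After building, one way to check whether the fat jar still carries the gs registrar is to read the service file straight out of the jar. The sketch below uses only the JDK. The class name ShadedJarServiceCheck is hypothetical, and matching on the substring "GcsFileSystemRegistrar" is an assumption about how the provider class is named in your Beam version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

class ShadedJarServiceCheck {

    // Service file that lists the FileSystemRegistrar providers bundled in the jar.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar";

    /** Returns the provider class names listed in the jar's service file, or an empty list. */
    static List<String> listProviders(String jarPath) throws IOException {
        List<String> providers = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(SERVICE_FILE);
            if (entry == null) {
                return providers; // the jar has no such service file at all
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Service files allow '#' comments and blank lines; skip both.
                    int hash = line.indexOf('#');
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        providers.add(line);
                    }
                }
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ShadedJarServiceCheck.java <path-to-jar>");
            return;
        }
        List<String> providers = listProviders(args[0]);
        providers.forEach(System.out::println);
        // Assumption: the gs provider's class name contains "GcsFileSystemRegistrar".
        boolean hasGcs = providers.stream().anyMatch(p -> p.contains("GcsFileSystemRegistrar"));
        System.out.println(hasGcs ? "gs registrar present" : "gs registrar MISSING");
    }
}
```

Run it against the jar produced by your build. If the gs registrar is reported missing, the service files were most likely not merged during packaging.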



Source: https://stackoverflow.com/questions/53761142/google-dataflow-no-filesystem-found-for-scheme-gs
