I have been using the Beam pipeline examples as a guide while trying to load files from S3 in my pipeline. As in the examples, I have defined my own PipelineOptions
that also extends S3Options, and I am attempting to use the DefaultAWSCredentialsProviderChain. The code to configure this is:
MyPipelineOptions options = PipelineOptionsFactory.fromArgs(args).as(MyPipelineOptions.class);
options.setAwsCredentialsProvider(new DefaultAWSCredentialsProviderChain());
When I run it from IntelliJ it works fine using the Direct Runner, but when I package it as a jar and execute it (also using the Direct Runner) I see:
Exception in thread "main" java.lang.IllegalArgumentException: PipelineOptions specified failed to serialize to JSON.
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:166)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
at a.b.c.beam.CleanSkeleton.runPipeline(CleanSkeleton.java:69)
at a.b.c.beam.CleanSkeleton.main(CleanSkeleton.java:53)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unexpected IOException (of type java.io.IOException): Failed to serialize and deserialize property 'awsCredentialsProvider' with value 'com.amazonaws.auth.DefaultAWSCredentialsProviderChain@40f33492'
at com.fasterxml.jackson.databind.JsonMappingException.fromUnexpectedIOE(JsonMappingException.java:338)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsBytes(ObjectMapper.java:3247)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:163)
... 5 more
I am using Gradle to build my jar with the following task:
jar {
    manifest {
        attributes (
            'Main-Class': 'a.b.c.beam.CleanSkeleton'
        )
    }
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    }
    from('src') {
        include '/main/resources/*'
    }
    zip64 true
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
The problem occurred because, when the fat/uber jar was being created, files in META-INF/services were being overwritten by duplicate files of the same name. Specifically, the file com.fasterxml.jackson.databind.Module needed to list a number of Jackson modules, but some of them were missing. These include org.apache.beam.sdk.io.aws.options.AwsModule and com.fasterxml.jackson.datatype.joda.JodaModule. The DirectRunner builds its ObjectMapper starting from new ObjectMapper(), and the discovery of the modules to register relies on java.util.ServiceLoader, which locates services from META-INF/services/.
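One way to see which Jackson modules are actually declared is to read the service files directly. The following is a minimal sketch using only the JDK; the class name ServiceFileCheck and its methods are made up for illustration and are not part of Beam or Jackson. It lists every provider named in any copy of META-INF/services/com.fasterxml.jackson.databind.Module that the class loader can see.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic class; not part of Beam or Jackson.
final class ServiceFileCheck {

    // The service file consulted when Jackson modules are discovered via ServiceLoader.
    static final String JACKSON_MODULES =
            "META-INF/services/com.fasterxml.jackson.databind.Module";

    // Returns the provider class names declared in every copy of the given
    // resource that is visible to the class loader.
    static List<String> declaredProviders(ClassLoader loader, String resource) {
        List<String> providers = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String name = stripComment(line);
                        if (!name.isEmpty()) {
                            providers.add(name);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    // Service files may contain '#' comments and blank lines.
    static String stripComment(String line) {
        int hash = line.indexOf('#');
        return (hash >= 0 ? line.substring(0, hash) : line).trim();
    }

    public static void main(String[] args) {
        ClassLoader loader = ServiceFileCheck.class.getClassLoader();
        List<String> providers = declaredProviders(loader, JACKSON_MODULES);
        if (providers.isEmpty()) {
            System.out.println("No Jackson modules declared on the classpath");
        }
        providers.forEach(System.out::println);
    }
}
```

Run it with the same classpath as the pipeline. If org.apache.beam.sdk.io.aws.options.AwsModule shows up when run from the IDE but not when the fat jar is the only classpath entry, that would be consistent with the serialization failure above.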
The solution was to use the Gradle Shadow plugin to build the fat/uber jar and configure it to merge the service files:
apply plugin: 'com.github.johnrengelman.shadow'
shadowJar {
    mergeServiceFiles()
    zip64 true
}
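To confirm that the rebuilt jar really contains the merged service file, the jar can be inspected directly. This is only a sketch under assumptions: FatJarServiceCheck is a hypothetical helper (not part of Gradle or the Shadow plugin), the jar path is passed as the first argument because its location depends on your project, and the two expected module names are the ones mentioned above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for inspecting a built jar; not part of Gradle or Shadow.
final class FatJarServiceCheck {

    // Returns the expected provider names that the jar's service file does not declare.
    static List<String> missingProviders(String jarPath, String serviceName, List<String> expected) {
        Set<String> declared = new HashSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry("META-INF/services/" + serviceName);
            if (entry != null) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Drop '#' comments and surrounding whitespace.
                        int hash = line.indexOf('#');
                        String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
                        if (!name.isEmpty()) {
                            declared.add(name);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>(expected);
        missing.removeAll(declared);
        return missing;
    }

    public static void main(String[] args) {
        // args[0]: path to the jar produced by the build.
        List<String> missing = missingProviders(
                args[0],
                "com.fasterxml.jackson.databind.Module",
                List.of(
                        "org.apache.beam.sdk.io.aws.options.AwsModule",
                        "com.fasterxml.jackson.datatype.joda.JodaModule"));
        System.out.println(missing.isEmpty()
                ? "All expected modules declared"
                : "Missing: " + missing);
    }
}
```

With the plain jar task from the question I would expect this to report at least one missing module, and with the shadowJar output none, but that depends on which dependency's service file happened to win in the original build.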