Why do I need to set the `transformation_ctx` parameter when calling transformation and sink operations for AWS Glue bookmark to work?

终归单人心 2020-12-11 21:54

The AWS Glue Bookmark document (https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html) seems to suggest one has to pass a transformation_ctx par

1 Answer
  • 2020-12-11 22:28

    I had the same doubts about bookmarking some months ago (October 2019), and since the documentation provided by Amazon is not very clear, I opened a support case to better understand how it is implemented.

    In my Glue Job there was:

    • A read function from S3 (glue_context.create_dynamic_frame.from_options)
    • A ResolveChoice.apply
    • A write function to Redshift (glue_context.write_dynamic_frame.from_jdbc_conf)

    All of these operations had a transformation_ctx value. I tested different possible behaviours (the same transformation_ctx for all of them, different ones, fixed values, dynamic values, etc.).

    After many messages with AWS support, they confirmed that bookmarking works only on the read function (they also said it works only with S3 as a source, but I didn't test that). So I asked whether the transformation_ctx is useless in ResolveChoice (and in the write function too), and they said YES! They confirmed that it doesn't make any difference.

    Furthermore, for the write function it doesn't change anything: there is no bookmark logic, and nothing that skips the write if it has already been run before.
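
To make that concrete, here is a rough sketch of a pipeline shaped like mine, with transformation_ctx set only on the read. It reflects what support told me, not something I verified against the Glue source. The function name run_pipeline, the context string "read_s3_source", the JSON format, and all paths, connection and table names are made up for illustration. The function takes the GlueContext as an argument so it does not create one itself.

```python
def run_pipeline(glue_context, source_path, redshift_connection,
                 database, target_table, temp_dir):
    """Read from S3, resolve ambiguous column types, write to Redshift.

    In a real job this would sit between job.init(args["JOB_NAME"], args)
    and job.commit(), with bookmarks enabled on the job
    (--job-bookmark-option job-bookmark-enable).
    """
    # Read step: per the support case described above, this is the only
    # place where transformation_ctx influences bookmarking.
    # "read_s3_source" is an arbitrary, illustrative name.
    source = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": [source_path]},
        format="json",  # assumed input format
        transformation_ctx="read_s3_source",
    )

    # Transform step: transformation_ctx left out, since it reportedly
    # makes no difference here.
    resolved = source.resolveChoice(choice="make_cols")

    # Write step: no transformation_ctx either; there is reportedly no
    # bookmark logic on the sink side.
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=resolved,
        catalog_connection=redshift_connection,
        connection_options={"dbtable": target_table, "database": database},
        redshift_tmp_dir=temp_dir,
    )
    return resolved
```

Inside a Glue job you would call it with something like run_pipeline(glue_context, "s3://example-bucket/input/", "my-redshift-connection", "dev", "public.events", "s3://example-bucket/tmp/"), where every value is a placeholder.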
