Write each row received over PubSub to its own file on Cloud Storage

灰色年华 2020-12-22 03:00

I am receiving messages via Pub/Sub. Each message should be stored in its own file in GCS as raw data; I then want to run some processing on the data and save it to BigQuery- h

1 Answer
  • 2020-12-22 03:34

    The best option is #2 - a simple DoFn that creates the files according to your data. Something like this:

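    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.channels.Channels;
    import java.nio.channels.WritableByteChannel;
    import org.apache.beam.sdk.io.FileSystems;
    import org.apache.beam.sdk.transforms.DoFn;

    class CreateFileFn extends DoFn<String, Void> {
      @ProcessElement
      public void process(ProcessContext c) throws IOException {
        String filename = ...generate filename from element...;
        // Second argument is isDirectory: false means the spec names a single file.
        try (WritableByteChannel channel = FileSystems.create(
                FileSystems.matchNewResource(filename, false),
                "text/plain")) {
          OutputStream out = Channels.newOutputStream(channel);
          ...write the element to out...
        }
      }
    }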
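The two elided steps in the DoFn above (choosing a filename and writing the element) could be filled in along the following lines. This is only a sketch under assumptions: `PerElementFiles`, `filenameFor`, `writeElement`, and the `gs://example-bucket/raw` prefix are hypothetical names that do not come from the original answer or from Beam. The name is derived from a SHA-256 hash of the payload, so a retried element overwrites the same file instead of creating a duplicate. The trade-off is that two messages with identical payloads end up in the same file. If that matters, something unique per message, such as a message ID, would have to go into the name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Beam) covering the two elided steps.
final class PerElementFiles {

  // Builds "<prefix>/<sha256-hex>.txt" from the element's UTF-8 bytes.
  // The same payload always yields the same name.
  static String filenameFor(String prefix, String element) {
    try {
      MessageDigest sha = MessageDigest.getInstance("SHA-256");
      byte[] digest = sha.digest(element.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return prefix + "/" + hex + ".txt";
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // Writes one element as UTF-8 through the given channel.
  // The caller owns the channel and closes it (e.g. via try-with-resources).
  static void writeElement(WritableByteChannel channel, String element) throws IOException {
    OutputStream out = Channels.newOutputStream(channel);
    out.write(element.getBytes(StandardCharsets.UTF_8));
    out.flush();
  }
}
```

Inside `process`, the filename line would then read `String filename = PerElementFiles.filenameFor("gs://example-bucket/raw", c.element());` and the write line `PerElementFiles.writeElement(channel, c.element());`, with the channel still coming from `FileSystems.create` as in the answer.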
    