I am using Google Cloud Dataflow with the Python SDK.
I would like to get the contents of a PCollection so that I can write them to BigQuery tables sharded by date.
It is not possible to get the contents of a PCollection directly: an Apache Beam or Dataflow pipeline is more like a query plan describing what processing should be done, with a PCollection being a logical intermediate node in the plan rather than a container for the data. The main program assembles the plan (the pipeline) and kicks it off.
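If you just want to see a PCollection's contents, the usual approach is to add a sink that materializes it and read the output after the run finishes. A minimal sketch, assuming the apache_beam Python package and a local output path of my choosing:

```python
import apache_beam as beam

# Assemble the plan: each transform adds a node; nothing runs
# until the pipeline context exits.
with beam.Pipeline() as p:
    (p
     | 'Create' >> beam.Create(['a', 'b', 'c'])
     | 'Upper' >> beam.Map(str.upper)
     # Materialize the PCollection by writing it to files; inspect
     # /tmp/output-* once the pipeline has finished running.
     | 'Write' >> beam.io.WriteToText('/tmp/output'))
```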
However, ultimately you're trying to write data to BigQuery tables sharded by date. This use case is currently supported only in the Java SDK and only for streaming pipelines.
For a more general treatment of writing data to multiple destinations depending on the data, follow BEAM-92.
See also Creating/Writing to Partitioned BigQuery table via Google Cloud Dataflow