I have succesfully implemented a dataflow pipeline that writes to BigQuery. This pipeline is transforming data for a Cloud ML Engine job. However, I noticed that the rows th
BigQuery tables don't have the concept of order or grouping, they are just a bag of rows; if one needs ordering or grouping, one writes a query with an ORDER BY or GROUP BY clause. If you have code that reads rows from BigQuery and requires these rows to be read in random order, you can do something like https://www.oreilly.com/learning/repeatable-sampling-of-data-sets-in-bigquery-for-machine-learning