Question
We are building a data lake on Google Cloud Storage and processing the data with GCP services such as Dataproc and Dataflow. How can we generate a data lineage report in GCP? Thanks.
Answer 1:
Google Cloud Platform doesn't have a serverless data lineage offering.
Instead, you may want to install Apache Atlas on Google Cloud Dataproc and use it for data lineage.
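Once Atlas is running and has metadata for your datasets, lineage for a single entity can be read over its REST API. The snippet below is only a rough sketch, not a tested integration. It assumes the Atlas v2 lineage endpoint (`/api/atlas/v2/lineage/{guid}` with `direction` and `depth` query parameters), HTTP basic auth, and a server reachable on port 21000. The host name, credentials, and GUID are placeholders, and the helper names `build_lineage_url` and `fetch_lineage` are made up for illustration.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_lineage_url(base_url, guid, direction="BOTH", depth=3):
    """Build the lineage URL for one entity GUID (assumed Atlas v2 endpoint)."""
    if direction not in ("INPUT", "OUTPUT", "BOTH"):
        raise ValueError("direction must be INPUT, OUTPUT, or BOTH")
    query = urllib.parse.urlencode({"direction": direction, "depth": depth})
    path = "/api/atlas/v2/lineage/" + urllib.parse.quote(guid)
    return base_url.rstrip("/") + path + "?" + query


def fetch_lineage(base_url, guid, user, password, direction="BOTH", depth=3):
    """Call Atlas with basic auth and return the decoded JSON response."""
    url = build_lineage_url(base_url, guid, direction, depth)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder host and GUID; replace with values from your own cluster.
    print(build_lineage_url("http://atlas-host:21000", "your-entity-guid"))
```

To turn this into a report, you would call `fetch_lineage` for the entities you care about and walk the relations in the response. Keep in mind that Atlas only knows about lineage that has been reported to it, for example by hooks on the processing engines or by your own jobs registering entities.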
Answer 2:
If data lineage is important to you, you will find yourself wanting an Enterprise Data Cloud.
Cloudera is the main supplier in this space, and will allow you to work on Google Cloud (or anywhere else) with mature data governance.
Though I personally stand behind this message, I do want to mention that I happen to be an employee of Cloudera.
Source: https://stackoverflow.com/questions/55000865/how-can-i-perform-data-lineage-in-gcp