How to connect to private storage bucket using the Google Colab TPU

Posted by 半世苍凉 on 2020-07-30 21:35:42

Question


I am using Google Colab Pro and the TPU it provides. I need to load a pre-trained model onto the TPU.

  • The TPU can load data only from a Google Cloud Storage bucket.
  • I created a Cloud Storage bucket and extracted the pre-trained model files into the bucket.

Now I need to give permission to the TPU to access my private bucket, but I don't know the service account of the TPU. How do I find it?

For now I have given everyone read permission on the bucket (All:R), and the TPU initialized successfully, but this is clearly not the optimal solution.


Answer 1:


I've been struggling with this scenario myself (although with the free version of Colab) and just got it to work. This specific use case doesn't appear to be well documented; the official documentation seems to deal mostly with cases involving a Compute Engine VM rather than an auto-assigned TPU. The process that worked for me went as follows:

  1. Run Google Cloud SDK authentication and set the project (these two things may be redundant; I haven't yet tried doing just one or the other)

!gcloud auth login
!gcloud config set project [Project ID of Storage Bucket]

and

from google.colab import auth
auth.authenticate_user()

  2. Initialize the TPU (from the TensorFlow TPU docs)

import os
import tensorflow as tf

# COLAB_TPU_ADDR is set by the Colab runtime when a TPU is attached.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

  3. Try to load the model

model = tf.keras.models.load_model('gs://[Bucket name and path to saved model]')

This initially failed, but the error message included the service account of the TPU that was trying to access the directory, and that is the address I gave access to, as described in the Cloud Storage docs. The address has the format service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com, but the project number is neither the Project ID of the project my bucket is in nor a value I've been able to find anywhere else.

After I gave permissions to that service account (which I was only able to find in the error message), I was able to load and save models from my private bucket.
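
Since the service account only showed up inside the error text, it can help to pull it out programmatically rather than by eye. The following is a minimal sketch, assuming the address appears verbatim in the message in the format quoted above. The helper name find_tpu_service_account and the sample message are hypothetical; the real error text will look different.

```python
import re
from typing import Optional

# Matches the address format quoted in the answer:
# service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com
_TPU_SA_PATTERN = re.compile(
    r"service-[0-9]+@cloud-tpu\.iam\.gserviceaccount\.com"
)


def find_tpu_service_account(error_text: str) -> Optional[str]:
    """Return the first TPU service account address in error_text, or None."""
    match = _TPU_SA_PATTERN.search(error_text)
    return match.group(0) if match else None


# Hypothetical message for illustration only (not a real TensorFlow log line).
sample = (
    "Permission denied: service-123456789012@cloud-tpu.iam.gserviceaccount.com "
    "does not have access to the bucket"
)
print(find_tpu_service_account(sample))
```

In a notebook, one way to use this would be to wrap the load_model call from step 3 in a try/except block and pass str(exc) to the helper, although which exception type is raised there is not something the answer states.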




Answer 2:


As stated in the public documentation, to find the service account of your Colab TPU you just need to fill in the project number in the following email address:

 service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com

You can find your project number on the dashboard of your Google Cloud project.

After doing this, set the access control of your bucket to fine-grained and grant this account access in the bucket's ACL.
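
As a rough illustration of this answer, the sketch below fills a project number into the address template and assembles a gsutil acl ch command that would add the account to the bucket ACL with read access. The function names, the example project number, and the bucket name are all hypothetical, and the command is only built, not executed. It also assumes that a bucket-level READ entry is what you want; with fine-grained access, the objects themselves may need their own ACL entries as well.

```python
import re
from typing import List


def tpu_service_account(project_number: str) -> str:
    """Fill the project number into the address template from the answer."""
    if not re.fullmatch(r"[0-9]+", project_number):
        raise ValueError("project number must consist of digits only")
    return f"service-{project_number}@cloud-tpu.iam.gserviceaccount.com"


def grant_read_command(project_number: str, bucket: str) -> List[str]:
    """Build (but do not run) a gsutil command granting READ on the bucket ACL."""
    account = tpu_service_account(project_number)
    return ["gsutil", "acl", "ch", "-u", f"{account}:R", f"gs://{bucket}"]


# Hypothetical values for illustration only.
command = grant_read_command("123456789012", "my-private-bucket")
print(" ".join(command))
```

In Colab the printed command could be pasted into a cell with a leading `!`, in the same way as the gcloud commands in the first answer.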



Source: https://stackoverflow.com/questions/61448884/how-to-connect-to-private-storage-bucket-using-the-google-colab-tpu
