Google Cloud Storage - Download file from web

若如初见. Submitted on 2020-01-01 09:54:15

Question


I want to use Google Cloud Storage in my next project. My aim is to track various websites and collect some photos. From reading the gsutil documentation, I can download a file manually to my server and then upload it to Google Cloud Storage using gsutil.

Downloading and uploading files generates a lot of traffic on my server. Is there a way to let Google Cloud download a file directly over HTTP?


Answer 1:


This is very easy to do from the Google Cloud Shell as long as your download is smaller than about 4.6 GB. Launch the Cloud Shell (the first icon at the top right after you log in to your project in GCP) and use wget to download the file you want. For instance, to download 7-Zip, type:

wget https://www.7-zip.org/a/7z1805-x64.exe

Now with the file in your Cloud Shell user home you can copy it to a Google Cloud Storage bucket using the gsutil command:

gsutil cp ./7z1805-x64.exe gs://your_bucket_name/

If the file is bigger than 4.6 GB, you can still do it, but you need to mount the bucket in your Cloud Shell using gcsfuse:

Create a directory in your Cloud Shell user home:

 mkdir ~/mybucket

Now mount your bucket in that directory using gcsfuse:

 gcsfuse bucket_name ~/mybucket

Change the current directory to the mount point:

 cd ~/mybucket

(For fun, run "df -h ." to see how much space that mount point reports.)

Now use wget to download the file directly into your bucket (this example uses a 10 GB file from the web):

 wget https://speed.hetzner.de/10GB.bin

UPDATE: I just found an even easier way, which seems to work for all file sizes:

 curl http://speedtest.tele2.net/10GB.zip | gsutil cp - gs://YOUR_BUCKET_NAME/10GB.zip

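Basically, curl "streams" the data directly to the bucket: the "-" argument tells gsutil cp to read the object data from standard input, so the file is not saved to local disk first.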
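For a project that collects many files, the same streaming idea can be expressed in application code. The following is a minimal Python sketch, not the method used in this answer: stream_copy is a hypothetical helper that copies between any two binary file-like objects in fixed-size chunks, so only one chunk is held in memory at a time. In real use the source might be an HTTP response body and the destination a writable handle to a bucket object (for example one provided by a Cloud Storage client library); both of those are assumptions and are replaced here by in-memory buffers.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def stream_copy(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst chunk by chunk and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for the download stream and the bucket object.
source = io.BytesIO(b"x" * 2_500_000)
destination = io.BytesIO()
copied = stream_copy(source, destination)
print(copied)
```

Because the helper only relies on read() and write(), it stays independent of whichever HTTP and storage libraries end up being used.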




Answer 2:


Google Cloud Storage only accepts data directly. There's no way to pass it a URL and have it save the contents as an object.

However, there's no reason you couldn't build this functionality yourself. For example, you could set up one or more dedicated GCE instances that would load URLs and then save them to GCS. Google doesn't charge for network ingress into GCE or for transfers from GCE into GCS within a region, either, which helps.
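One detail such a relay has to settle is what to call each object it saves. As a hedged illustration only, the hypothetical helper below derives an object name from the source URL by joining the host and the path. The naming scheme is an assumption made for this sketch and is not something GCS requires.

```python
from urllib.parse import urlsplit


def object_name_for_url(url: str) -> str:
    """Map a source URL to an object name of the form '<host>/<path>'."""
    parts = urlsplit(url)
    # Bare site roots have an empty path, so fall back to a fixed name.
    path = parts.path.strip("/") or "index"
    return f"{parts.netloc}/{path}"


print(object_name_for_url("https://www.7-zip.org/a/7z1805-x64.exe"))
```

Note that this sketch drops the query string, so two URLs that differ only in their query parameters would map to the same object name.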




Answer 3:


Google Cloud Storage provides a JSON API. You can make HTTP requests within your application to the JSON API directly, which will direct the file upload and download traffic directly to Google Cloud Storage.

To download a file from a public Google Cloud Storage bucket, make a GET request to https://www.googleapis.com/storage/v1/b/<bucket>/o/<object>?alt=media, where <bucket> is the name of your Google Cloud Storage bucket and <object> is the URL-encoded name of a file in the bucket. Without the alt=media parameter, the request returns the object's metadata rather than its contents. This should work without any authentication, but I haven't tried it myself. You can read the docs for this API request here.
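As a small sketch of how such a download URL could be assembled, the hypothetical helper below URL-encodes the object name and appends alt=media, the query parameter that asks the JSON API for the object's contents instead of its metadata. The bucket and object names are made up for illustration.

```python
from urllib.parse import quote

API_ROOT = "https://www.googleapis.com/storage/v1"


def download_url(bucket: str, object_name: str) -> str:
    """Build the JSON API URL that returns an object's contents."""
    # safe="" makes quote() escape "/" as well, so a name such as
    # "photos/cat 1.jpg" stays a single path segment in the URL.
    encoded_name = quote(object_name, safe="")
    return f"{API_ROOT}/b/{quote(bucket, safe='')}/o/{encoded_name}?alt=media"


print(download_url("my-bucket", "photos/cat 1.jpg"))
# prints https://www.googleapis.com/storage/v1/b/my-bucket/o/photos%2Fcat%201.jpg?alt=media
```

The encoding step matters because object names often contain slashes, which would otherwise be read as extra path segments.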

To upload a file to a public bucket, there are multiple options. The simple approach is to make a POST request to https://www.googleapis.com/upload/storage/v1/b/<bucket>/o?uploadType=media&name=<object>, where <bucket> is the name of your public bucket and <object> is the name the uploaded file should get, with the file's data as the request body. This approach works best for small files, less than 5 MB in size. You can read the docs for this API request here. Larger uploads require a different approach, outlined here. Again, I haven't tried this approach myself, but it should work without authentication, provided the bucket allows public writes.
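A simple upload request could be prepared roughly as follows. This is a sketch under assumptions: build_upload_request is a hypothetical helper, the bucket and object names are invented, and the request is only constructed, not sent. The uploadType=media and name query parameters tell the JSON API that the body is the raw file data and what the object should be called.

```python
import urllib.request
from urllib.parse import quote, urlencode

UPLOAD_ROOT = "https://www.googleapis.com/upload/storage/v1"


def simple_upload_url(bucket: str, object_name: str) -> str:
    """Build the URL for a single-request ("simple") upload."""
    # quote_via=quote percent-encodes spaces as %20 and slashes as %2F.
    query = urlencode({"uploadType": "media", "name": object_name}, quote_via=quote)
    return f"{UPLOAD_ROOT}/b/{quote(bucket, safe='')}/o?{query}"


def build_upload_request(bucket: str, object_name: str, data: bytes,
                         content_type: str) -> urllib.request.Request:
    """Prepare, but do not send, the POST request carrying the file data."""
    return urllib.request.Request(
        simple_upload_url(bucket, object_name),
        data=data,
        method="POST",
        headers={"Content-Type": content_type},
    )


request = build_upload_request("my-bucket", "photos/cat 1.jpg", b"fake image bytes", "image/jpeg")
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would actually perform the upload, which only succeeds if the bucket accepts the request.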

If you need to perform authenticated uploads and downloads, things get a little more complicated. Google Cloud Storage supports signed URLs for upload and download. These URLs describe specific operations on Google Cloud Storage, such as upload or download, and come with a time-sensitive signature. Anyone who has the URL can perform the specified operation on Google Cloud Storage. They're safe to pass around from server to client. You can generate the signed URL on your application's backend and pass it to the frontend. The frontend could then use the URL to upload to Google Cloud Storage directly. More info on signed URLs here.

Finally, if you need to put restrictions on the upload, such as a maximum file size, you'll need to use a signed policy document, described here. This is similar to a signed URL in that it should be generated by your application's backend and includes a time-sensitive signature. The policy document is Base64-encoded and is sent along with the upload request (for HTML form uploads, as form fields rather than as part of the URL). It describes the restrictions on the upload. The signature covers the policy document, so Google Cloud Storage knows to apply that specific policy to the upload request.
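To make the shape of a policy document more concrete, here is a minimal sketch that builds one with a size limit and Base64-encodes it. The helper name, bucket name, and limits are assumptions for illustration. Signing is deliberately left out, and a real policy typically needs further conditions, so treat this as an outline rather than a working upload policy.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def build_policy(bucket: str, max_bytes: int, valid_for: timedelta, now: datetime) -> str:
    """Return a Base64-encoded policy document that caps the upload size."""
    policy = {
        # Expiration timestamp in UTC; `now` is assumed to already be in UTC.
        "expiration": (now + valid_for).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},
            # Accept only uploads between 0 and max_bytes bytes.
            ["content-length-range", 0, max_bytes],
        ],
    }
    return base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")


encoded = build_policy(
    bucket="my-bucket",
    max_bytes=5 * 1024 * 1024,
    valid_for=timedelta(minutes=15),
    now=datetime(2020, 1, 1, tzinfo=timezone.utc),
)
print(json.loads(base64.b64decode(encoded)))
```

The backend would sign this encoded string with its credentials and hand both the policy and the signature to the client.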

Source: My team and I are building a full-stack application hosted on Google Cloud Platform that uses Google Cloud Storage for upload and download. We're using signed policy documents for upload.



Source: https://stackoverflow.com/questions/28749589/google-cloud-storage-download-file-from-web
