Gsutil uses a lot of memory when downloading multiple files with a lot of processes

Submitted by 99封情书 on 2020-01-06 05:57:28

Question


I need to download multiple files with gsutil, and I noticed that it uses a lot of memory when doing so (around 1-2 GB of RAM when downloading three 2 GB files with 9 processes each). Is there a way to tune gsutil's memory usage? This matters to me because I am running gsutil in GKE, where a container is killed if it uses more memory than its limit.

Another issue: it seems that gsutil cannot download files with the same name in a single command (one overwrites the other?), so I am not using the -m option. Instead, I download each file with its own gsutil command: gsutil -o "GSUtil:parallel_thread_count=1" -o "GSUtil:sliced_object_download_component_size=250M" -o "GSUtil:sliced_object_download_max_components=9" -o "GSUtil:parallel_process_count=9" cp bucket/file desFile
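The per-file approach above could be wrapped in a small script. This is only a sketch: the helper names dest_name and download_one, the GSUTIL override variable, and the underscore naming scheme are all assumptions made for illustration; only the -o settings come from the command in the question.

```shell
# Hypothetical helper: derive a unique local file name from an object URL
# by dropping the gs:// scheme and replacing path separators with underscores,
# so two objects with the same base name do not overwrite each other.
dest_name() {
  printf '%s\n' "${1#gs://}" | tr '/' '_'
}

# Hypothetical wrapper: one gsutil invocation per object, using the same
# -o settings as in the question.
#   $1 = object URL, $2 = destination directory
# Setting GSUTIL=echo (an assumption of this sketch) prints the command
# instead of running it, which is handy for a dry run.
download_one() {
  "${GSUTIL:-gsutil}" \
    -o "GSUtil:parallel_thread_count=1" \
    -o "GSUtil:parallel_process_count=9" \
    -o "GSUtil:sliced_object_download_component_size=250M" \
    -o "GSUtil:sliced_object_download_max_components=9" \
    cp "$1" "$2/$(dest_name "$1")"
}
```

Calling download_one once per object, with a placeholder URL such as gs://bucket-a/data/file.bin and a destination directory, gives every download its own target name.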


Answer 1:


I tested downloading a 2 GB file, and changing -o "GSUtil:parallel_process_count=X" does change memory consumption on Debian and Ubuntu:

  • 1 parallel process: 85MB
  • 5 parallel processes: 125MB
  • 10 parallel processes: 165MB
  • 50 parallel processes: 310MB

If you hit kernel panic issues on GKE when using gsutil with a CentOS container image, switching to an Ubuntu image should help.

If memory consumption is too high with 3 simultaneous file downloads, consider running only 1 or 2 downloads at a time.
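One way to cap the number of simultaneous downloads is to let xargs schedule the gsutil commands. The following is a rough sketch, not a tested recipe for GKE: the function name download_limited and the variables MAX_DOWNLOADS, PROCESS_COUNT and GSUTIL are assumptions, and it relies on the -P option, which GNU and BSD xargs provide but POSIX does not require.

```shell
# Hypothetical helper: read "OBJECT_URL DEST_FILE" pairs on stdin, one pair
# per line, and run at most MAX_DOWNLOADS gsutil commands at the same time.
# Assumes URLs and destination names contain no whitespace or quotes.
#   MAX_DOWNLOADS  concurrent gsutil commands (default 2)
#   PROCESS_COUNT  value passed as parallel_process_count (default 9)
#   GSUTIL         command to run; GSUTIL=echo gives a dry run
download_limited() {
  xargs -n 2 -P "${MAX_DOWNLOADS:-2}" \
    "${GSUTIL:-gsutil}" \
    -o "GSUtil:parallel_process_count=${PROCESS_COUNT:-9}" \
    cp
}
```

Each input line becomes one gsutil cp call with its own destination file, so lowering MAX_DOWNLOADS or PROCESS_COUNT trades download speed for a smaller memory footprint.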

There are also known issues with high memory usage on GKE.



Source: https://stackoverflow.com/questions/56797730/gsutil-uses-a-lot-of-memory-when-download-multiple-files-with-a-lot-of-processes
