Any success stories installing a private dependency on GCP Composer Airflow?

Posted by 匆匆过客 on 2020-03-22 07:55:09

Question


Background info

Normally within a container environment I can easily install my private dependency with a requirements.txt like this:

--index-url https://user:pass@some_repo.jfrog.io/some_repo/api/pypi/pypi/simple

some-private-lib

The package "some-private-lib" is the one I want to install.

Issue

Within the GCP Composer environment, I tried the gcloud command (gcloud composer environments update ENV_NAME --update-pypi-packages-from-file ./requirements.txt --location LOCATION), but it complained that requirements.txt does not follow the format defined in PEP 508. I then found the official guide on installing dependencies from a private repository, but it isn't very clear. Following its instructions, I created a pip.conf file with the following contents:

[global]
extra-index-url=https://user:pass@some_repo.jfrog.io/some_repo/api/pypi/pypi/simple

and then put it into my environment's GCS bucket: gs://us-central1-xxxx-bucket/config/pip/pip.conf.

I then ran the command (gcloud composer environments update ENV_NAME --update-pypi-packages-from-file ./requirements.txt --location LOCATION) again, this time with requirements.txt containing only one line: some-private-lib. It failed with an opaque error: failed: Failed to install PyPI packages.
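A likely reason for the PEP 508 complaint is that a line such as --index-url ... is a pip command-line option, not a package specifier, so only lines like some-private-lib can go in the file passed to Composer. The following is a minimal sketch, not part of the original post, that separates the two kinds of lines; the function name split_requirements is hypothetical, and the check is a simple heuristic, not a full PEP 508 parser.

```python
def split_requirements(text):
    """Split a pip-style requirements file into pip options and package specifiers.

    Heuristic only: any non-comment line starting with "-" is treated as a
    pip option (for example --index-url); everything else is treated as a
    package specifier. This is not a full PEP 508 parser.
    """
    options, specifiers = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("-"):
            options.append(line)
        else:
            specifiers.append(line)
    return options, specifiers


# The requirements.txt from the question, which works with plain pip.
sample = """\
--index-url https://user:pass@some_repo.jfrog.io/some_repo/api/pypi/pypi/simple

some-private-lib
"""

options, specifiers = split_requirements(sample)
print("pip options (these belong in pip.conf):", options)
print("package specifiers (these belong in requirements.txt):", specifiers)
```

Under this reading, the index URL moves into pip.conf and requirements.txt keeps only the package line, which matches the second attempt described above.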

Question

What did I do wrong? Are there any other workarounds? Thanks!


Answer 1:


You can debug this further by looking at the workloads in the GKE cluster associated with your Composer environment.

When you install new packages, Composer spawns jobs in the cluster to build and deploy containers for the new webserver, scheduler, and worker processes. The logs for these jobs show what happened when pip install ran. If pip was unable to access your private repo, the logs will say so.

One issue I've run into that produces the generic error you mention is this: the job that builds the image fails on package dependency conflicts. When there is such a conflict, the job is retried until the whole process times out.

To be more specific, the requirements file I passed to composer looked like this:

...
requests==2.22.0
...
my-private-lib==0.1

and the requirements file for my-private-lib looked like this:

...
requests==2.23.0

I ultimately solved this by specifying the version requirements in my private library using a version range rather than a specific version.

Again, if your problem is caused by an issue like the above, it will be indicated in the job logs.
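The kind of conflict described above can be checked locally before updating the environment. The following is a minimal sketch, assuming both requirement lists are available as plain lines; the helper names pinned and pin_conflicts are hypothetical, it only understands exact name==version pins, and the version range in the last step is an illustrative assumption, not the range used in the original fix.

```python
import re

# Matches exact pins such as "requests==2.22.0"; ranges are ignored on purpose.
PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([^\s;#]+)")


def pinned(lines):
    """Map package name -> exact version for lines of the form name==version."""
    pins = {}
    for line in lines:
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def pin_conflicts(env_requirements, lib_requirements):
    """Return packages pinned to different exact versions in the two lists."""
    env, lib = pinned(env_requirements), pinned(lib_requirements)
    return {
        name: (env[name], lib[name])
        for name in env.keys() & lib.keys()
        if env[name] != lib[name]
    }


# The two pins shown in the answer.
composer_requirements = ["requests==2.22.0", "my-private-lib==0.1"]
private_lib_requirements = ["requests==2.23.0"]

conflicts = pin_conflicts(composer_requirements, private_lib_requirements)
print(conflicts)  # {'requests': ('2.22.0', '2.23.0')}

# Assumed example of a relaxed requirement in the private library:
# a range is not an exact pin, so no conflict is reported.
relaxed = pin_conflicts(composer_requirements, ["requests>=2.22.0,<3"])
print(relaxed)  # {}
```

This only catches exact-pin clashes; the job logs remain the authoritative place to see what pip actually rejected.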



Source: https://stackoverflow.com/questions/59027766/any-success-story-installing-private-dependency-on-gcp-composer-airflow
