Cannot access airflow web server via AWS load balancer HTTPS because airflow redirects me to HTTP

夕颜 2021-01-05 08:32

I have an Airflow web server configured on EC2; it listens on port 8080.

I have an AWS ALB (Application Load Balancer) in front of the EC2 instance, listening at https 80 (facing

4 answers
  • 2021-01-05 09:12

    I finally found a solution myself.

    I introduced an nginx reverse proxy between the ALB and the Airflow web server, i.e. HTTPS request -> ALB:443 -> nginx proxy:80 -> web server:8080. The nginx proxy tells the Airflow web server that the original request was HTTPS, not HTTP, by adding the HTTP header "X-Forwarded-Proto https".

    The nginx server is co-located with the web server, and I saved its config as /etc/nginx/sites-enabled/vhost1.conf (see below). I also deleted the default config file, /etc/nginx/sites-enabled/default.

    server {
        listen 80;
        server_name <domain>;
        index index.html index.htm;
        location / {
          proxy_pass_header Authorization;
          proxy_pass http://localhost:8080;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto https;
          proxy_http_version 1.1;
          proxy_redirect off;
          proxy_set_header Connection "";
          proxy_buffering off;
          client_max_body_size 0;
          proxy_read_timeout 36000s;
        }
    }
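
    To illustrate why the X-Forwarded-Proto header matters, here is a minimal sketch in Python. It is not Airflow's or Gunicorn's code: `redirecting_app`, `ForwardedProtoMiddleware`, `location_for`, and the hostname airflow.example.com are all hypothetical names made up for the example. It assumes a WSGI app that builds absolute redirect URLs from `wsgi.url_scheme`, which is the usual way a backend behind a TLS-terminating load balancer ends up redirecting to http://.

```python
from wsgiref.util import setup_testing_defaults


def redirecting_app(environ, start_response):
    """Toy WSGI app: redirect every request to /admin/ with an absolute URL."""
    scheme = environ["wsgi.url_scheme"]
    host = environ["HTTP_HOST"]
    start_response("302 Found", [("Location", f"{scheme}://{host}/admin/")])
    return [b""]


class ForwardedProtoMiddleware:
    """Hypothetical middleware: copy X-Forwarded-Proto into wsgi.url_scheme."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").lower()
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def location_for(app, extra_environ):
    """Call a WSGI app once and return the Location header it sends."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plain-HTTP request by default
    environ.update(extra_environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    app(environ, start_response)
    return captured.get("Location")


# The backend itself only ever sees plain HTTP; the header carries the original scheme.
request = {"HTTP_HOST": "airflow.example.com", "HTTP_X_FORWARDED_PROTO": "https"}
print(location_for(redirecting_app, request))
# http://airflow.example.com/admin/
print(location_for(ForwardedProtoMiddleware(redirecting_app), request))
# https://airflow.example.com/admin/
```

    Without something honoring the header, the redirect points back at http://, which matches the symptom in the question.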
    
  • 2021-01-05 09:13

    Since they're using Gunicorn, you can configure the forwarded_allow_ips value as an environment variable instead of having to use an intermediary proxy like Nginx.

    In my case I just set FORWARDED_ALLOW_IPS = * and it's working perfectly fine.

    In ECS you can set this in the webserver task configuration if you're using one docker image for all the Airflow tasks (webserver, scheduler, worker, etc.).

  • 2021-01-05 09:22

    User user389955's own solution is probably the best approach, but for anyone looking for a quick fix (or wanting a better idea of what is going on), this seems to be the culprit.

    In the following file (the path may differ depending on your Python installation):

    /usr/local/lib/python3.5/dist-packages/gunicorn/config.py

    The following section prevents forwarded-for headers from being honored when they come from anything other than localhost:

    class ForwardedAllowIPS(Setting):
        name = "forwarded_allow_ips"
        section = "Server Mechanics"
        cli = ["--forwarded-allow-ips"]
        meta = "STRING"
        validator = validate_string_to_list
        default = os.environ.get("FORWARDED_ALLOW_IPS", "127.0.0.1")
        desc = """\
            Front-end's IPs from which allowed to handle set secure headers.
            (comma separate).
    
            Set to ``*`` to disable checking of Front-end IPs (useful for setups
            where you don't know in advance the IP address of Front-end, but
            you still trust the environment).
    
            By default, the value of the ``FORWARDED_ALLOW_IPS`` environment
            variable. If it is not defined, the default is ``"127.0.0.1"``.
            """
    

    Changing the default from 127.0.0.1 to specific IPs, or to * if the IPs are unknown, will do the trick.

    At this point, I haven't found a way to set this parameter from within the Airflow config itself. If I find a way, I will update my answer.
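
    As a rough model of what this setting controls, the sketch below parses a comma-separated allow-list and only honors X-Forwarded-Proto when the connecting peer is on it. This is a simplified illustration based on the docstring above, not Gunicorn's actual implementation; the function names and the address 10.0.1.23 (standing in for an ALB node) are assumptions made for the example.

```python
def parse_allow_ips(raw):
    """Split a comma-separated allow-list string into a list of entries."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def trusts_forwarded_headers(peer_ip, allow_ips):
    """Return True if forwarded headers sent by peer_ip should be honored."""
    return "*" in allow_ips or peer_ip in allow_ips


def effective_scheme(peer_ip, headers, allow_ips, default="http"):
    """Pick the request scheme, using X-Forwarded-Proto only from trusted peers."""
    if trusts_forwarded_headers(peer_ip, allow_ips):
        proto = headers.get("X-Forwarded-Proto", "").lower()
        if proto in ("http", "https"):
            return proto
    return default


headers = {"X-Forwarded-Proto": "https"}

# ALB connects directly and the allow-list is the default: header ignored.
print(effective_scheme("10.0.1.23", headers, parse_allow_ips("127.0.0.1")))  # http

# Same peer with FORWARDED_ALLOW_IPS set to *: header honored.
print(effective_scheme("10.0.1.23", headers, parse_allow_ips("*")))  # https

# A co-located nginx connects from 127.0.0.1, which the default already trusts.
print(effective_scheme("127.0.0.1", headers, parse_allow_ips("127.0.0.1")))  # https
```

    Under this model, both the nginx approach and the FORWARDED_ALLOW_IPS approach work for the same reason: the peer that sets the header ends up on the trusted list.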

  • 2021-01-05 09:26

    I think you have everything working correctly. The redirect you are seeing is expected as the webserver is set to redirect from / to /admin. If you are using curl, you can pass the flag -L / --location to follow redirects and it should bring you to the list of DAGs.

    Another good endpoint to test on is https://<airflow domain name>/health (with no trailing slash, or you'll get a 404!). It should return "The server is healthy!".

    Be sure you have https:// in the base_url under the webserver section of your airflow config.
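
    To tell the expected / to /admin redirect apart from an unwanted HTTPS-to-HTTP downgrade, you can compare the scheme of the request with the Location header of the response. The following is a minimal sketch using only the Python standard library; `fetch_redirect`, `is_https_downgrade`, and the hostname airflow.example.com are hypothetical names for illustration, and `fetch_redirect` is defined but not called here because it needs a reachable server.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def fetch_redirect(url, timeout=10):
    """Send one GET without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_https_downgrade(request_url, location):
    """True if an HTTPS request was redirected to a plain-HTTP URL."""
    if not location:
        return False
    target = urljoin(request_url, location)  # also resolves relative Location values
    return (
        urlsplit(request_url).scheme == "https"
        and urlsplit(target).scheme == "http"
    )


url = "https://airflow.example.com/"
print(is_https_downgrade(url, "http://airflow.example.com/admin/"))   # True
print(is_https_downgrade(url, "https://airflow.example.com/admin/"))  # False
print(is_https_downgrade(url, "/admin/"))                             # False
```

    In practice you would pass the result of `fetch_redirect("https://<airflow domain name>/")` into `is_https_downgrade`; a True result points at the scheme problem rather than at the normal redirect.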
