Airflow

Airflow "CROSSSLOT Keys in request don't hash to the same slot" error using AWS ElastiCache

穿精又带淫゛_ submitted on 2021-01-27 19:56:07
Question: I am running apache-airflow 1.8.1 on AWS ECS and I have an AWS ElastiCache cluster (redis 3.2.4) running 2 shards / 2 nodes with multi-AZ enabled (clustered redis engine). I've verified that airflow can access the host/port of the cluster without any problem. Here are the logs: Thu Jul 20 01:39:21 UTC 2017 - Checking for redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379) connectivity Thu Jul 20 01:39:21 UTC 2017 - Connected to redis (endpoint: redis://xxxxxx.xxxxxx
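The CROSSSLOT error typically comes from Celery issuing multi-key Redis commands, which a cluster-mode-enabled Redis rejects when the keys hash to different slots; the usual workaround is to point the broker at a cluster-mode-disabled ElastiCache replication group instead. A minimal airflow.cfg sketch, assuming such an endpoint exists (the hostname below is hypothetical, and the key names match Airflow 1.8.x):

    [celery]
    # hypothetical cluster-mode-disabled (single primary) endpoint
    broker_url = redis://my-redis.xxxxxx.0001.usw2.cache.amazonaws.com:6379/0
    celery_result_backend = redis://my-redis.xxxxxx.0001.usw2.cache.amazonaws.com:6379/1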

Airflow LDAP superuser authentication

烈酒焚心 submitted on 2021-01-27 17:27:26
Question: I am using Airflow v1.9.0 and am trying to set up groups using LDAP authentication. I can get basic LDAP authentication working, which defaults all users to be superusers. However, I cannot get AD to match against a specific group. For instance, I have user TommyLeeJones, who I know is part of the user group MIB, but I can't get airflow to match this user against this group. In my airflow.cfg file, I have set: [webserver] authenticate = True auth_backend = airflow.contrib.auth.backends
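For the contrib LDAP backend in Airflow 1.9, group matching is done with LDAP filters in the [ldap] section of airflow.cfg rather than in [webserver]. A minimal sketch, assuming an Active Directory layout; every DN below is hypothetical and must be adapted:

    [ldap]
    uri = ldaps://ad.example.com:636
    user_filter = objectClass=*
    user_name_attr = sAMAccountName
    group_member_attr = memberOf
    bind_user = cn=airflow,dc=example,dc=com
    bind_password = changeme
    basedn = dc=example,dc=com
    search_scope = SUBTREE
    # only members of the MIB group become superusers
    superuser_filter = memberOf=CN=MIB,OU=Groups,DC=example,DC=com
    data_profiler_filter = memberOf=CN=MIB,OU=Groups,DC=example,DC=com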

module 'six.moves' has no attribute 'collections_abc'

坚强是说给别人听的谎言 submitted on 2021-01-27 13:00:36
Question: I have a script that connects to YouTube API version 3 and retrieves public data. This script is deployed in airflow; it had been working fine for a month, but today it failed with this message on the following line: def youtube_search(term,region): youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY,cache_discovery=False) File "/usr/local/airflow/.local/lib/python3.6/site-packages/googleapiclient/discovery.py", line 455, in build_from_document if isinstance
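six.moves.collections_abc was only added in six 1.13.0, while newer google-api-python-client releases reference it, so an environment that upgrades the client but keeps an old pinned six breaks exactly like this. A hedged fix is to pin mutually compatible versions; the exact pins below are an assumption, not tested values:

    # requirements.txt
    six>=1.13.0
    google-api-python-client>=1.12.0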

Triggering a Prefect workflow externally

老子叫甜甜 submitted on 2021-01-27 06:02:48
Question: I currently have a Prefect workflow running locally on an EC2 instance. I can trigger my workflow on localhost:8080 through the UI. Is there a way to trigger a Prefect workflow externally (say, from AWS Lambda) via REST API or some other way? I know that Airflow supports an experimental REST API. Answer 1: Yes, you can trigger it through the REST API from AWS Lambda, and you can schedule the Lambda trigger using a CloudWatch Events rule, which supports both fixed-rate and cron-expression schedules. Answer 2: Yes, Prefect
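As a concrete illustration of Answer 1, here is a minimal sketch of a Lambda handler that starts a flow run through the Prefect Server GraphQL API; the endpoint URL and flow id are assumptions you would replace with your own:

    import json
    import urllib.request

    PREFECT_API = "http://my-ec2-host:4200/graphql"  # hypothetical Prefect Server endpoint
    FLOW_ID = "replace-with-your-flow-id"

    def handler(event, context):
        # GraphQL mutation that creates a new run of the given flow
        payload = {
            "query": "mutation($flowId: UUID!) { create_flow_run(input: {flow_id: $flowId}) { id } }",
            "variables": {"flowId": FLOW_ID},
        }
        req = urllib.request.Request(
            PREFECT_API,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())  # includes the new flow-run id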

How can one use HashiCorp Vault in Airflow?

折月煮酒 submitted on 2021-01-27 04:54:32
Question: I am starting to use Apache Airflow and I am wondering how to effectively make it use secrets and passwords stored in Vault. Unfortunately, searching does not return meaningful answers beyond a yet-to-be-implemented hook in the Airflow project itself. I can always use Python's hvac module to generically access Vault from a PythonOperator, but I was wondering if there is a better way or a good practice (e.g. maybe an Airflow plugin I missed). Answer 1: Airflow >=1.10.10 supports Secrets Backends and
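To make Answer 1 concrete: from 1.10.10 onward, Airflow can read connections and variables straight out of Vault via a secrets backend configured in airflow.cfg. A minimal sketch; the mount point, paths, and URL are assumptions (on 1.10.x the class lives at airflow.contrib.secrets.hashicorp_vault.VaultBackend instead):

    [secrets]
    backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
    backend_kwargs = {"connections_path": "connections", "variables_path": "variables", "mount_point": "airflow", "url": "http://127.0.0.1:8200"}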

Running DBT within Airflow through the Docker Operator

混江龙づ霸主 submitted on 2021-01-25 07:34:30
Question: Building on my earlier question, How to run DBT in airflow without copying our repo, I am currently running airflow and syncing the DAGs via git. I am considering different options to include DBT within my workflow. One suggestion by louis_guitton is to Dockerize the DBT project and run it in Airflow via the Docker Operator. I have no prior experience using the Docker Operator in Airflow, or DBT generally. I am wondering if anyone has tried this or can provide some insights about their experience
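For reference, a minimal sketch of what the Docker Operator approach can look like; the image name, dbt command, and profiles location are assumptions about how the project is packaged:

    from airflow.providers.docker.operators.docker import DockerOperator
    # on Airflow 1.10.x: from airflow.operators.docker_operator import DockerOperator

    dbt_run = DockerOperator(
        task_id="dbt_run",
        image="my-registry/my-dbt-project:latest",  # hypothetical image with dbt and the project baked in
        command="dbt run --profiles-dir /dbt",
        docker_url="unix://var/run/docker.sock",
        network_mode="bridge",
        auto_remove=True,  # a string such as "success" in newer docker-provider versions
    )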

Reuse tasks in airflow

允我心安 submitted on 2021-01-24 10:58:06
Question: I'm trying out airflow for orchestrating some of my data pipelines. I have multiple tasks for each ingestion pipeline, and the tasks are repeated across multiple ingestion pipelines. How can I reuse a task across DAGs in airflow? Answer 1: Just like an object is an instance of a class, an Airflow task is an instance of an Operator (strictly speaking, BaseOperator). So write a "re-usable" (aka generic) operator and use it 100s of times across your pipeline(s) simply by passing different params
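A minimal sketch of that suggestion: a generic operator written once and instantiated with different parameters in each ingestion DAG (all names below are hypothetical; on Airflow 1.10.x the constructor also needs the @apply_defaults decorator):

    from airflow.models import BaseOperator

    class IngestOperator(BaseOperator):
        """Generic ingestion task, reused across DAGs by parameterizing tables."""

        def __init__(self, source_table, target_table, **kwargs):
            super().__init__(**kwargs)
            self.source_table = source_table
            self.target_table = target_table

        def execute(self, context):
            # the shared ingestion logic lives here, written only once
            self.log.info("Ingesting %s into %s", self.source_table, self.target_table)

    # each DAG simply instantiates it with its own parameters:
    # IngestOperator(task_id="ingest_orders", source_table="raw.orders",
    #                target_table="dw.orders", dag=dag)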