Can't import Airflow plugins

傲寒 2021-01-03 18:10

I'm following the Airflow tutorial here.

Problem: the webserver returns the following error:

Broken DAG: [/usr/local/airflow/dags/test_operat         


        
11 Answers
  • 2021-01-03 18:34

    After struggling with the Airflow documentation and trying some of the answers here without success, I found this approach from astronomer.io.

    As they point out, building an Airflow Plugin can be confusing and perhaps not the best way to add hooks and operators going forward.

    Custom hooks and operators are a powerful way to extend Airflow to meet your needs. There is however some confusion on the best way to implement them. According to the Airflow documentation, they can be added using Airflow’s Plugins mechanism. This however, overcomplicates the issue and leads to confusion for many people. Airflow is even considering deprecating using the Plugins mechanism for hooks and operators going forward.

    So instead of messing around with the Plugins API I followed Astronomer's approach, setting up Airflow as shown below.

    dags
    └── my_dag.py               (contains dag and tasks)
    plugins
    ├── __init__.py
    ├── hooks
    │   ├── __init__.py
    │   └── mytest_hook.py      (contains class MyTestHook)
    └── operators
        ├── __init__.py
        └── mytest_operator.py  (contains class MyTestOperator)
    

    With this approach, all the code for my operator and hook lives entirely in their respective files, and there's no confusing plugin file. All the __init__.py files are empty (unlike some equally confusing approaches that put plugin code in some of them).

    For the imports needed, consider how Airflow actually uses the plugins directory:

    When Airflow is running, it will add dags/, plugins/, and config/ to PATH

    This means that from airflow.operators.mytest_operator import MyTestOperator probably isn't going to work. Instead, from operators.mytest_operator import MyTestOperator is the way to go (note how it mirrors the directory/file.py layout in my setup above, i.e. from directory.file import Class).
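To make the path behaviour concrete, here is a minimal sketch that runs without Airflow. It builds a throwaway copy of the plugins/ layout in a temporary directory and puts that directory on sys.path by hand, which is roughly what Airflow is described as doing for the real plugins folder. The stand-in class body and the temporary location are assumptions for illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the plugins/ layout from the tree above.
plugins_dir = Path(tempfile.mkdtemp()) / "plugins"
(plugins_dir / "operators").mkdir(parents=True)
(plugins_dir / "operators" / "__init__.py").write_text("")
(plugins_dir / "operators" / "mytest_operator.py").write_text(
    "class MyTestOperator:\n"
    "    pass  # stand-in; a real operator would subclass BaseOperator\n"
)

# Put plugins/ itself on the module search path (done by hand here).
sys.path.insert(0, str(plugins_dir))
importlib.invalidate_caches()

# The import path therefore starts at the first level *inside* plugins/.
from operators.mytest_operator import MyTestOperator

print(MyTestOperator.__module__)  # operators.mytest_operator
```

Because plugins/ itself is on the search path, the package name begins with operators, not with plugins or airflow.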

    Working snippets from my files are shown below.

    my_dag.py:

    from airflow import DAG
    from operators.mytest_operator import MyTestOperator
    default_args = {....}
    dag = DAG(....)
    ....
    mytask = MyTestOperator(task_id='mytest_task', dag=dag)  # task_id: only letters, digits, dashes, dots, underscores
    ....
    

    mytest_operator.py:

    from airflow.models import BaseOperator
    from hooks.mytest_hook import MyTestHook
    
    class MyTestOperator(BaseOperator):
        ....
        hook = MyTestHook(....)
        ....
    

    mytest_hook.py:

    class MyTestHook():
        ....
    

    This worked for me and was much simpler than trying to subclass AirflowPlugin. However, it might not work for you if you want to make changes to the webserver UI:

    Note: The Plugins mechanism still must be used for plugins that make changes to the webserver UI.

    As an aside, the errors I was getting before this (that are now resolved):

    ModuleNotFoundError: No module named 'mytest_plugin.hooks.mytest_hook'
    ModuleNotFoundError: No module named 'operators.mytest_plugin'
    
  • 2021-01-03 18:34

    I faced the same issue following the same tutorial. What worked for me was to replace the import of MyFirstOperator with:

    from airflow_home.plugins.my_operators import MyFirstOperator
    
  • 2021-01-03 18:35

    I had to update the plugins path in airflow.cfg to fix the problem.

    Set it to the folder where your Airflow plugins are stored:

    plugins_folder = /airflow/plugins
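To check which value a config file actually contains, the setting can be read with Python's standard configparser. This is only a sketch: the sample text below is a hypothetical excerpt of airflow.cfg, and it assumes plugins_folder sits in the [core] section.

```python
import configparser

# Hypothetical excerpt of airflow.cfg; the path is the one from this answer.
SAMPLE_CFG = """
[core]
plugins_folder = /airflow/plugins
"""

# Disable interpolation so literal '%' characters elsewhere in a real
# config file do not raise errors while parsing.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CFG)

plugins_folder = parser.get("core", "plugins_folder")
print(plugins_folder)  # /airflow/plugins
```

For a real installation, parser.read("/path/to/airflow.cfg") would replace read_string. Note that this sketch does not see overrides set through environment variables such as AIRFLOW__CORE__PLUGINS_FOLDER.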
    
  • 2021-01-03 18:35

    As per the docs:

    The python modules in the plugins folder get imported, and hooks, operators, sensors, macros, executors and web views get integrated to Airflow’s main collections and become available for use.

    This works fine for me in version 1.10.1.

  • 2021-01-03 18:53

    Suppose the following is the custom plugin you have implemented in my_operators.py:

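    from airflow.plugins_manager import AirflowPlugin

    # Assumes MyFirstOperator is defined earlier in the same file
    class MyFirstPlugin(AirflowPlugin):
        name = "my_first_plugin"
        operators = [MyFirstOperator]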
    

    Then, as per the Airflow documentation, the import has to follow this structure:

    from airflow.{type, like "operators", "sensors"}.{name specified inside the plugin class} import *
    

    So in your case the import should be:

    from airflow.operators.my_first_plugin import MyFirstOperator
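The reason the plugin's name attribute, rather than the file name, shows up in the import path can be illustrated with a small sketch. It is not Airflow's code: it is a simplified imitation of the idea (build a module named after the plugin and register it in sys.modules), using a made-up demo_airflow.operators namespace and stand-in classes so that it runs without Airflow installed. As far as I know, this style of import belongs to Airflow 1.x and was dropped in 2.0.

```python
import sys
import types


class MyFirstOperator:
    """Stand-in for the operator class listed in the plugin."""


class MyFirstPlugin:
    """Stand-in for an AirflowPlugin subclass (no Airflow import needed)."""
    name = "my_first_plugin"
    operators = [MyFirstOperator]


def register_plugin_module(namespace, plugin):
    """Create a module '<namespace>.<plugin.name>' exposing the plugin's operators."""
    module = types.ModuleType(f"{namespace}.{plugin.name}")
    for cls in plugin.operators:
        setattr(module, cls.__name__, cls)
    # Registering the module makes the dotted name importable.
    sys.modules[module.__name__] = module
    return module


# 'demo_airflow.operators' is a hypothetical namespace used only for this demo.
register_plugin_module("demo_airflow.operators", MyFirstPlugin)

from demo_airflow.operators.my_first_plugin import MyFirstOperator as Imported

print(Imported is MyFirstOperator)  # True
```

The last path component comes from name = "my_first_plugin", which is why renaming the file alone does not change the import statement under this mechanism.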
    
  • 2021-01-03 18:54

    The article defines the plugin like this:

    class MyFirstPlugin(AirflowPlugin):
        name = "my_first_plugin"
        operators = [MyFirstOperator]
    

    Instead, declare every plugin attribute explicitly, leaving the unused ones empty:

    class MyFirstPlugin(AirflowPlugin):
        name = "my_first_plugin"
        operators = [MyFirstOperator]
        # A list of class(es) derived from BaseHook
        hooks = []
        # A list of class(es) derived from BaseExecutor
        executors = []
        # A list of references to inject into the macros namespace
        macros = []
        # A list of objects created from a class derived
        # from flask_admin.BaseView
        admin_views = []
        # A list of Blueprint object created from flask.Blueprint
        flask_blueprints = []
        # A list of menu links (flask_admin.base.MenuLink)
        menu_links = []
    

    Also, don't use:

    from airflow.operators import MyFirstOperator
    

    According to the Airflow article on plugins, it should be:

    from airflow.operators.my_first_plugin import MyFirstOperator
    

    If that doesn't work, try using the file name instead of the plugin name:

    from airflow.operators.my_operators import MyFirstOperator
    

    If that still doesn't work, check the webserver log at startup for more information.
