What's the right syntax for controlling Inclusion and Exclusion of data directories in Setup.py?

允我心安 submitted on 2021-01-28 08:09:40

Question


Q: When creating a Python distribution using setup.py and MANIFEST.in, how can I define the nested data directories that I do and don't want in the final installation directory? (Complex example!)

Background: My program has a set of data directories (not source directories). Within each of these main directories are some subdirectories with user-specific names. In my setup.py, I want to exclude my own data directories, while still including the other subdirectories that all users should have access to.

The file tree AS IT CURRENTLY EXISTS in my PyCharm DEVELOPMENT environment:

 📁 PycharmProjects
    📁 pythonProject
        📁 data_files_directory_1
            📁subdirectory_to_be_EXcluded
                📑 data_file_to_be_EXcluded.txt
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
            📑 index.html
        📁 data_files_directory_2
            📁subdirectory_to_be_EXcluded
                📑 data_file_to_be_EXcluded.txt
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
            📑 index.html
        📁 src
            📑 __init__.py
            📑 constants.py
            📑 helper1.py
            📑 helper2.py
            📑 main.py

Expected result:

The file tree I WANT AFTER INSTALLATION on target machine:

 📁 PycharmProjects
    📁 pythonProject
        📁 data_files_directory_1
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
        📁 data_files_directory_2
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
            📑 index.html
        📁 src
            📑 __init__.py
            📑 constants.py
            📑 helper1.py
            📑 helper2.py
            📑 main.py

Actual result:

 📁 PycharmProjects
    📁 pythonProject
        📁 data_files_directory_1
            📁subdirectory_to_be_EXcluded
                📑 data_file_to_be_EXcluded.txt
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
            📑 index.html
        📁 data_files_directory_2
            📁subdirectory_to_be_EXcluded
                📑 data_file_to_be_EXcluded.txt
            📁subdirectory_to_be_INcluded
                 📑 data_file_to_be_INcluded.txt
            📑 index.html
        📁 src
            📑 __init__.py
            📑 constants.py
            📑 helper1.py
            📑 helper2.py
            📑 main.py

What I tried / Source code:

MANIFEST.in

...
graft data_files_directory_1
graft data_files_directory_2
...

setup.py

from setuptools import setup

setup(
    ...
    # include everything in MANIFEST.in:
    include_package_data=True,
    # ...but exclude just these directories */subdirectory_to_be_EXcluded/* from all packages
    exclude_package_data={"": ["*/subdirectory_to_be_EXcluded/*"]},
    ...
)

PROBLEM: As you can see, the exclusion request is being ignored.

I must confess that, after heavy use of Google, YouTube and the PyCharm documentation on setup.py and installers, I'm still not really clear what the correct way is to include and exclude NON-source directories and files. It seems like many of the possible solutions are deprecated!

What is the correct way to do this?

Can someone point me at some good working examples?


Answer 1:


Here is the solution that eventually worked.

I did remember to delete the old build and dist directories and I also made sure to delete all *.egg-info files as suggested by @jarcobi. But clearing out all the stale files alone was not enough to solve the problem.
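The cleanup step described above can be scripted. The following is only a minimal sketch, assuming the stale artifacts sit directly under the project root; the helper name `clean_build_artifacts` is hypothetical, and the demo runs against a throwaway directory so nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete build/, dist/ and any *.egg-info entries directly under project_root.

    Hypothetical helper for illustration; it does not search subdirectories.
    """
    removed = []
    candidates = [project_root / "build", project_root / "dist"]
    candidates += list(project_root.glob("*.egg-info"))
    for path in candidates:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
        elif path.exists():
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demo on a temporary directory that mimics a project with stale artifacts.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("build", "dist", "pythonProject.egg-info", "src"):
        (root / name).mkdir()
    removed = clean_build_artifacts(root)
    remaining = sorted(p.name for p in root.iterdir())
```

If the egg-info directory is generated somewhere other than the project root, the glob would need to be adjusted accordingly.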

What finally worked was to edit setup.py thusly:

from setuptools import find_packages, setup

setup(
    ...
    packages=find_packages(exclude=["*/subdirectory_to_be_EXcluded/*"]),
    # include everything in MANIFEST.in:
    include_package_data=True,
    # ...but exclude just these directories */subdirectory_to_be_EXcluded/* from all packages
    exclude_package_data={"": ["*/subdirectory_to_be_EXcluded/*"]},
    ...
)

and to edit MANIFEST.in thusly:

...
graft data_files_directory_1/subdirectory_to_be_INcluded
include data_files_directory_1/index.html
graft data_files_directory_2/subdirectory_to_be_INcluded
include data_files_directory_2/index.html
...

And now I am getting the desired file tree.
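One way to confirm the result without installing anything is to list the members of the built source archive and look for the directory that should be absent. This is a rough sketch only: the helper `find_unwanted_members`, the archive name and the version `0.0.0` are assumptions for illustration, and the demo builds a tiny stand-in archive rather than a real sdist.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def find_unwanted_members(archive_path, forbidden_dir_name):
    """Return archive members that have forbidden_dir_name as a path component."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
    return sorted(n for n in names if forbidden_dir_name in n.split("/"))


# Demo: a stand-in .tar.gz with one file that should not have been shipped.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "pythonProject-0.0.0.tar.gz"  # hypothetical name
    prefix = "pythonProject-0.0.0/data_files_directory_1/"
    members = [
        prefix + "index.html",
        prefix + "subdirectory_to_be_INcluded/data_file_to_be_INcluded.txt",
        prefix + "subdirectory_to_be_EXcluded/data_file_to_be_EXcluded.txt",
    ]
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in members:
            payload = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    unwanted = find_unwanted_members(archive_path, "subdirectory_to_be_EXcluded")
```

Pointing `find_unwanted_members` at the real archive in dist/ should return an empty list once the exclusion is working.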

Additional remarks: I'm actually still unclear exactly why these specific changes worked but other solutions I tried did not. But I'm now able to progress with my installation so I guess that is good enough, and I am closing this out as solved.
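A possible partial explanation, offered as an assumption rather than a verified diagnosis: the setuptools documentation describes the `exclude` argument of `find_packages` as a list of package names in which `*` acts as a wildcard (for example `foo.*`). If the patterns are compared against dotted names in that way, a path-style pattern containing `/` would never match, which would suggest the MANIFEST.in change did the real work. The sketch below uses only the standard library's `fnmatchcase` to illustrate that kind of matching; the package names are hypothetical.

```python
from fnmatch import fnmatchcase

# Dotted package names of the kind find_packages() reports (hypothetical examples).
package_names = ["src", "src.helpers", "src.subdirectory_to_be_EXcluded"]

path_style_pattern = "*/subdirectory_to_be_EXcluded/*"  # the pattern used in setup.py above
name_style_pattern = "*.subdirectory_to_be_EXcluded"    # a dotted-name alternative

# Glob-match each dotted name against each pattern.
path_style_hits = [n for n in package_names if fnmatchcase(n, path_style_pattern)]
name_style_hits = [n for n in package_names if fnmatchcase(n, name_style_pattern)]

print(path_style_hits)  # no dotted name contains "/", so nothing matches
print(name_style_hits)
```

Note that the data directories in the question contain no `__init__.py`, so they would not be reported as packages in the first place.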

A request: I would like to ask those in the community who write documentation and how-to guides, or who create instructional videos, to come out with less ambiguous and confusing explanations, with many more working cookbook examples.

Areas for improvement: One specific area where I am constantly confused is where one document says an operator operates on "packages" while another indicates the operator works on "directories".

The confusion is exacerbated because sometimes the word "package" is used to mean "only directories that have an __init__.py file in them".

That word choice of "package" would seem to indicate that those operators would be inappropriate for use on any data directories that do not contain an __init__.py file.

And indeed, in some cases operators do seem limited to only Python package directories. Yet some operators do appear to work on any subdirectory, even those that do not contain an __init__.py file. Yet some authors refer to them operating on "packages" when "directories" would be less misleading.

Finally, on top of this, "package" can sometimes just mean the install tar.gz files created by setup.py sdist, or the .whl files created by setup.py bdist_wheel.
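To make the narrowest of these meanings concrete, here is a small sketch that walks a project tree and labels each directory as a "package" (it contains an `__init__.py`) or a "plain directory". The helper `classify_directories` is hypothetical and deliberately ignores namespace packages, which are importable without an `__init__.py`; the demo tree borrows names from the question.

```python
import tempfile
from pathlib import Path


def classify_directories(project_root):
    """Map each subdirectory (relative POSIX path) to 'package' or 'plain directory'."""
    result = {}
    for path in sorted(project_root.rglob("*")):
        if path.is_dir():
            has_init = (path / "__init__.py").is_file()
            kind = "package" if has_init else "plain directory"
            result[path.relative_to(project_root).as_posix()] = kind
    return result


# Demo tree: one importable package and one nested data directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "__init__.py").write_text("")
    data_dir = root / "data_files_directory_1" / "subdirectory_to_be_INcluded"
    data_dir.mkdir(parents=True)
    (data_dir / "data_file_to_be_INcluded.txt").write_text("demo")
    kinds = classify_directories(root)
```

In this tree only `src` counts as a package in the narrow sense; both data directories are plain directories.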

It would help enormously if someone could create an authoritative explanation of which setuptools or MANIFEST.in operators work (and which don't work) on any directory, and which only work on directories with __init__.py files.

Dear Reader, are you our hero?

Anyone who attempts such explanations and successfully avoids falling into this confusing thicket of multiple different meanings of "package" would be doing quite a valuable service to the community.

Are you that hero in the wings ready to take up that Herculean Labor?



Source: https://stackoverflow.com/questions/63477061/whats-the-right-syntax-for-controlling-inclusion-and-exclusion-of-data-director
