I have a directory structure similar to the following:

meta_project
    project1
        __init__.py
        lib
            module.py
            __init__.py
I had almost the same example as you in this notebook, where I wanted to illustrate the usage of an adjacent module's function in a DRY manner.
My solution was to tell Python about that additional module import path by adding a snippet like this one to the notebook:
import os
import sys

module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
This allows you to import the desired function from the module hierarchy:
from project1.lib.module import function
# use the function normally
function(...)
Note that it is necessary to add empty __init__.py files to the project1/ and lib/ folders if you don't have them already.
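If you would rather create those files from Python than by hand, a minimal sketch (assuming you run it from meta_project/ with the structure shown above) could be:

from pathlib import Path

# Create the empty __init__.py files if they are missing;
# the paths assume the layout shown at the top of this answer.
for pkg in ("project1", "project1/lib"):
    (Path(pkg) / "__init__.py").touch(exist_ok=True)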
I came here searching for best practices for abstracting code into submodules when working in notebooks. I'm not sure that there is a best practice, but this is what I have been proposing.
A project hierarchy like this:
├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py
And from 20170609-Initial_Database_Connection.ipynb:
In [1]: cd ..
In [2]: from lib.postgres import database_connection
This works because by default the Jupyter Notebook can parse the cd command. Note that this does not make use of Python Notebook magic. It simply works without prepending %bash.
Considering that 99 times out of 100 I am working in Docker using one of the Project Jupyter Docker images, the following modification is idempotent:
In [1]: cd /home/jovyan
In [2]: from lib.postgres import database_connection
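If you would rather not rely on Jupyter's automagic handling of a bare cd, a rough equivalent (my sketch, not part of the original answer) is to change the working directory from plain Python:

import os

# Change the working directory from Python; the absolute path keeps the cell
# idempotent, just like the bare `cd /home/jovyan` above.
os.chdir("/home/jovyan")
from lib.postgres import database_connection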
I have found that python-dotenv helps solve this issue pretty effectively. Your project structure ends up changing slightly, but the code in your notebook is a bit simpler and consistent across notebooks.
For your project, do a little install.
pipenv install python-dotenv
Then, your project changes to:
├── .env (this can be empty)
├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py
And finally, your import changes to:
import os
import sys
from dotenv import find_dotenv
sys.path.append(os.path.dirname(find_dotenv()))
A +1 for this package is that your notebooks can be several directories deep. python-dotenv will find the closest one in a parent directory and use it. A +2 for this approach is that jupyter will load environment variables from the .env file on startup. Double whammy.
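If you also want to read values from that .env file in your code, here is a hedged sketch with python-dotenv (DB_HOST is just a hypothetical variable name, not something defined by the project above):

import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())      # walks up parent directories to the nearest .env
db_host = os.getenv("DB_HOST")  # hypothetical key; returns None if it is not set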
I have just found this neat solution:
import sys; sys.path.insert(0, '..')  # add the parent folder path, where the lib folder is
import lib.store_load  # store_load is a file in my library folder
If you just want some functions from that file:
from lib.store_load import your_function_name
If your Python version is >= 3.3, you do not need an __init__.py file in the folder.
So far, the accepted answer has worked best for me. However, my concern has always been that there is a likely scenario where I might refactor the notebooks directory into subdirectories, requiring me to change the module_path in every notebook. I decided to add a Python file within each notebook directory to import the required modules.
Thus, having the following project structure:
project
|__ notebooks
    |__ explore
        |__ notebook1.ipynb
        |__ notebook2.ipynb
        |__ project_path.py
    |__ explain
        |__ notebook1.ipynb
        |__ project_path.py
|__ lib
    |__ __init__.py
    |__ module.py
I added the file project_path.py in each notebook subdirectory (notebooks/explore and notebooks/explain). This file contains the code for relative imports (from @metakermit):
import sys
import os

module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)
This way, I just need to do the relative imports within the project_path.py file, and not in the notebooks. The notebook files would then just need to import project_path before importing lib. For example in 0.0-notebook.ipynb:
import project_path
import lib
The caveat here is that reversing the imports would not work. THIS DOES NOT WORK:
import lib
import project_path
Thus care must be taken with the import order: project_path must be imported first so that the parent directory is already on sys.path when lib is looked up.
Here's my 2 cents:
import sys
sys.path.append('/Users/John/Desktop')

import mapping  # mapping.py is the name of my module file
shipit = mapping.Shipment()  # Shipment is the name of the class I need to use in the mapping module

# Or import the class directly so you don't have to use the dot notation:
from mapping import Shipment
shipit = Shipment()