I have a Jupyter notebook (python3) which is a batch job -- it runs three separate python3 notebooks using %run
. I want to invoke a fourth Jupyter R-kernel note
I was able to use the answer to implement two solutions to running an R notebook from a python3 notebook.
!
shell commandAdding a simple !
shell command to the python3 notebook:
!jupyter nbconvert --to notebook --execute r.ipynb
So the notebook looks like this:
%run '1_py3.ipynb'
%run '2_py3.ipynb'
%run '3_py3.ipynb'
!jupyter nbconvert --to notebook --execute 4_R.ipynb
This seems simple and easy to use.
Add this to a cell in the batch notebook:
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor
rnotebook = "r.ipynb"
rnotebook_out = "r_out.ipynb"
rnotebook_path = '/home/jovyan/work/'
with open(rnotebook) as f1:
nb1 = nbformat.read(f1, as_version=4)
ep_R = ExecutePreprocessor(timeout=600, kernel_name='ir')
ep_R.preprocess(nb1, {'metadata': {'path': rnotebook_path}})
with open(rnotebook_out, "wt") as f1:
nbformat.write(nb1, f1)
This is based on the answer from Louise Davies (which is based on the nbcovert docs example), but it only processes one file -- the non-R files can be processed in separate cells with %run
.
If the batch notebook is in the same folder as the notebook it is executing then the path variable can be set with the %pwd
shell magic, which returns the path of the batch notebook.
When we use nbformat.write we choose between replacing the original notebook (which is convenient and intuitive, but could corrupt or destroy the file) and creating a new file for output. A third option if the cell output isn't needed (e.g. in a workflow that manipulates files and writes logs) is to just ignore writing the cell output entirely.
One drawback to both methods is that they do not pipe cell results back into the master notebook display -- as opposed to the way that %run
displays the output of a notebook in its result cell. The !jupyter nbconvert
method appears to show stdout from nbconvert, while the import nbconvert
method showed me nothing.
I don't think you can use the %run
magic command that way as it executes the file in the current kernel.
Nbconvert has an execution API that allows you to execute notebooks. So you could create a shell script that executes all your notebooks like so:
#!/bin/bash
jupyter nbconvert --to notebook --execute 1_py3.ipynb
jupyter nbconvert --to notebook --execute 2_py3.ipynb
jupyter nbconvert --to notebook --execute 3_py3.ipynb
jupyter nbconvert --to notebook --execute 4_R.ipynb
Since your notebooks require no shared state this should be fine. Alternatively, if you really wanna do it in a notebook, you use the execute Python API to call nbconvert from your notebook.
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor
with open("1_py3.ipynb") as f1, open("2_py3.ipynb") as f2, open("3_py3.ipynb") as f3, open("4_R.ipynb") as f4:
nb1 = nbformat.read(f1, as_version=4)
nb2 = nbformat.read(f2, as_version=4)
nb3 = nbformat.read(f3, as_version=4)
nb4 = nbformat.read(f4, as_version=4)
ep_python = ExecutePreprocessor(timeout=600, kernel_name='python3')
#Use jupyter kernelspec list to find out what the kernel is called on your system
ep_R = ExecutePreprocessor(timeout=600, kernel_name='ir')
# path specifies which folder to execute the notebooks in, so set it to the one that you need so your file path references are correct
ep_python.preprocess(nb1, {'metadata': {'path': 'notebooks/'}})
ep_python.preprocess(nb2, {'metadata': {'path': 'notebooks/'}})
ep_python.preprocess(nb3, {'metadata': {'path': 'notebooks/'}})
ep_R.preprocess(nb4, {'metadata': {'path': 'notebooks/'}})
with open("1_py3.ipynb", "wt") as f1, open("2_py3.ipynb", "wt") as f2, open("3_py3.ipynb", "wt") as f3, open("4_R.ipynb", "wt") as f4:
nbformat.write(nb1, f1)
nbformat.write(nb2, f2)
nbformat.write(nb3, f3)
nbformat.write(nb4, f4)
Note that this is pretty much just the example copied from the nbconvert execute API docs: link