Jupyter notebooks batch execution
Author: Sébastien Gardoll
Keywords: cluster jupyter notebook batch nbconvert
Basic batch execution of notebooks
Jupyter comes with a very useful command: jupyter-nbconvert (see here for another kind of installation). This command converts a notebook file (*.ipynb) into another format (Python script, PDF, HTML, LaTeX, reStructuredText, etc.) and, above all, it can execute a notebook (for a list of all formats, type
The nbconvert command is particularly useful if you want to run a notebook as a batch job on the mesocenter clusters. Since cluster job schedulers only run scripts, nbconvert is a real asset: it saves you from maintaining a scripted version of your notebook just for its execution on the clusters (a duplicate that inevitably drifts out of sync with the notebook).
The basic command line is simple (for the options, visit this page, and visit this page for the templating system). Assuming the command is available in your module environment, the example below runs the notebook named myfile.ipynb and saves the result in a file named myfile.nbconvert.ipynb. More information at this page.
jupyter-nbconvert --to notebook --ExecutePreprocessor.timeout=-1 --execute myfile.ipynb
All you have to do is integrate this command line into a bootstrap Bash script and submit it as a batch job to the cluster scheduler.
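--ExecutePreprocessor.timeout=-1 disables the cell execution timeout, whose default value is 30 seconds in older nbconvert releases (recent releases, which delegate execution to nbclient, apply no timeout by default).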
If you want the result to be saved directly in the notebook, add the --inplace option. But it is strongly recommended to use the --clear-output option instead: it also overwrites the notebook in place, but it clears the existing outputs first, which avoids mixing the result with the previous one.
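The bootstrap script mentioned above could look like the following minimal sketch. The function name run_notebook and the DRY_RUN variable are made up for this illustration; scheduler directives and the submission command depend on your cluster and are deliberately left out.

```shell
#!/bin/bash
# Hypothetical bootstrap script (for instance run_notebook.sh).
# Scheduler directives are site-specific and are not shown here.

# Execute a notebook with nbconvert, cell timeout disabled.
# Assumption: DRY_RUN is a variable invented for this sketch; when it is
# set to 1, the command line is printed instead of being executed.
run_notebook() {
    notebook="$1"
    # Rebuild the positional parameters of the function as the command line.
    set -- jupyter-nbconvert --to notebook \
        --ExecutePreprocessor.timeout=-1 \
        --execute "$notebook"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run only when a notebook path is given on the command line.
if [ "$#" -ge 1 ]; then
    run_notebook "$1"
fi
```

Setting DRY_RUN=1 before submitting is a cheap way to check the exact command line that the job would run.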
Conditional execution of notebook cells
Conditional execution of notebook cells is possible with the nbconvert command (see above for its introduction). The first step is to mark the cells that you want to skip during the execution of the notebook, using the Jupyter marking system (cell tags):
Save the notebook, then run it with the following command line:
jupyter-nbconvert --to notebook --ExecutePreprocessor.timeout=-1 --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags 'NBCONVERT_IGNORED' --execute myfile.ipynb
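Never use the --inplace or --clear-output options with this command line unless you want to lose the marked cells forever: both options overwrite the original notebook, from which the marked cells have been removed.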
The name of the marker (NBCONVERT_IGNORED here) does not matter, but it must be the same in the notebook and on the command line.
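One way to keep the marker consistent is to define it once in the bootstrap script and to check that it actually occurs in the notebook before running it. The sketch below does this under a few assumptions: the names IGNORE_TAG, notebook_has_tag and run_without_tagged_cells are invented for the illustration, and the check is a plain text search in the notebook file rather than a real JSON parse, so it is only a sanity check.

```shell
#!/bin/bash
# Assumption: IGNORE_TAG is a variable name invented for this sketch.
IGNORE_TAG="NBCONVERT_IGNORED"

# Succeed if the tag appears as a quoted string in the notebook file.
# $1: notebook file, $2: tag. Plain text search, not a JSON parse.
notebook_has_tag() {
    grep -qF -- "\"$2\"" "$1"
}

# Execute the notebook, leaving out the cells marked with $IGNORE_TAG.
run_without_tagged_cells() {
    if ! notebook_has_tag "$1" "$IGNORE_TAG"; then
        echo "warning: no cell tagged $IGNORE_TAG found in $1" >&2
    fi
    jupyter-nbconvert --to notebook \
        --ExecutePreprocessor.timeout=-1 \
        --TagRemovePreprocessor.enabled=True \
        --TagRemovePreprocessor.remove_cell_tags "$IGNORE_TAG" \
        --execute "$1"
}

# Run only when a notebook path is given on the command line.
if [ "$#" -ge 1 ]; then
    run_without_tagged_cells "$1"
fi
```

Because the marker lives in a single variable, renaming it only requires changing the script in one place (and retagging the cells in the notebook).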
Nbconvert has many options. Combined with the Jupyter marking system, they make it possible to emulate conditional code generation (a bit like the C language preprocessor).
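As a rough illustration of this preprocessor-like usage, the sketch below prints the command lines that would generate two variants of the same notebook, each one leaving out a different set of marked cells. The tags DEBUG_ONLY and CLUSTER_ONLY, the variant names and the function variant_cmd are hypothetical; only the nbconvert options themselves come from the tool.

```shell
#!/bin/bash
# Assumption: the tags, the variant names and variant_cmd are made up
# for this sketch.

# Print the command that generates one variant of a notebook.
# $1: notebook, $2: variant name, $3: tag of the cells to leave out.
variant_cmd() {
    echo "jupyter-nbconvert --to notebook" \
         "--TagRemovePreprocessor.enabled=True" \
         "--TagRemovePreprocessor.remove_cell_tags $3" \
         "--output ${1%.ipynb}_$2 $1"
}

# The cluster variant drops the debug cells, and conversely.
variant_cmd myfile.ipynb cluster DEBUG_ONLY
variant_cmd myfile.ipynb debug CLUSTER_ONLY
```

The commands are only printed here; adding --execute to them would also run each generated variant. The --output option keeps every variant in its own file, so the original notebook and its marked cells stay untouched.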