Jupyter: notebook of code

Many of you probably know this software better than I do :) but let’s spend 5 minutes together to make sure we are all on the same page.

Jupyter notebook (formerly known as IPython notebook) was developed around the concept of literate programming, and it has become extremely popular within the last several years (article on Nature). As its name (“notebook”) suggests, it is designed for users to program with readability and modularization in mind.

In a notebook, each block of code is executed within a cell. All cells in the same notebook live within the same process and share one namespace. You can put an explanation of the code (or beyond, like the physics for which the code is developed) in a markdown cell, and you can give each code cell its own explanation. We don’t write a software package in a notebook, but usually a higher-level program (such as a main function) with an explanation of what’s going on.

Though originally developed for Julia, Python and R (hence “Jupyter”), it now supports all kinds of programming languages, including shell and even Fortran(!). Many scientific blog posts are also written in notebooks, and there are blogs that teach you about Jupyter (like this one).

Default code engine

When you start a notebook, you choose its default backend (the kernel). Pick Python3!

import numpy as np

Note that the state of the process is shared among cells. What I just imported is accessible in the next cell.

print(np)
np='foo' # after you execute this cell, np no longer points to numpy module
print(np)
<module 'numpy' from '/usr/local/lib/python3.8/dist-packages/numpy/__init__.py'>
foo

… which means the execution order of cells matters (yes, you can execute cells in whatever order you want). The object np no longer points to the numpy module.
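Here is the same effect as a plain Python sketch, in case you want to see how to recover. It uses the standard math module instead of numpy (my substitution, just so it runs anywhere), and the name saved is only there for the demonstration. Re-running the import rebinds the name to the very same module object.

```python
import math

# "Cell 1": the name `math` refers to the module.
print(math.sqrt(16.0))

# "Cell 2": keep a reference, then rebind the name to a string.
saved = math
math = "foo"
print(type(math).__name__)   # the name now points at a str, not a module

# Re-running the import cell rebinds the name to the same module object.
import math
print(math is saved)
```

In a notebook, the equivalent fix is simply to re-run the cell that does the import (or restart the kernel and run everything from the top).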

Shell commands

You can run shell commands with ! in front.

!ls $HOME
 BNB_numu.ipynb				   Vertex_Resolution.ipynb
 Chain_Performance.ipynb		   Visualization.ipynb
 Chain_Performance_2022.ipynb		   Visualization_Doublets.ipynb
 Chain_Performance_Plots.ipynb		   Visualization_DuneND_Workshop.ipynb
 Chain_Training.ipynb			   Visualization_GNNML_Talk_2022.ipynb
 Chain_Training_Check.ipynb		   Visualization_Icarus_Data.ipynb
 Chain_Training_Check_Doublets.ipynb	   Visualization_ME.ipynb
 Chain_Training_ME.ipynb		   Visualization_ME_Ghost.ipynb
 Data_MC_Comparisons.ipynb		   Workshop_ICARUS_Michel.ipynb
 Dataset_Check_AngularDistribution.ipynb   Workshop_ICARUS_Muons.ipynb
 Dataset_Neutrino_Check.ipynb		   Workshop_ICARUS_Neutrino.ipynb
 Datasets.ipynb				   Workshop_Michel.ipynb
 Datasets_MPV_Check.ipynb		   Workshop_Muon.ipynb
 DebugTrainVal.ipynb			   Workshop_Shower_dEdx.ipynb
 Debug_Justin.ipynb			   apdata.pl
 Fails.ipynb				   cpu_bug.py
 Ghost_Statistics.ipynb			   dkoh0207
 GraphSPICE_Inference.ipynb		   felix_lartpc_mlreco3d
 GraphSpice_Edge_Threshold.ipynb	   larcv.root
 Icarus_ACPT_Muons.ipynb		   larcv1.root
 Icarus_Michel_electrons.ipynb		   larcv2
 Icarus_Michel_electrons_Updated.ipynb	   larcv2.root
 Icarus_Muon_Residual_Range.ipynb	   larcv_mpvmpr.root
 Icarus_Muons_Lifetime.ipynb		   larcv_nue_ccqe_v00
 Icarus_Shower_dEdx.ipynb		   larcv_nue_ccqe_v01
 Icarus_Stopping_Muons.ipynb		   larcv_nue_ccqe_v02
 Icarus_Topology_Statistics_Study.ipynb    larcv_nue_ccqe_v04
 ME_CPU_Bug.ipynb			   larcv_nue_v01
 Metrics.ipynb				   larcv_nue_v02
 Metrics_Chain_Ghost.ipynb		   larcv_run5507.root
 Metrics_Page.ipynb			   lartpc_mlreco3d
 MinkowskiEngine			   lartpc_mlreco3d_tutorials
 Misc.ipynb				   log
 Nue.ipynb				   log_trash
 Nue_Energy_Study.ipynb			   michel_flash.csv
 Nue_Selection.ipynb			   michel_flash_muplus_062022_v01_0.csv
 Nue_Selection_Update.ipynb		   mpvmpr_062021_v00
 Output_Chain_Ghost.ipynb		   muE_liquid_argon.txt
 PMT_Michel.ipynb			   ondemand
 Sofia_dEdx.ipynb			   outreach
 Test.ipynb				   protons.csv
'Untitled Folder'			   rain.csv
 Untitled.ipynb				   run_nue.txt
 Untitled1.ipynb			   run_nue_largeant.txt
 Untitled2.ipynb			   sdfgroup
 Untitled3.ipynb			   slacml-kmi2020
 Untitled4.ipynb			   stopping_muons_chi2.csv
 Untitled5.ipynb			   stopping_muons_chi2_data.csv
 Untitled6.ipynb			   weights
 Untitled7.ipynb			   weights_trash

So if you want to install another python module and feel lazy, you can just execute !pip install --user whatever within a cell.
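If you want the output of a shell command as a Python object rather than as printed text, what ! does can be roughly sketched with the standard subprocess module. The helper name run_shell is hypothetical, and this is only an approximation of the behavior, not how IPython implements it.

```python
import subprocess
from typing import List


def run_shell(command: str) -> List[str]:
    """Run `command` in a child shell and return its stdout as a list of lines."""
    completed = subprocess.run(
        command,
        shell=True,           # let the shell parse the command, like `!` does
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.splitlines()


lines = run_shell("echo hello")
print(lines)
```

Inside a notebook you can get a similar list directly by assigning the result of a ! command, for example files = !ls $HOME.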

Different language

You can switch to a different language within a cell by starting the cell with a %% magic (such as %%bash), provided the language is supported by your environment. The software container we use doesn’t have many options, but we got bash ;)

%%bash
ls $HOME
BNB_numu.ipynb
Chain_Performance.ipynb
Chain_Performance_2022.ipynb
Chain_Performance_Plots.ipynb
Chain_Training.ipynb
Chain_Training_Check.ipynb
Chain_Training_Check_Doublets.ipynb
Chain_Training_ME.ipynb
Data_MC_Comparisons.ipynb
Dataset_Check_AngularDistribution.ipynb
Dataset_Neutrino_Check.ipynb
Datasets.ipynb
Datasets_MPV_Check.ipynb
DebugTrainVal.ipynb
Debug_Justin.ipynb
Fails.ipynb
Ghost_Statistics.ipynb
GraphSPICE_Inference.ipynb
GraphSpice_Edge_Threshold.ipynb
Icarus_ACPT_Muons.ipynb
Icarus_Michel_electrons.ipynb
Icarus_Michel_electrons_Updated.ipynb
Icarus_Muon_Residual_Range.ipynb
Icarus_Muons_Lifetime.ipynb
Icarus_Shower_dEdx.ipynb
Icarus_Stopping_Muons.ipynb
Icarus_Topology_Statistics_Study.ipynb
ME_CPU_Bug.ipynb
Metrics.ipynb
Metrics_Chain_Ghost.ipynb
Metrics_Page.ipynb
MinkowskiEngine
Misc.ipynb
Nue.ipynb
Nue_Energy_Study.ipynb
Nue_Selection.ipynb
Nue_Selection_Update.ipynb
Output_Chain_Ghost.ipynb
PMT_Michel.ipynb
Sofia_dEdx.ipynb
Test.ipynb
Untitled Folder
Untitled.ipynb
Untitled1.ipynb
Untitled2.ipynb
Untitled3.ipynb
Untitled4.ipynb
Untitled5.ipynb
Untitled6.ipynb
Untitled7.ipynb
Vertex_Resolution.ipynb
Visualization.ipynb
Visualization_Doublets.ipynb
Visualization_DuneND_Workshop.ipynb
Visualization_GNNML_Talk_2022.ipynb
Visualization_Icarus_Data.ipynb
Visualization_ME.ipynb
Visualization_ME_Ghost.ipynb
Workshop_ICARUS_Michel.ipynb
Workshop_ICARUS_Muons.ipynb
Workshop_ICARUS_Neutrino.ipynb
Workshop_Michel.ipynb
Workshop_Muon.ipynb
Workshop_Shower_dEdx.ipynb
apdata.pl
cpu_bug.py
dkoh0207
felix_lartpc_mlreco3d
larcv.root
larcv1.root
larcv2
larcv2.root
larcv_mpvmpr.root
larcv_nue_ccqe_v00
larcv_nue_ccqe_v01
larcv_nue_ccqe_v02
larcv_nue_ccqe_v04
larcv_nue_v01
larcv_nue_v02
larcv_run5507.root
lartpc_mlreco3d
lartpc_mlreco3d_tutorials
log
log_trash
michel_flash.csv
michel_flash_muplus_062022_v01_0.csv
mpvmpr_062021_v00
muE_liquid_argon.txt
ondemand
outreach
protons.csv
rain.csv
run_nue.txt
run_nue_largeant.txt
sdfgroup
slacml-kmi2020
stopping_muons_chi2.csv
stopping_muons_chi2_data.csv
weights
weights_trash
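Conceptually, a %%bash cell hands the whole cell body to a bash process as one script. Here is a minimal sketch of that idea with subprocess, assuming bash is installed on your machine. The script content is made up for the demonstration, and it shows that variables set on one line are visible on later lines of the same cell.

```python
import subprocess

# The whole "cell body" is sent to bash as a single script on stdin.
script = """
count=3
echo "hello from bash"
echo $count
"""

completed = subprocess.run(
    ["bash"],
    input=script,
    capture_output=True,
    text=True,
    check=True,
)
print(completed.stdout)
```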

Modifying shell environment

Another thing we often feel lazy about is changing a shell environment variable. You can do this within the notebook as shown below (or stop the notebook, change the environment variable, then restart the notebook … not good for lazy people).

%env CUDA_VISIBLE_DEVICES=0
env: CUDA_VISIBLE_DEVICES=0

Now try:

! echo $CUDA_VISIBLE_DEVICES
0

Voila! (you should see 0) … but note, the following is not the same:

! export CUDA_VISIBLE_DEVICES=1

Let’s check:

! echo $CUDA_VISIBLE_DEVICES
0

You should get 0, and that’s because ! export executes the command in a sub-shell, so the change is visible only inside that sub-shell, which exits as soon as the command finishes.
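The difference can be reproduced in plain Python. This is a sketch that assumes a POSIX shell is available. Setting os.environ changes the notebook process itself (roughly what %env does), while an export inside a child shell dies with that child. The variable names parent_value and child_value are only for illustration.

```python
import os
import subprocess

# Roughly what `%env CUDA_VISIBLE_DEVICES=0` does: modify this process's environment.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Like `! export CUDA_VISIBLE_DEVICES=1`: the export happens in a child shell
# and is lost when that shell exits.
subprocess.run("export CUDA_VISIBLE_DEVICES=1", shell=True, check=True)

# The parent process is unaffected ...
parent_value = os.environ["CUDA_VISIBLE_DEVICES"]

# ... and a new child shell inherits the parent's value, not the exported one.
child_value = subprocess.run(
    "echo $CUDA_VISIBLE_DEVICES",
    shell=True,
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

print(parent_value, child_value)
```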

Run a python script

%run is a handy command to run a python script (or even a jupyter notebook, actually!). Let’s create a simple python script.

! echo "print('hello world!')" > hello.py

and run it (note the single >, which overwrites the file; with >> every re-run of the cell above would append one more print line to hello.py):

%run hello.py
hello world!
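Outside a notebook, the closest standard-library relative of %run is runpy.run_path, which also executes a file as if it were the main program and returns the variables it defined. Below is a small sketch. The temporary directory and the variable greeting are my additions for the demonstration, not part of the original hello.py.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello.py"
    # write_text overwrites the file, so re-running never duplicates lines.
    script.write_text("greeting = 'hello world!'\nprint(greeting)\n")

    # Execute the script with __name__ == "__main__", similar in spirit to %run.
    namespace = runpy.run_path(str(script), run_name="__main__")

# Names defined by the script come back as a dictionary.
print(namespace["greeting"])
```

With %run, the names defined by the script end up directly in the notebook namespace instead of in a returned dictionary.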

Time your program

Suppose you have a code cell and want to measure how long it takes to run. Sure, you can add such a profiling feature to your code, but here’s how you can do it in the notebook using %%time.

%%time
import numpy as np
total = np.sum(np.ones([1000,1000],dtype=np.float32))  # named total to avoid shadowing the built-in sum
CPU times: user 3.12 ms, sys: 3.09 ms, total: 6.21 ms
Wall time: 5.27 ms
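If you need the same numbers outside a notebook, a hand-rolled version might look like the sketch below. The helper name timed is hypothetical. It uses time.perf_counter for wall time and time.process_time for CPU time (user and sys combined, whereas %%time reports them separately), and a pure-Python workload so it runs without numpy.

```python
import time


def timed(func):
    """Call func() and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # user + sys CPU time of this process
    result = func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


result, wall, cpu = timed(lambda: sum(range(1_000_000)))
print(f"result={result}, wall={wall * 1e3:.2f} ms, cpu={cpu * 1e3:.2f} ms")
```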

Latex

Jupyter supports MathJax, and you can also just run LaTeX in a cell:

%%latex

I have to write a formula like $\sin^2 2\theta \, \sin^2\left(\frac{1.27\,\Delta m^2 L}{E_\nu}\right)$ to get Ph.D.

Executing a web script

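Well, all of these fun things run on JavaScript in an HTML page… so can we execute our own web script? Sure we can! Here’s an example that uses that feature to toggle the visibility of all Jupyter code blocks. (It relies on jQuery and the div.input class, so it assumes the classic Notebook interface.)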

from IPython.display import HTML
HTML('''<script>
code_show=false; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')
To toggle on/off the raw code, click here.