Exercise 2: Muon stopping power¶¶

The stopping power of a particle usually refers to the energy loss rate \(dE/dx\) when it passes through matter. When charged particles travel through our LArTPC detector, they interact with the argon and lose energy.

Note

Currently this tutorial only computes dQ/dx distribution. Need refresh to make a 2D plot of dQ/dx vs residual range.

MIP and Bragg peak¶¶

Minimally ionizing particles (MIP) are charged particles which lose energy when passing through matter at a rate close to minimal. Particles such as muons often have energy losses close to the MIP level and are treated in practice as MIP. The only exception is when the muon comes to a stop and experiences a Bragg peak.

Fig. 12 Example of muon Bragg peak. The muon is travelling from bottom left to top right. The color scale represents energy deposition. Red means more energy deposited. The sudden increase in deposited (lost) energy towards the end of the muon trajectory is the Bragg peak. From MicroBooNE (arxiv: 1704.02927)¶

I. Motivation¶

We know that the energy loss rate of a MIP in argon is ~2 MeV/cm. Hence our goal is to carefully isolate the MIP-like sections of muons (i.e. leaving out the ends of the track), and compute the (reconstructed) energy loss along these trajectories \(dQ/dx\). This can inform the detector calibration effort, for example, since we can compare the peak of the \(dQ/dx\) histogram with the theoretical expected value (although there are many detector effects that make this not straightforward). We can also study the spatial uniformity of the detector by looking at MIP \(dQ/dx\) values in different regions of the detector, etc. If we plot the dQ/dx change along the so-called “residual range” (i.e. distance to the end of the muon trajectory), we get a characteristic plot (due to the Bragg peak). In this tutorial we will focus on reproducing this plot.

Fig. 13 Example of what we expect for a muon dQ/dx versus residual range 2d histogram. The sudden increase in deposited (lost) energy towards the end of the muon trajectory (= low residual range) is the Bragg peak. From MicroBooNE (arxiv: 2010.02390)¶

II. Setup¶

Again, we start by setting our working environment. Some necessary boilerplate code:

a. Software and data directory¶

import os, sys
SOFTWARE_DIR = '%s/lartpc_mlreco3d' % os.environ.get('HOME')
DATA_DIR = os.environ.get('DATA_DIR')
# Set software directory
sys.path.append(SOFTWARE_DIR)

b. Numpy, Matplotlib, and Plotly for Visualization and data handling.¶

import numpy as np
import pandas as pd
import yaml

from plotly.offline import init_notebook_mode, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=False)

c. MLRECO specific imports for model loading and configuration setup¶

The imports below load some auxiliary functions and classes required to run the full chain in interactive mode.

import warnings
warnings.filterwarnings('ignore')
from mlreco.main_funcs import process_config, prepare

Let’s load the configuration file inference.cfg to setup the full chain architecture and weights. This file uses the keyword DATA_DIR to symbolize the path to validation set and checkpoint file. We need to replace it with the actual location defined previously.

cfg = yaml.load(open('%s/inference.cfg' % DATA_DIR, 'r').read().replace('DATA_DIR', DATA_DIR),Loader=yaml.Loader)

To keep things simple we will ask the chain to stay quiet and we can load a bigger dataset than the 100-events default one:

cfg['model']['modules']['chain']['verbose'] = False
# cfg['iotool']['dataset']['data_keys'] = ['/sdf/group/neutrino/ldomine/mpvmpr_012022/test.root']

So far cfg was simply a dictionary loaded from a YAML configuration file. It needs to be consumed by a helper function process_config to be ready for the full chain usage:

process_config(cfg, verbose=False)

Config processed at: Linux tur012 3.10.0-1160.42.2.el7.x86_64 #1 SMP Tue Sep 7 14:49:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$CUDA_VISIBLE_DEVICES="0"

d. Initialize and load weights to model using Trainer.¶

This next cell loads the dataset and the model to the notebook environment. hs stands for “handlers”. It contains various useful access points, such as the I/O iterator hs.data_io_iter to directly access the dataset or the trainer instance hs.trainer which will enable us to actually run the network.

# prepare function configures necessary "handlers"
hs = prepare(cfg)
# Optionally, you can specifiy a list of images by listing the image entry ID numbers:
# hs = prepare(cfg, event_list=[0, 1, 2])
dataset = hs.data_io_iter

Welcome to JupyROOT 6.22/09

Loading file: /sdf/home/l/ldomine/lartpc_mlreco3d_tutorials/book/data/mpvmpr_062022_test_small.root
Loading tree sparse3d_reco
Loading tree sparse3d_reco_chi2
Loading tree sparse3d_reco_hit_charge0
Loading tree sparse3d_reco_hit_charge1
Loading tree sparse3d_reco_hit_charge2
Loading tree sparse3d_reco_hit_key0
Loading tree sparse3d_reco_hit_key1
Loading tree sparse3d_reco_hit_key2
Loading tree sparse3d_pcluster_semantics_ghost
Loading tree cluster3d_pcluster
Loading tree particle_pcluster
Loading tree particle_mpv
Loading tree sparse3d_pcluster_semantics
Loading tree sparse3d_pcluster
Loading tree particle_corrected
Found 101 events in file(s)

Warning in <TClass::Init>: no dictionary for class larcv::EventNeutrino is available
Warning in <TClass::Init>: no dictionary for class larcv::NeutrinoSet is available
Warning in <TClass::Init>: no dictionary for class larcv::Neutrino is available

Freezing 82 weights for a sub-module ppn

Freezing 141 weights for a sub-module uresnet_lonely

Freezing 141 weights for a sub-module uresnet_deghost

Freezing 146 weights for a sub-module graph_spice

Freezing 120 weights for a sub-module grappa_track

Freezing 120 weights for a sub-module grappa_shower

Restoring weights for  from /sdf/home/l/ldomine/lartpc_mlreco3d_tutorials/book/data/weights_full_mpvmpr_062022.ckpt...

Done.

As usual, the model is now ready to be used (check for successful weight loading). Let’s do one forward iteration to retrieve a handful of events.

data, result = hs.trainer.forward(dataset)

III. Setup the analysis tools¶

The particles in a given image are classified into one of five particle types: photon \(\gamma\), electron \(e\), muon \(\mu\), pion \(\pi\), and proton \(p\). Obtaining particles from the full chain is quite simple: we initialize the FullChainEvaluator for this batch of events and examine the particle composition through the get_particles (for true particles, get_true_particles) method.

from analysis.classes.ui import FullChainEvaluator

# Only run this cell once!
predictor = FullChainEvaluator(data, result, cfg, deghosting=True)

entry = 7    # Batch ID for current sample

IV. Visualize true and predicted particles¶

Let’s first import plotting functions from lartpc_mlreco3d for easier visualization:

from mlreco.visualization import scatter_points, scatter_cubes, plotly_layout3d
from mlreco.visualization.plotly_layouts import white_layout, trace_particles, trace_interactions, dualplot

The evaluator will handle ghost masking internally, so all you have to do is retrieve the predicted and true semantic segmentation labels for visualization and analysis:

pred = predictor.get_predicted_label(entry, 'segment')
truth = predictor.get_true_label(entry, 'segment')
points = predictor.data_blob['input_data'][entry]
# Check if dimensions agree
assert (pred.shape[0] == truth.shape[0])

Let’s plot the voxel-level true and predicted semantic labels side-by-side:

trace1, trace2 = [], []
edep = points[:, 5]

trace1 += scatter_points(points,
                        markersize=1,
                        color=truth,
                        cmin=0, cmax=5, colorscale='rainbow')

trace2 += scatter_points(points,
                        markersize=1,
                        color=pred,
                        cmin=0, cmax=5, colorscale='rainbow')

fig = dualplot(trace1, trace2, titles=['True semantic labels (true no-ghost mask)', 'Predicted semantic labels (predicted no-ghost mask)' ])

iplot(fig)

By default, the label for tracks and michel electrons are 1 and 2, respectively.

track_label = 1
michel_label = 2

particles = predictor.get_particles(entry, only_primaries=False)
true_particles = predictor.get_true_particles(entry, only_primaries=False, verbose=False)

trace1 = trace_particles(particles, color='semantic_type')
trace2 = trace_particles(true_particles, color='semantic_type')

fig = dualplot(trace1, trace2, titles=['Predicted particles (predicted no-ghost mask)', 'True particles (predicted no-ghost mask)'])

iplot(fig)

The predicted particles, each color-coded according to its semantic type, will be displayed in left; the true particles on right.

Step 1: Select stopping muons¶

We will select track-like predicted particles that are close to a Michel predicted particle. For the stopping muon residual range study purpose, purity is more important than efficiency. Hence we only select stopping muons that decay into a Michel electron.

from scipy.spatial.distance import cdist

def get_stopping_muons(particles, Michel_threshold=10):
    selected_muons = []
    closest_points = []

    # Loop over predicted particles
    for p in particles:
        if p.semantic_type != track_label: continue
        coords = p.points

        # Check for presence of Michel electron
        attached_to_Michel = False
        closest_point = None
        for p2 in particles:
            if p2.semantic_type != 2: continue
            d =  cdist(p.points, p2.points)
            if d.min() >= Michel_threshold: continue
            attached_to_Michel = True
            closest_point = d.min(axis=1).argmin()

        if not attached_to_Michel: continue

        selected_muons.append(p)
        closest_points.append(closest_point)

    return selected_muons, closest_points

selected_muons, closest_points = get_stopping_muons(particles)
selected_muons

[Particle( Image ID=0   | Particle ID=13  | Semantic_type: Track           | PID: Muon     | Primary: 0  | Score = 61.74% | Interaction ID: 27 | Size: 2318  ),
 Particle( Image ID=0   | Particle ID=16  | Semantic_type: Track           | PID: Photon   | Primary: 0  | Score = 89.94% | Interaction ID: 16 | Size: 848   ),
 Particle( Image ID=0   | Particle ID=18  | Semantic_type: Track           | PID: Muon     | Primary: 1  | Score = 99.09% | Interaction ID: 21 | Size: 363   ),
 Particle( Image ID=0   | Particle ID=24  | Semantic_type: Track           | PID: Photon   | Primary: 0  | Score = 84.35% | Interaction ID: 24 | Size: 710   ),
 Particle( Image ID=0   | Particle ID=25  | Semantic_type: Track           | PID: Photon   | Primary: 0  | Score = 76.51% | Interaction ID: 41 | Size: 2091  )]

And just as a sanity check, let’s run the same function on the true particles to ensure it selects what we expect:

get_stopping_muons(true_particles)

([TruthParticle( Image ID=0   | Particle ID=0   | Semantic_type: Track           | PID: Muon     | Primary: 1  | Interaction ID: 3  | Size: 362   ),
  TruthParticle( Image ID=0   | Particle ID=17  | Semantic_type: Track           | PID: Muon     | Primary: 1  | Interaction ID: 12 | Size: 2082  ),
  TruthParticle( Image ID=0   | Particle ID=20  | Semantic_type: Track           | PID: Muon     | Primary: 1  | Interaction ID: 6  | Size: 2322  ),
  TruthParticle( Image ID=0   | Particle ID=23  | Semantic_type: Track           | PID: Muon     | Primary: 1  | Interaction ID: 5  | Size: 849   )],
 [0, 0, 2310, 845])

Step 2: Muon direction with PCA¶

Once we have a suitable muon selected, the next step is to bin it. To make the binning easier, we will find the approximate direction of the muon using principal component analysis (PCA) then project all muon voxels along this axis.

from sklearn.decomposition import PCA

def get_PCA(particles):
    pca = PCA(n_components=2)
    pca_coordinates = []
    pca_directions = []
    for p in particles:
        coords = p.points
        # PCA to get a rough direction
        coords_pca = pca.fit_transform(p.points)[:, 0]
        pca_coordinates.append(coords_pca)
        pca_directions.append(pca.components_[0, :])
    return pca_coordinates, pca_directions

Time to run our function on the muon we just selected:

pca_coordinates, pca_directions = get_PCA(selected_muons)

It is always a good idea to visualize the output of each intermediate stage to make sure the output is not garbage for the next step. The muon voxels are in blue, the computed PCA direction is in orange.

trace = []

for idx, (coords, direction) in enumerate(zip(pca_coordinates, pca_directions)):
    trace += scatter_points(selected_muons[idx].points,
                            markersize=1,
                            color='blue')
    # Artificially create a point cloud for the PCA direction visualization
    pca_points = selected_muons[idx].points[coords.argmin()][None, :] + direction[None, :] * (coords[:, None]-coords.min())
    trace += scatter_points(pca_points,
                           markersize=1,
                           color='orange')


fig = go.Figure(data=trace,layout=plotly_layout3d())
fig.layout.scene.xaxis.range=[0, 768]
fig.layout.scene.yaxis.range=[0, 768]
fig.layout.scene.zaxis.range=[0, 768]
iplot(fig)

The orange direction seems to match roughly the muon direction, so we can move on to the next step.

Step 3: Local dQ/dx¶

Using the rough muon direction, we will define segments by binning the projection of all voxels onto this axis.

def get_dQdx(particles, closest_points, pca_coordinates,
            bin_size=17,
            return_pca=False): # about5cm

    df = {
        "dQ": [],
        "dx": [],
        "residual_range": []
    }

    pca = PCA(n_components=2)

    list_coords, list_pca, list_directions = [], [], []
    for p, closest_point, coords_pca in zip(particles, closest_points, pca_coordinates):
        coords = p.points

        # Make sure where the end vs start is
        # if end == 0 we have the right bin ordering, otherwise might need to flip when we record the residual range
        distances_endpoints = [((coords[coords_pca.argmin(), :] - coords[closest_point, :])**2).sum(), ((coords[coords_pca.argmax(), :] - coords[closest_point, :])**2).sum()]
        end = np.argmin(distances_endpoints)

        # Split into segments and compute local dQ/dx
        bins = np.arange(coords_pca.min(), coords_pca.max(), bin_size)
        bin_inds = np.digitize(coords_pca, bins)

        for i in np.unique(bin_inds):
            mask = bin_inds == i
            if np.count_nonzero(mask) < 2: continue
            # Repeat PCA locally for better measurement of dx
            pca_axis = pca.fit_transform(p.points[mask])
            list_coords.append(p.points[mask])
            list_pca.append(pca_axis[:, 0])
            list_directions.append(pca.components_[0, :])

            dx = pca_axis[:, 0].max() - pca_axis[:, 0].min()
            dQ = p.depositions[mask].sum()
            residual_range = (i if end == 0 else len(bins)-i-1) * bin_size

            df["dx"].append(dx)
            df["dQ"].append(dQ)
            df["residual_range"].append(residual_range)

    df = pd.DataFrame(df)

    if return_pca:
        return list_coords, list_pca, list_directions

    return df     

For each segment, we compute

dQ (sum of reconstructed charge of all voxels in the segment)
dx (for better precision we recompute a local PCA direction on the segment voxels exclusively)
residual range (distance from the muon end, well-defined using the Michel contact point)

get_dQdx(selected_muons, closest_points, pca_coordinates)

	dQ	dx	residual_range
0	2948.849976	17.165099	731
1	2790.597678	16.323596	714
2	2511.840777	16.856916	697
3	2678.051848	17.214062	680
4	2095.855970	16.496260	663
...	...	...	...
115	3177.442494	16.610514	527
116	4629.873699	17.091178	544
117	4352.756979	16.927771	561
118	2937.334114	16.737657	578
119	1621.308743	6.778135	595

120 rows × 3 columns

Once again, let’s take the time to visualize the binning and make sure the segments make sense. The black voxels belong to the local PCA directions, the colored voxels belong to different segments (color = segment id).

import matplotlib.pyplot as plt
trace = []

list_coords, list_pca, list_directions = get_dQdx(selected_muons, closest_points, pca_coordinates, return_pca=True)
cmap = plt.cm.get_cmap('Set1')
n = len(list_coords)
for idx, (coords, pca_axis, direction) in enumerate(zip(list_coords, list_pca, list_directions)):
    trace += scatter_points(coords,
                            markersize=2,
                            color='rgb%s' % str(cmap(idx/n)[:-1]))
    #print(direction[None, :] * coords[:, None])
    #print(cmap(idx/n))
    pca_points = coords[pca_axis.argmin()][None, :] + direction[None, :] * (pca_axis[:, None]-pca_axis.min())
    trace += scatter_points(pca_points,
                           markersize=2,
                           color='black')


fig = go.Figure(data=trace,layout=plotly_layout3d())
fig.layout.scene.xaxis.range=[0, 768]
fig.layout.scene.yaxis.range=[0, 768]
fig.layout.scene.zaxis.range=[0, 768]
iplot(fig)

Step 4: Repeat with high statistics¶

We have computed everything we want, we just need to repeat this with higher statistics.

muons = pd.DataFrame({
        "index": [],
        "dQ": [],
        "dx": [],
        "residual_range": []
    })

from tqdm import tqdm
for iteration in tqdm(range(10)):
    data, result = hs.trainer.forward(dataset)
    predictor = FullChainEvaluator(data, result, cfg, deghosting=True, processor_cfg={'michel_primary_ionization_only': True})
    for entry, index in enumerate(predictor.index):  
        particles = predictor.get_particles(entry, only_primaries=False)
        selected_muons, closest_points = get_stopping_muons(particles)
        pca_coordinates, pca_directions = get_PCA(selected_muons)
        df = get_dQdx(selected_muons, closest_points, pca_coordinates)
        df['index'] = index
        muons = pd.concat([muons, df])

  0%|                                                                                                                                                      | 0/10 [00:00<?, ?it/s]

 10%|██████████████▏                                                                                                                               | 1/10 [00:34<05:10, 34.48s/it]

 20%|████████████████████████████▍                                                                                                                 | 2/10 [01:05<04:18, 32.27s/it]

 30%|██████████████████████████████████████████▌                                                                                                   | 3/10 [01:39<03:52, 33.18s/it]

 40%|████████████████████████████████████████████████████████▊                                                                                     | 4/10 [02:07<03:07, 31.18s/it]

 50%|███████████████████████████████████████████████████████████████████████                                                                       | 5/10 [02:31<02:22, 28.51s/it]

 60%|█████████████████████████████████████████████████████████████████████████████████████▏                                                        | 6/10 [03:03<01:59, 29.82s/it]

 70%|███████████████████████████████████████████████████████████████████████████████████████████████████▍                                          | 7/10 [03:29<01:25, 28.52s/it]

 80%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                            | 8/10 [04:07<01:02, 31.48s/it]

 90%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊              | 9/10 [04:33<00:29, 29.91s/it]

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [04:35<00:00, 21.23s/it]

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [04:35<00:00, 27.56s/it]

muons

	index	dQ	dx	residual_range
0	10.0	7042.811442	17.111080	17.0
1	10.0	4577.029618	16.876747	34.0
2	10.0	4333.510592	17.052119	51.0
3	10.0	3344.982315	17.053986	68.0
4	10.0	3292.413970	16.610277	85.0
...	...	...	...	...
2	100.0	4624.215957	16.500761	51.0
3	100.0	4644.637104	16.877741	68.0
4	100.0	4249.540913	16.660054	85.0
5	100.0	3944.112324	16.543286	102.0
6	100.0	3773.959829	15.030738	119.0

5193 rows × 4 columns

Plot dQ/dx vs residual range¶

import matplotlib
import matplotlib.pyplot as plt
import seaborn
seaborn.set(rc={
    'figure.figsize':(15, 10),
})
seaborn.set_context('talk')

clean = (muons["dx"]*0.3 > 3.5) & (muons["dx"]*0.3 < 50) #& (cells['pca_length']*0.3 > 150)
#clean = np.ones(shape=muons["residual_range"].shape, dtype=np.bool)
plt.hist2d(muons["residual_range"][clean]*0.3, muons["dQ"][clean]/muons["dx"][clean],
          range=[[0, 200], [500, 4000]], bins=[40, 80],
          cmap="viridis")
plt.colorbar()
plt.xlabel("Residual range (cm)")
plt.ylabel("Muon dQ/dx [arbitrary units / cm]")

Text(0, 0.5, 'Muon dQ/dx [arbitrary units / cm]')

lartpc_mlreco3d Tutorials