Sparse Submanifold Convolutional Neural-Network (SSCN)

Posted on Thu 14 March 2019 in news by Kazuhiro Terao

ACAT 2019 Conference

We just attended the ACAT 2019 conference in Saas-Fee, Switzerland.

Yes, skiing there was awesome.

But the conference was also lots of fun. The crowd was mainly LHC physicists, but there was some diversity too: we saw Daniel from SLAC, Danilo from Google DeepMind (an inventor of VAE!), and Soumith from Facebook AI Research (yes, the one who answered your PyTorch question). Take a look at all the talks posted on the website!

Laura gave a talk on her recently published paper about applying Sparse Submanifold Convolutional Neural Networks (SCN) to high-resolution 3D particle images from our newly made open data. SCN is a way to address the scalability challenge of applying CNNs to large-scale particle imaging detectors, such as liquid argon time projection chamber (LArTPC) experiments including the SBN program and DUNE. We summarize Laura's work on this topic below.

Sparse Submanifold Convolution

Convolutional neural networks (CNNs) are great for image analysis. There are two challenges in applying CNNs to scientific imaging data.

  1. Lots of image data in science is sparse, carrying null pixels with no information.
  2. Data is big: imaging hardware continues to advance, producing giga-pixel images at a high rate.

Big data mostly filled with zeros causes two issues when applying CNNs: wasted computational resources and degraded performance. In photographs, for which CNNs were originally developed, there are no useless pixels. The fact that we see blue sky in the background of a flying bird helps us identify the subject as a bird. In a sparse particle image, by contrast, the empty pixels carry no such context.


SCN addresses these challenges by introducing convolution operations that only take into account the non-zero pixels, both in the input and in the feature tensors output by the subsequent layers. This avoids wasting computational resources: the resource usage scales roughly with the number of non-zero pixels instead of the total number of pixels in the volume. Yet the convolution operation itself remains the same, so it can still effectively learn topological features of an object in space.
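To make the idea concrete, here is a toy sketch of a submanifold convolution in plain Python. It is not the implementation used in the paper: it is 2D and single-channel for brevity, and the function name, the dict-based sparse format, and the example track are all made up for illustration. The key point it tries to show is that we only loop over non-zero pixels, and the output is only defined at the sites that were non-zero in the input.

```python
from itertools import product

def submanifold_conv2d(active, kernel):
    """Apply a 3x3 submanifold convolution to a sparse single-channel image.

    active: dict mapping (row, col) -> value, holding non-zero pixels only.
    kernel: 3x3 nested list of weights, indexed as kernel[dr + 1][dc + 1]
            (cross-correlation convention, as in most deep learning libraries).
    The output is defined only at the input's active sites, so the
    sparsity pattern is preserved from layer to layer.
    """
    out = {}
    for (r, c) in active:                      # loop over non-zero pixels only
        total = 0.0
        for dr, dc in product((-1, 0, 1), repeat=2):
            neighbor = active.get((r + dr, c + dc))
            if neighbor is not None:           # empty neighbors contribute nothing
                total += kernel[dr + 1][dc + 1] * neighbor
        out[(r, c)] = total
    return out

# A short diagonal "track" in an otherwise empty image of any size (made-up values).
track = {(0, 0): 1.0, (1, 1): 2.0, (2, 2): 3.0}
box_kernel = [[1.0] * 3 for _ in range(3)]     # sums the active 3x3 neighborhood
result = submanifold_conv2d(track, box_kernel)
```

Note that the cost of the loop depends on the three active pixels only, not on how big the surrounding empty image is.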

This is particularly powerful for LArTPCs, whose data is generally sparse but locally dense (i.e. particle trajectories have no gaps). Because of this locally dense nature, down-sampling operations are known to hurt the performance of algorithms on LArTPC data. Yet particle trajectories are mostly 1D lines and result in lots of null pixels when represented in 2D/3D images.
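For a feel of how sparse a 1D line gets in 2D vs. 3D, here is a back-of-the-envelope sketch. It assumes a single straight, gap-free, axis-aligned track crossing a cubic image, and the image size of 512 is a hypothetical number picked for illustration, not one taken from the paper.

```python
def straight_track_occupancy(side, ndim):
    """Fraction of non-zero pixels when one straight, gap-free, axis-aligned
    track crosses an image with `side` pixels along each of `ndim` axes."""
    track_pixels = side            # a 1D line touches `side` pixels
    total_pixels = side ** ndim    # size of the dense image
    return track_pixels / total_pixels

side = 512  # hypothetical image size, for illustration only
occupancy_2d = straight_track_occupancy(side, 2)
occupancy_3d = straight_track_occupancy(side, 3)
print(f"2D: {occupancy_2d:.2e}, 3D: {occupancy_3d:.2e}")
```

Going from 2D to 3D multiplies the number of empty pixels by another factor of the image size, while the track itself stays the same length.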

3D SCN for LArTPCs

Due to the issues discussed above, analyzing 3D particle images using a dense (i.e. standard) CNN was almost impossible before SCN. In the paper, Laura used the U-ResNet architecture to apply SCN to 3D semantic segmentation of five particle types using our open data.

Here's the result of a study comparing the computational resource cost of the dense CNN and SCN:

... yep, we could run SCN even on a CPU and it's actually not so bad. Here's another way to look at this: how much memory is required to process a chunk of image data at once? Although the paper does not report a study using a large detector image, a relevant question is how many images we can stick in a mini-batch to process on a GPU at a time.

We see that the SCN version can handle a number of pixels equivalent to the whole MicroBooNE detector with less than 1GB of memory. The ICARUS data, 6 times larger than MicroBooNE, can easily fit in a conventional GPU card. Moreover, because the resource usage scales with the non-zero pixel count, the DUNE far detector, roughly 100 times larger (10 kton, compared to roughly 100 ton for MicroBooNE), can fit with even less resource usage: both MicroBooNE and ICARUS have lots of cosmic-ray particles passing through because those detectors sit on the surface, while the far detector will be deep underground. SCN will also be an ideal solution for the DUNE near detector, which will contain a number of particle tracks comparable to MicroBooNE and ICARUS.

Reconstruction of Michel Electrons

The performance reported in the paper is great, and five-particle segmentation allows us to implement simple yet interesting physics studies immediately. In her presentation and the paper, Laura demonstrated a simple DBSCAN algorithm to cluster the primary ionization of a Michel electron, a decay product of a muon. Clustering is limited to the primary ionization because radiated gamma rays are classified as EM showers in this study. Here's a plot that compares the number of reconstructed pixels (i.e. clustered using DBSCAN on the U-ResNet output) vs. the number of pixels in the corresponding true cluster.
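If you have not met DBSCAN before, here is a minimal pure-Python sketch of the algorithm applied to voxels that a segmentation network has labeled as Michel. This is not the code from the paper: the voxel coordinates are invented, and the `eps` and `min_samples` values are hypothetical choices for this toy example. In practice you would more likely reach for a library implementation such as scikit-learn's `DBSCAN`.

```python
def dbscan(points, eps, min_samples):
    """Minimal DBSCAN. Returns one label per point:
    0, 1, ... for clusters and -1 for noise."""
    eps2 = eps * eps

    def neighbors(i):
        # Brute-force search; a point counts as its own neighbor.
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps2]

    labels = [None] * len(points)          # None = not visited yet
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1                 # noise for now; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # former noise joins as a border point
            if labels[j] is not None:
                continue                   # already assigned, do not expand again
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_samples:  # j is a core point: keep expanding
                queue.extend(reach)
    return labels

# Invented voxel coordinates standing in for "pixels predicted as Michel".
michel_voxels = [(10, 10, 10), (10, 11, 10), (11, 11, 10), (11, 12, 11),
                 (30, 30, 30), (30, 31, 30), (31, 31, 31),
                 (50, 5, 5)]
# Hypothetical parameters: eps = 1.8 links voxels that touch, diagonals included.
labels = dbscan(michel_voxels, eps=1.8, min_samples=2)
```

Here the first four voxels end up in one cluster, the next three in a second one, and the isolated voxel is flagged as noise.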

It shows a nice linear relationship, with some scattered populations below y=x, which indicate under-clustering due to mistakes by U-ResNet. Laura quantified four performance metrics:

  1. Michel ID efficiency: what fraction of true Michel electrons are reconstructed?
  2. Michel ID purity: what fraction of reconstructed Michel electrons are true Michels?
  3. Clustering efficiency: what fraction of a true Michel electron's pixels are clustered in the reconstructed Michel?
  4. Clustering purity: what fraction of a reconstructed Michel cluster's pixels originate from a true Michel electron?
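The four metrics boil down to simple ratios. Here is a sketch of how one might compute them, assuming each reconstructed Michel is matched to at most one true Michel. The function names, the matching assumption, and all the numbers below are made up for illustration and are not taken from the paper.

```python
def clustering_metrics(reco_pixels, true_pixels):
    """Clustering efficiency and purity for one matched (reco, true) Michel pair."""
    reco, true = set(reco_pixels), set(true_pixels)
    overlap = len(reco & true)
    efficiency = overlap / len(true) if true else 0.0  # share of true pixels recovered
    purity = overlap / len(reco) if reco else 0.0      # share of reco pixels that are true
    return efficiency, purity

def id_metrics(n_true, n_reco, n_matched):
    """Michel ID efficiency and purity from counts over a sample.

    n_matched is the number of reconstructed Michels matched to a true one
    (assumes one-to-one matching, which is an assumption of this sketch).
    """
    efficiency = n_matched / n_true if n_true else 0.0
    purity = n_matched / n_reco if n_reco else 0.0
    return efficiency, purity

# Made-up example: a 4-pixel true Michel, reconstructed as 3 pixels, one of them wrong.
true_michel = {(0, 0), (0, 1), (0, 2), (0, 3)}
reco_michel = {(0, 1), (0, 2), (5, 5)}
cluster_eff, cluster_pur = clustering_metrics(reco_michel, true_michel)

# Made-up counts: 10 true Michels, 8 reconstructed, 6 of those matched to truth.
id_eff, id_pur = id_metrics(n_true=10, n_reco=8, n_matched=6)
```

Efficiency always divides by the true quantity and purity by the reconstructed one, which is why an analysis can trade one for the other.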

... and here are the results:

These numbers are really good compared to some published results from existing experiments. Quoting one of them, typical ID efficiency and purity are 2% and 80-90% respectively, though we should note that analyses of Michel electron reconstruction typically aim for higher purity at the cost of efficiency. Nevertheless, this result is compelling.

The left plot should be taken with a warning: although the x-axis is in MeV, the reconstruction does not include energy reconstruction. The only relevant demonstration here is the clustering performance, which is considered the most challenging part of Michel energy reconstruction. It would probably be better presented in terms of clustered pixel counts instead of MeV. Nevertheless, this is a very promising result.


The paper lists everything you need to reproduce the results, including a software container, a link to the open data set, and our implementation of SCN. SCN works great for LArTPCs and, IMO, should be used in all LArTPC experiments where CNNs are employed :)