Deep Unsupervised Domain Adaptation for the Cherenkov Telescope Array

Deep learning thesis applied to astrophysics (GammaLearn)

Authors: Michaël Dell'aiera (LAPP, LISTIC), Cyann Plard (LAPP)
Under the supervision of: Thomas Vuillaume (LAPP), Sami Caroff (LAPP), Alexandre Benoit (LISTIC)
Contact: michael.dellaiera@lapp.in2p3.fr, cyann.plard@lapp.in2p3.fr

Presentation outline


  • Contextualisation, Deep learning & GammaLearn
  • Domain adaptation applied to simulations (previous results)
  • Domain adaptation applied to the Crab Nebula, no moonlight (previous, preliminary results)
  • Domain adaptation applied to the Crab Nebula, moonlight (preliminary results)
  • Conclusion, perspectives

Contextualisation, Deep learning & GammaLearn


**[GammaLearn](https://purl.org/gammalearn):**
* Explore and evaluate the added value of deep learning for CTA
* Build a method parallel to Hillas+RF
* Mikael Jacquemont's thesis: design of the **[γ-PhysNet](https://theses.hal.science/tel-03590369/)** neural network

**Main results of Mikael Jacquemont's thesis ([published](https://arxiv.org/abs/2108.04130)):**
* Outperforms Hillas+RF on MC and on real data in a controlled environment
* But performance on real data could be improved

The challenging transition from MC to real data


* Simulations: approximations of reality
* Variations of NSB, stars, malfunctioning pixels, fog on camera, ...
* Direct application to real data is non-trivial

NSB charge distributions

* [Domain adaptation](https://arxiv.org/abs/2009.00155): set of algorithms and techniques aiming at reducing domain discrepancies
* Selection, implementation and validation of [DANN](https://arxiv.org/abs/1505.07818), [DeepJDOT](https://arxiv.org/abs/1803.10081), [DeepCORAL](https://arxiv.org/abs/1607.01719), [DAN](https://arxiv.org/abs/1502.02791)
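As an illustration of the DANN idea listed above, here is a minimal pure-Python sketch of a gradient reversal layer together with the adaptation-weight schedule proposed in the DANN paper. The names `GradientReversal` and `dann_lambda` are hypothetical, and nothing here is taken from the GammaLearn code: in a real network this would be a custom autograd operation placed between the feature extractor and the domain classifier.

```python
import math


class GradientReversal:
    """Identity in the forward pass, gradient scaled by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The reversed gradient pushes the feature extractor to confuse
        # the domain classifier (source vs target).
        return [-self.lam * g for g in grad_output]


def dann_lambda(progress, gamma=10.0):
    """Schedule from the DANN paper: ramps from 0 towards 1 as progress goes 0 -> 1."""
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


# Usage: halfway through training, reverse and scale a toy gradient.
grl = GradientReversal(lam=dann_lambda(0.5))
reversed_grad = grl.backward([0.2, -0.1])
```

Starting with a weight of 0 lets the task heads settle before the domain classifier begins to influence the shared features.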

Domain adaptation

Validation pipeline of our approach


Validation of the methods
* Controlled perturbations on the simulated (labelled) datasets
* Source = MC, Target = MC + perturbations
* MC dataset: prod5 trans80, alt=20deg, az=180deg
→ Validation with figures of merit (Published)

Tests on real telescope acquisitions
* Source = MC, Target = Real data
* Crab, no moonlight: runs 6894, 6895
* Crab, moonlight: runs 6892, 6893
→ Detection of known gamma-ray sources
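Detection of a known source is usually quantified by a significance in σ. The sketch below implements the Li & Ma (1983, Eq. 17) formula, which is the standard choice in gamma-ray astronomy; that this exact formula was used for the numbers in this talk is an assumption, and `li_ma_significance` is a hypothetical helper name.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983) Eq. 17 significance of an ON/OFF counting measurement.

    n_on  : counts in the signal (ON) region
    n_off : counts in the background (OFF) regions
    alpha : ratio of ON to OFF exposure
    """
    if n_on <= 0 or n_off <= 0 or alpha <= 0:
        raise ValueError("n_on, n_off and alpha must be strictly positive")
    total = n_on + n_off
    term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1.0 + alpha) * n_off / total)
    # Guard against tiny negative values caused by floating-point rounding.
    value = math.sqrt(max(0.0, 2.0 * (term_on + term_off)))
    # Negative sign when the ON region shows a deficit instead of an excess.
    return value if n_on >= alpha * n_off else -value


# Usage with made-up counts (not taken from the Crab analysis).
sigma = li_ma_significance(n_on=200, n_off=100, alpha=1.0)
```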



Focus of this talk: comparing lstchain, γ-PhysNet and γ-PhysNet-DANN on real telescope acquisitions

Domain adaptation applied to simulations: Setup


IRFs are computed as in the LST Crab performance paper, with gamma efficiency = 0.7

prod5 trans80, alt=20deg, az=180deg
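A gamma efficiency of 0.7 means that the gamma/hadron cut keeps about 70% of the simulated gammas. A minimal sketch of how such a cut could be derived, assuming the classifier outputs one "gammaness" score per event and that a single global threshold is used (in practice the cut is often derived per energy bin); `gammaness_cut_for_efficiency` is a hypothetical helper:

```python
def gammaness_cut_for_efficiency(gamma_scores, efficiency=0.7):
    """Return the score threshold keeping about `efficiency` of the simulated gammas.

    Events with score >= threshold are kept as gamma candidates.
    """
    if not gamma_scores:
        raise ValueError("gamma_scores must not be empty")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    ranked = sorted(gamma_scores, reverse=True)
    n_keep = max(1, round(efficiency * len(ranked)))
    return ranked[n_keep - 1]


# Usage on toy scores (not real classifier outputs).
scores = [i / 10 for i in range(1, 11)]
cut = gammaness_cut_for_efficiency(scores, efficiency=0.7)
kept = [s for s in scores if s >= cut]
```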

| | Train (source) | Train (target) | Test |
|---|---|---|---|
| Labels | Labelled | Unlabelled | Unlabelled |
| Dataset | MC | MC+Poisson(0.46) (MC*) | MC+Poisson(0.46) (MC*) |
| Ratio | 50%/50% | 50%/50% | |

Model: γ-PhysNet + DANN
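The MC* target domain can be pictured as the MC images with extra Poisson-distributed charge added to every pixel. A minimal sketch, assuming that Poisson(0.46) denotes a mean of 0.46 additional photoelectrons per pixel; `sample_poisson` and `add_poisson_noise` are hypothetical helpers, not the GammaLearn implementation:

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's method (fine for small rates)."""
    limit = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def add_poisson_noise(image, rate=0.46, seed=None):
    """Return a copy of `image` (flat list of pixel charges) with Poisson noise added."""
    rng = random.Random(seed)
    return [charge + sample_poisson(rate, rng) for charge in image]


# Usage on a toy 5-pixel image (not real camera data).
noisy = add_poisson_noise([0.0, 1.5, 12.0, 3.2, 0.0], rate=0.46, seed=42)
```

Since the labels are untouched, the perturbed dataset stays labelled and can still be scored with the usual figures of merit.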


Domain adaptation applied to simulations: Results


IRFs comparing the performance of γ-PhysNet, γ-PhysNet* and γ-PhysNet-DANN (Published)

Domain adaptation applied to the Crab Nebula: Framework


Domain adaptation applied to the Crab Nebula: Setup


Crab Nebula (No moonlight: 6894 & 6895)

| | Train (source) | Train (target) | Test |
|---|---|---|---|
| Labels | Labelled | Unlabelled | Unlabelled |
| Dataset | MC+Poisson (MC*) | Real data | Real data |
| Ratio | 50%/50% | 1γ for > 1000p | 1γ for > 1000p |

→ lstchain is trained on the same MC*

Model: γ-PhysNet + conditional DANN


Domain adaptation applied to 06895 (no moonlight)

[Figures: results on run 06895 for lstchain, γ-PhysNet and γ-PhysNet-DANNc, shown separately and then side by side]

Preliminary results.

Domain adaptation applied to the Crab Nebula: Framework


Domain adaptation applied to the Crab Nebula: Setup


Crab Nebula (Moonlight: 6892 & 6893)

| | Train (source) | Train (target) | Test |
|---|---|---|---|
| Labels | Labelled | Unlabelled | Unlabelled |
| Dataset | MC+Poisson (MC*) | Real data | Real data |
| Ratio | 50%/50% | 1γ for > 1000p | 1γ for > 1000p |

→ lstchain is trained on the same MC*

Domain adaptation applied to 06893 (moonlight)

[Figures: results on run 06893 for lstchain, γ-PhysNet and γ-PhysNet-DANNc, shown separately and then side by side]

Preliminary results.

Summary



| Significance | lstchain | γ-PhysNet | γ-PhysNet-DANNc |
|---|---|---|---|
| No moonlight | 20.3σ | 22.4σ | 22.5σ |
| Moonlight | 20.5σ | 17.9σ | 19.8σ |

* No moonlight:
  * +2σ for γ-PhysNet / γ-PhysNet-DANNc compared to lstchain
  * γ-PhysNet and γ-PhysNet-DANNc arrive at the same results (no added value for the DANN)

* Moonlight:
  * +2.6σ (+0.7σ) for lstchain compared to γ-PhysNet (γ-PhysNet-DANNc)
  * γ-PhysNet has degraded performance on moonlight data compared to no moonlight
  * γ-PhysNet-DANNc partly recovers the loss (DANN has an added value)
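The differences quoted above follow directly from the significance values in the summary. A short sketch that recomputes them (the dictionary layout and the helper name `gain_over_lstchain` are illustrative only):

```python
# Significance values (in sigma) copied from the summary.
significance = {
    "no moonlight": {"lstchain": 20.3, "γ-PhysNet": 22.4, "γ-PhysNet-DANNc": 22.5},
    "moonlight": {"lstchain": 20.5, "γ-PhysNet": 17.9, "γ-PhysNet-DANNc": 19.8},
}


def gain_over_lstchain(condition):
    """Significance of each deep-learning method minus the lstchain one."""
    row = significance[condition]
    return {
        method: round(value - row["lstchain"], 1)
        for method, value in row.items()
        if method != "lstchain"
    }


for condition in significance:
    print(condition, gain_over_lstchain(condition))
```

A positive value means the method beats lstchain for that observing condition, a negative one that lstchain does better.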

Summary



| Crab runs | 6892 | 6893 | 6894 | 6895 |
|---|---|---|---|---|
| Zenith angle | 16.1° | 20.3° | 27.9° | 32.4° |
| Light pollution | 1.94pe | 1.81pe | 1.64pe | 1.60pe |

* Optimization runs may not accurately reflect the analysis runs
* The light pollution varies with time in real data, but remains constant in MC
* Moonlight conditions degrade the significance, but a higher zenith angle compensates for the loss (lstchain)

Summary



| Methods | lstchain | γ-PhysNet | γ-PhysNet-DANNc |
|---|---|---|---|
| Input | Cleaned images | All the pixels | All the pixels |
| Training data | MC* | MC* | MC* + Crab |

* Sampling of the Crab training data
  * 2 runs ≈ 20 million events; 1 million Crab events used for training (5%)
  * More samples could be needed
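One simple way to draw such a training subset is uniform random sampling of event indices without replacement. The sketch below assumes that strategy; the actual sampling scheme is not described here, and `subsample_events` is a hypothetical helper.

```python
import random


def subsample_events(n_events, fraction=0.05, seed=None):
    """Return sorted indices of a uniform random subset of the events."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    n_keep = round(fraction * n_events)
    # Sampling without replacement: each event is used at most once.
    return sorted(rng.sample(range(n_events), n_keep))


# Usage on a reduced example: 5% of 2000 events.
# With 20 million events the same call would select 1 million of them.
selected = subsample_events(2000, fraction=0.05, seed=1)
```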

Conclusion & Perspectives


  • Novel technique to address the MC vs real data discrepancy, tested on MC and on Crab data in both moonlight and no-moonlight conditions.
  • γ-PhysNet-DANNc performs better than γ-PhysNet-DANN on real data (~+1σ)
  • γ-PhysNet is strongly affected by moonlight / quality of the simulated background, but γ-PhysNet-DANNc partly recovers the loss
  • Conduct optimization on odd events and analysis on even events
  • Stability of the model: uncertainty of the model & uncertainty over data sampling
  • Double optimization on G/H cut and θ² cut: need for more statistics
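The odd/even perspective above amounts to tuning the cuts on one half of the events and reporting results on the other half, so that the optimization does not bias the analysis. A minimal sketch, assuming each event carries an integer identifier; `split_by_parity` is a hypothetical helper:

```python
def split_by_parity(event_ids):
    """Split events into an optimization set (odd ids) and an analysis set (even ids)."""
    optimization = [event_id for event_id in event_ids if event_id % 2 == 1]
    analysis = [event_id for event_id in event_ids if event_id % 2 == 0]
    return optimization, analysis


# Usage: cuts would be tuned on `optimization_ids` only,
# then applied unchanged to `analysis_ids`.
optimization_ids, analysis_ids = split_by_parity(range(10))
```

Each half holds only about 50% of the statistics, which is one reason the double optimization on the G/H and θ² cuts calls for more data.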

Acknowledgments


- This project is supported by the facilities offered by the Univ. Savoie Mont Blanc - CNRS/IN2P3 MUST computing center
- This project was granted access to the HPC resources of IDRIS under the allocation 2020-AD011011577 made by GENCI
- This project is supported by the computing and data processing resources from the CNRS/IN2P3 Computing Center (Lyon - France)
- We gratefully acknowledge the support of the NVIDIA Corporation with the donation of one NVIDIA P6000 GPU for this research.
- We gratefully acknowledge financial support from the agencies and organizations listed [here](https://www.cta-observatory.org/consortium_acknowledgment).
- This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 653477
- This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 824064