# Data Science for Physicists

Welcome to the Short Course on Data Science for Physicists, presented at the APS Global Physics Summit 2026! This Jupyter Book provides tutorials on applying deep learning techniques to experimental nuclear physics, specifically to the Forward Calorimeter (FCAL) of the GlueX experiment in Hall D at Jefferson Lab.

## Overview

These tutorials are designed for nuclear and particle physics researchers and cover several applications of deep learning:

1. **GNN-based Classification and Regression** (2026): Using Graph Neural Networks to classify FCAL showers, distinguishing photons from splitoffs, and to regress the true energy of a reconstructed (identified) photon.

2. **CNN-based Classification** (2025): Using Convolutional Neural Networks to classify FCAL showers and distinguish between photons and splitoffs.

3. **Generative AI** (2025): Building generative models to simulate FCAL photon showers based on their kinematics.

## Tutorial Goals

By the end of these tutorials, you will be able to:

- Understand the physics behind FCAL showers and the importance of accurate classification
- Prepare and preprocess FCAL data for deep learning applications
- Build and train deep learning models for binary classification (photons vs splitoffs)
- Evaluate model performance using physics-informed metrics
- Apply these techniques to your own nuclear physics research

## Target Audience

These tutorials are intended for:
- Graduate students and postdocs in nuclear physics
- Researchers working on calorimeter systems
- Scientists interested in applying ML to experimental physics
- Anyone wanting to learn about deep learning in the context of particle physics

## Prerequisites

- Basic understanding of nuclear physics and particle detectors
- Python programming experience
- Familiarity with NumPy and basic data analysis
- (Optional) Prior exposure to machine learning concepts

## Dataset and Models

Dataset used in this tutorial: [![Hugging Face Dataset](https://img.shields.io/badge/HuggingFace-Dataset-blue.svg?logo=huggingface)](https://huggingface.co/datasets/AI4EIC/DNP2025-tutorial)

The trained models are archived for easy access and reproducibility.
You can explore and download them at: [![Hugging Face Model](https://img.shields.io/badge/HuggingFace-Model-orange.svg?logo=huggingface)](https://huggingface.co/AI4EIC/DNP2025-tutorial)
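
As a minimal sketch of how the dataset and model repositories linked above could be fetched for local use, the snippet below wraps `snapshot_download` from the `huggingface_hub` package. The repository ID comes from the badge links. The helper names `repo_url` and `fetch_repo` and the target folders are hypothetical, and the file layout inside each repository is not assumed here.

```python
from pathlib import Path

# Repository ID taken from the badge links above (same ID for dataset and models).
DATASET_REPO = "AI4EIC/DNP2025-tutorial"
MODEL_REPO = "AI4EIC/DNP2025-tutorial"


def repo_url(repo_id: str, repo_type: str) -> str:
    """Return the Hugging Face web URL for a repository (hypothetical helper)."""
    # Dataset pages live under /datasets/, model pages directly under the root.
    prefix = "datasets/" if repo_type == "dataset" else ""
    return f"https://huggingface.co/{prefix}{repo_id}"


def fetch_repo(repo_id: str, repo_type: str, local_dir: str) -> Path:
    """Download a full repository snapshot into local_dir and return its path."""
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir)
    return Path(path)
```

With `huggingface_hub` installed and network access available, calling `fetch_repo(DATASET_REPO, "dataset", "data/fcal")` would place the dataset files in `data/fcal`, and `fetch_repo(MODEL_REPO, "model", "models/fcal")` would do the same for the trained models. Both folder names are placeholders.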


## Event Information

**APS GPS 2026 Tutorial Session: Data Science for Physicists**  
Date: March 14-15, 2026  
Location: Denver, CO  
Workshop Sessions: [1WD](https://summit.aps.org/events/MAR-SH01), [2WD](https://summit.aps.org/events/MAR-SH04)

![](./images/gps2026_header_aps.png)

For questions or feedback, please open an issue on our [GitHub repository](https://ai4eic.github.io/APS2026_GPS_tutorials).

---


```{admonition} Authors Acknowledgements to GlueX
:class: dropdown
* This tutorial is part of the AI4EIC consortium effort to bring modern machine learning techniques to experimental nuclear physics.

* We gratefully acknowledge the [GlueX Collaboration](https://gluex.org/) for the software framework and the public release of the Monte Carlo simulation data used in this work (see the [GlueX acknowledgements](https://gluex.org/thanks.html)).
```

```{tableofcontents}
```
