Deep Learning Models for the Real-Time Scientific Analysis of the Cherenkov Telescope Array Observatory
DI PIANO, AMBRA
2025
Abstract
In recent years, advances in multi-wavelength (MWL) and multi-messenger (MM) astronomy have transformed our understanding of the universe. By combining observations across different types of radiation (from radio waves to gamma rays) and messengers (from gravitational waves to neutrinos), astronomers achieve more precise measurements, deeper insights into cosmic events, and real-time monitoring of transient sources, opening new frontiers for exploration. The Cherenkov Telescope Array Observatory (CTAO) will provide outstanding opportunities for the future of ground-based very-high-energy gamma-ray astronomy. To optimise its scientific output, the CTAO will have a Science Alert Generation (SAG) system which, as part of the Array Control and Data Acquisition (ACADA) system, will perform data reconstruction, data quality monitoring and scientific analysis in real time to detect and issue candidate science alerts. As part of the continuous research and development activity towards future versions of the ACADA/SAG product, this work aims at implementing deep learning (DL) enhancements for the scientific analysis, improving the applicability and sensitivity of the detection pipelines used for the search and follow-up of transient phenomena and serendipitous discoveries. In real time, technical and observational variability, as well as performance requirements, can strongly affect the overall sensitivity of the automated pipelines. I developed two Convolutional Neural Network (CNN) based models as prototype tools, with the aim of performing scientific analyses without the a priori knowledge that standard scientific tools depend upon, such as the target position, the background template or the instrument response functions (IRFs). The first model is an autoencoder trained to remove background noise from the counts maps of a given observation, without requiring any input on target position, background templates or IRFs. The second model is a regressor that extracts the coordinates of the brightest candidate source in the field of view, without requiring input on the background template or IRFs. I used the current version of ACADA/SAG (rel1) as the reference analysis to verify my results. The autoencoder was evaluated by computing the distribution of the differences between the source counts remaining after the CNN denoising and the photometric source excess computed from the original data. The regressor was evaluated by computing the angular separation between the simulated coordinates and those found by the standard and CNN methods. On a random zenith-angle dataset, the source excess difference distribution has a mean of 2 counts and a spread of 8 counts at the 1-sigma level; if the two methods behaved identically, both the mean and the spread would be null. This shows that, even though the CNN has none of the background information used by the standard method, it achieves accurate results with a minimal loss of source counts. On the same validation dataset, the localisation achieves a 68% containment radius of 0.04 degrees with an error of 0.004 degrees using gammapy, and of 0.07 degrees with an error of 0.004 degrees using the CNN. This again shows that the CNN model achieves accurate results while lacking the a priori knowledge exploited by the standard tool. These results must also consider that gammapy has exact knowledge of the simulated background, while the CNN performs blind.
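As a purely illustrative sketch of the two evaluation metrics described above (the array names, placeholder values and use of numpy/astropy are assumptions for the example, not the thesis pipeline), the excess-difference statistics and the 68% containment radius could be computed as follows:

```python
import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord

# Hypothetical inputs: per-observation source excess counts from the CNN
# denoised maps and from aperture photometry on the original maps.
cnn_excess = np.array([102.0, 98.0, 110.0, 95.0])        # placeholder values
photometric_excess = np.array([100.0, 97.0, 107.0, 94.0])  # placeholder values

# Autoencoder evaluation: distribution of the excess differences.
diff = cnn_excess - photometric_excess
print(f"excess difference: mean={diff.mean():.1f}, spread={diff.std():.1f} counts")

# Regressor evaluation: angular separation between simulated and
# reconstructed positions, and the 68% containment radius.
simulated = SkyCoord(ra=[83.63, 83.63] * u.deg, dec=[22.01, 22.01] * u.deg)
reconstructed = SkyCoord(ra=[83.66, 83.60] * u.deg, dec=[22.04, 21.99] * u.deg)
separation = simulated.separation(reconstructed).to(u.deg)
r68 = np.quantile(separation.value, 0.68)
print(f"68% containment radius: {r68:.3f} deg")
```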
The capability of deep learning models to learn directly from the data, without requiring external assumptions, is of great benefit in the real-time analysis context, especially since accurate information on the target, the background and the instrument response functions can be hard or impossible to obtain.
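The abstract does not detail the network architectures. As a hypothetical sketch of how a convolutional autoencoder for counts-map denoising and a CNN coordinate regressor could be laid out in Keras (layer sizes, map dimensions and losses are assumptions, not the models developed in the thesis):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

MAP_SIZE = 128  # assumed pixel size of the input counts maps

def build_autoencoder():
    """Toy convolutional autoencoder mapping noisy counts maps to denoised maps."""
    inp = layers.Input(shape=(MAP_SIZE, MAP_SIZE, 1))
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    out = layers.Conv2D(1, 3, activation="relu", padding="same")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

def build_regressor():
    """Toy CNN regressing the (x, y) map position of the brightest candidate."""
    inp = layers.Input(shape=(MAP_SIZE, MAP_SIZE, 1))
    x = layers.Conv2D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.GlobalAveragePooling2D()(x)
    out = layers.Dense(2)(x)  # normalised (x, y) position in the map
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model
```

In a setup of this kind, the autoencoder would presumably be trained on pairs of noisy and background-free counts maps, and the regressor on maps labelled with the simulated source position; the actual training strategy is described in the thesis itself.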
File: dipiano_phd_thesis.pdf (open access, 22.08 MB, Adobe PDF)
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/202161
URN:NBN:IT:UNIMORE-202161