Overlapping gravitational-wave signals in next-generation detectors: a deep-learning approach with Transformers and Normalizing Flows
PAPALINI, LUCIA
2026
Abstract
The next generation of gravitational-wave observatories will operate in a regime where overlapping signals are inevitably common. For third-generation detectors such as the Einstein Telescope, this includes the simultaneous observation of multiple compact-binary coalescences occurring within the same time and frequency bands. In this scenario, the strain measured by the detectors will contain a continuous mixture of signals, making the statistical inference of source parameters significantly more complex than in the single-event regime explored so far in gravitational-wave science. Addressing this problem requires new methodologies capable of extracting physical information directly from dense and correlated data streams. This thesis presents a deep-learning framework for the analysis of overlapping gravitational-wave signals, specifically those from binary black holes. Its core combines a Transformer encoder, KENN, developed for this work, with a Normalizing Flow head, HYPERION, in an end-to-end architecture that performs likelihood-free inference on multichannel strain data. The Transformer captures long-range temporal correlations across detectors and extracts the relevant information from the strain time series, while the Normalizing Flow provides a representation of the full posterior distribution. Training is supported by a dedicated dataset generator, developed as part of this thesis, which simulates gravitational-wave signals and detector noise dynamically during learning. This design removes the need for precomputed datasets, allows the continuous creation of new examples, and prevents overfitting by exposing the network to an ever-changing distribution of inputs.
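The idea of generating training data during learning can be illustrated with a minimal sketch. It is not the generator developed in the thesis: the function names (`toy_chirp`, `batch_stream`), the linear-chirp waveform, the white Gaussian noise, the parameter ranges, and the choice of adding the same signal to every channel are all assumptions made for illustration only. A realistic generator would use physical waveform models, coloured detector noise, and per-detector antenna responses.

```python
import numpy as np


def toy_chirp(t, t_c, f0, f1, amp):
    """Toy linear chirp that sweeps from f0 to f1 and stops at time t_c.

    Hypothetical stand-in for a waveform model, not a physical template.
    """
    k = (f1 - f0) / t_c                          # frequency sweep rate in Hz/s
    phase = 2.0 * np.pi * (f0 * t + 0.5 * k * t**2)
    envelope = amp * np.clip(t / t_c, 0.0, 1.0)  # amplitude grows until t_c
    return np.where(t < t_c, envelope * np.sin(phase), 0.0)


def batch_stream(batch_size, n_signals=3, n_channels=3,
                 fs=256, duration=4.0, seed=0):
    """Yield an endless stream of (strain, params) batches, simulated on the fly.

    strain has shape (batch_size, n_channels, n_samples);
    params has shape (batch_size, n_signals, 4) holding (t_c, f0, f1, amp).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    while True:
        # Fresh white Gaussian noise for every batch (assumption: unit variance).
        strain = rng.normal(0.0, 1.0, size=(batch_size, n_channels, t.size))
        params = np.empty((batch_size, n_signals, 4))
        for b in range(batch_size):
            for s in range(n_signals):
                t_c = rng.uniform(0.5 * duration, duration)
                f0 = rng.uniform(10.0, 20.0)
                f1 = rng.uniform(40.0, 100.0)    # stays below Nyquist (fs / 2)
                amp = rng.uniform(0.5, 2.0)
                params[b, s] = (t_c, f0, f1, amp)
                # Simplification: identical signal in all channels (broadcast add).
                strain[b] += toy_chirp(t, t_c, f0, f1, amp)
        yield strain, params


# Each call to next() produces examples the network has never seen before.
stream = batch_stream(batch_size=8)
x, theta = next(stream)
print(x.shape, theta.shape)
```

Because every batch is drawn from the random number generator at request time, no dataset is stored on disk and no example is repeated across training steps, which is the property the abstract relies on to limit overfitting.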
Beyond its technical role, the generator constitutes an independent contribution of this work, offering a modular and reusable infrastructure for the simulation of realistic data from different detectors and source populations. The proposed model was applied to the simultaneous inference of three overlapping binary black hole mergers in Einstein Telescope–like data. It successfully recovered the intrinsic parameters of each source, with typical errors on the chirp mass and coalescence time below 10-20% and stable performance across different overlap configurations and signal-to-noise ratios. The complete posterior reconstruction, based on 10^4 samples, requires approximately one second on a Dell PowerEdge R7425 machine equipped with a 64-core AMD EPYC CPU and one NVIDIA A30 GPU. This computational efficiency makes the framework well suited to the large data volumes expected from third-generation detectors. On top of this foundation, the same deep-learning architecture was extended to two complementary tasks. The first was a probabilistic classifier for estimating the number of overlapping signals within each data segment. Using the same KENN-HYPERION structure, the network produced discrete posteriors over possible signal counts, achieving high accuracy in distinguishing between noise, single events, and overlapping binaries, although it was limited to at most two concurrent signals and Gaussian noise. The second extension addressed source localization. Here, the architecture was applied to the inference of sky coordinates, and the task remained challenging despite extensive training and tuning. Localization was achieved only in the simplest configuration, corresponding to training with fixed intrinsic parameters. Controlled tests suggest that performance is affected by model expressivity constraints, but no single dominant cause was isolated.
A dedicated investigation led to the introduction of a new embedding for the Transformer architecture, the channel embedding, which mitigated information loss and improved the overall stability of the model, although full localization remained beyond reach. These analyses clarified both the limitations and the potential of the proposed approach, outlining the path toward future improvements based on more expressive inference models and deeper attention-based architectures. Such developments, however, will require significantly more powerful hardware resources. The work presented in this thesis constitutes one of the first systematic applications of Transformer architectures to gravitational-wave data analysis and the first successful and unbiased attempt at joint parameter estimation of multiple overlapping signals. It demonstrates that attention-based models can learn a physically meaningful representation of detector data and that, combined with Normalizing Flows, they can deliver fast, reliable inference in complex multi-signal scenarios. These findings establish the basis for future likelihood-free, data-driven analysis frameworks that will be essential for operating in the environment of third-generation detectors.

| File | Access | License | Size | Format |
|---|---|---|---|---|
| PhDThesis_Papalini_ETD_2.pdf | Embargoed until 30 March 2029 | Creative Commons | 18.6 MB | Adobe PDF |

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/364136
URN:NBN:IT:UNIPI-364136