Soft Sensors and Machine Learning for Automatic Water Quality Assessment
CARDIA, MARCO
2025
Abstract
Water quality monitoring is critical to environmental protection, industrial safety, and public health. Traditional laboratory-based methods, though accurate, are often slow, expensive, and labor-intensive, making them unsuitable for real-time decision-making in rapidly changing environments. This thesis presents a novel methodology for water quality assessment that combines Ultraviolet-Visible (UV-Vis) spectroscopy with machine learning to develop soft sensing systems for real-time monitoring. The research focuses on two main applications: industrial wastewater from the highly polluting leather industry, and drinking water, where ensuring safety and regulatory compliance is fundamental. In the industrial context, the aim is to predict key water quality indicators such as Chemical Oxygen Demand (COD), Total Suspended Solids (TSS), and chlorides in real time; in the drinking water context, the focus is on parameters such as Total Organic Carbon (TOC), volatile organic compounds, metals, anions, cations, and microbiological parameters. Key innovations of this research include robust preprocessing techniques to enhance data integrity and optimize model performance. Feature extraction methods are also developed, incorporating statistical measures, peak-based features, slope-based features, and Area Under the Curve (AUC) calculations to capture meaningful spectral information. The core of this work is the development of soft sensors that integrate UV-Vis spectroscopic data with machine learning models. A significant challenge addressed by this research is the limited availability of high-quality training data, particularly in highly polluted industrial environments. This issue is tackled using Conditional Generative Adversarial Networks (CGANs) for data augmentation.
The results show significant improvements in predictive performance when synthetic data are used, demonstrating the potential of CGANs to supplement real datasets effectively. Furthermore, the research develops time series prediction models using methods such as one-dimensional Convolutional Neural Networks (1D-CNNs) and Echo State Networks (ESNs) to forecast water quality indicators, enabling proactive monitoring. To promote transparency and stakeholder adoption, techniques such as random forest feature importance and SHapley Additive exPlanations (SHAP) are employed to improve the interpretability of the machine learning models, providing insight into which spectral features matter most for predicting specific water quality parameters. Overall, this research confirms the viability of soft sensing technologies combined with machine learning for automated, real-time water quality monitoring.

File | Access | Size | Format
---|---|---|---
phd_thesis_def_etd.pdf | open access | 5.72 MB | Adobe PDF
report_activities.pdf | not available | 46.17 kB | Adobe PDF
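
As an illustration only, the feature families named in the abstract (statistical measures, peak-based features, slope-based features, and AUC) could be computed from a single UV-Vis spectrum roughly as follows. This is a minimal Python sketch assuming NumPy. The function name `extract_spectral_features`, the feature keys, and the toy four-point spectrum are hypothetical and are not taken from the thesis, whose actual feature definitions may differ.

```python
import numpy as np


def extract_spectral_features(wavelengths, absorbance):
    """Compute simple summary features from one spectrum (illustrative only).

    wavelengths: 1-D array of wavelengths in nm, strictly increasing.
    absorbance:  1-D array of absorbance values, same length.
    """
    x = np.asarray(wavelengths, dtype=float)
    y = np.asarray(absorbance, dtype=float)

    # First derivative of absorbance with respect to wavelength.
    slope = np.gradient(y, x)

    # Area under the curve by the trapezoidal rule, written out explicitly.
    auc = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

    peak_idx = int(np.argmax(y))
    return {
        # Statistical measures
        "mean": float(np.mean(y)),
        "std": float(np.std(y)),
        # Peak-based features
        "peak_absorbance": float(y[peak_idx]),
        "peak_wavelength": float(x[peak_idx]),
        # Slope-based features
        "max_slope": float(np.max(slope)),
        "min_slope": float(np.min(slope)),
        # Area under the curve
        "auc": auc,
    }


# Toy spectrum (hypothetical values, not measured data).
toy_wavelengths = np.array([200.0, 250.0, 300.0, 350.0])
toy_absorbance = np.array([0.0, 1.0, 0.5, 0.0])
features = extract_spectral_features(toy_wavelengths, toy_absorbance)
print(features)
```

A feature dictionary of this kind could then serve as one input row for a regression model that predicts an indicator such as COD or TOC.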
Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/20.500.14242/216557
URN:NBN:IT:UNIPI-216557