The paper addresses the evaluation of dataset quality for AI-based systems by examining a specific case: land-water segmentation of satellite images. The quality of the samples of a public dataset is assessed automatically with a new method, based on the best match with labels computed from the normalized difference water index (NDWI). The assessment is then used to identify samples with labels of doubtful value and to extract a higher-quality subset. The quality of the subset is validated by training a neural model for satellite image analysis and by also testing it on a completely independent set of satellite images. Besides providing a concrete new method to check and improve datasets for water-land segmentation of satellite images, the study demonstrates the general importance of evaluating dataset quality.
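As a rough illustration of the idea of scoring a sample by its best match with NDWI-derived labels, the sketch below computes the standard NDWI, (Green - NIR) / (Green + NIR), thresholds it at several candidate values, and keeps the highest intersection-over-union (IoU) with the dataset label. The abstract does not specify the matching criterion, so the use of IoU, the threshold range, and the function names `ndwi`, `iou` and `best_match_score` are assumptions made for this example and are not the authors' implementation.

```python
import numpy as np


def ndwi(green, nir, eps=1e-12):
    """Normalized difference water index: (Green - NIR) / (Green + NIR)."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    # eps avoids division by zero on pixels where both bands are zero
    return (green - nir) / (green + nir + eps)


def iou(mask_a, mask_b):
    """Intersection over union of two boolean water masks."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0  # both masks are empty, so they agree completely
    return float(np.logical_and(mask_a, mask_b).sum() / union)


def best_match_score(green, nir, label, thresholds=None):
    """Best IoU between the label and NDWI masks over candidate thresholds.

    Assumption: "best match" is taken here as the maximum IoU over a grid
    of NDWI thresholds; the paper may use a different criterion.
    Returns (best_score, best_threshold).
    """
    if thresholds is None:
        thresholds = np.linspace(-0.5, 0.5, 41)  # hypothetical search grid
    index = ndwi(green, nir)
    scores = [iou(index > t, label) for t in thresholds]
    best = int(np.argmax(scores))
    return scores[best], float(thresholds[best])


# Toy 4x4 scene: left half water (green > NIR), right half land (NIR > green)
green = np.array([[0.3, 0.3, 0.1, 0.1]] * 4)
nir = np.array([[0.1, 0.1, 0.3, 0.3]] * 4)
good_label = np.array([[1, 1, 0, 0]] * 4, dtype=bool)  # consistent with NDWI
bad_label = ~good_label                                # deliberately wrong label

good_score, _ = best_match_score(green, nir, good_label)
bad_score, _ = best_match_score(green, nir, bad_label)
print(good_score, bad_score)
```

Under these assumptions, a sample whose label agrees with the spectral evidence scores close to 1, while a sample with a doubtful label scores lower and could be flagged or excluded when extracting a higher-quality subset.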
Metrology for AI: Quality Evaluation of the SNOWED Dataset for Satellite Images Segmentation / Scarpetta, Marco; Di Nisio, Attilio; Affuso, Paolo; Spadavecchia, Maurizio; Giaquinto, Nicola. - ELECTRONIC. - (2024), pp. 1005-1009. [10.1109/metroxraine62247.2024.10797199]
Metrology for AI: Quality Evaluation of the SNOWED Dataset for Satellite Images Segmentation
Scarpetta, Marco;Di Nisio, Attilio;Affuso, Paolo;Spadavecchia, Maurizio;Giaquinto, Nicola
2024-01-01