From citizen science to AI models: Advancing cetacean vocalization automatic detection through multi-annotator campaigns

Continuous underwater Passive Acoustic Monitoring (PAM) has emerged as a powerful tool for cetacean research. Handling the vast volume of collected data requires automated detection and classification methods. Recent deep learning approaches, which involve model training and testing, require a large amount of labeled data. These labels come from the manual annotation of audio files, which often relies on human experts. Based on an annotation campaign on blue whale calls in the Indian Ocean involving 19 novice annotators and one expert in bioacoustics, this study explores the integration of novice annotators into marine bioacoustics research through citizen science programs, which could drastically increase the size of labeled datasets and enhance the performance of detection and classification models. The analysis reveals distinctive annotation profiles, ranging from conservative to permissive, that are influenced by the complexity of the vocalizations and by the annotators' strategies. To address the challenge of annotation discrepancies, Convolutional Neural Networks (CNNs) are trained on annotations from both the novices and the expert. The results show that model performance varies depending on whose annotations are used for training. Our work highlights the importance of annotation guidelines that encourage a more conservative approach to improve overall annotation quality. To exploit the potential of multi-annotation and mitigate noisy labels, two annotation aggregation methods (majority voting and soft labeling) are proposed and tested. The results demonstrate that both methods, particularly when a sufficient number of annotators is involved, significantly improve model performance and reduce variability: with 13 aggregated annotators, the standard deviations of the areas under the precision-recall (PR) and receiver operating characteristic (ROC) curves fall below 0.02 for both vocalizations, whereas with all annotators taken separately they were 0.17 and 0.21 for the Blue Whale Dcalls and 0.05 and 0.04 for the SEIO PBW vocalizations.
Moreover, these aggregation methods enable models trained on non-expert annotations to match the performance of models trained on expert annotations. These findings suggest that crowdsourced annotations from novice annotators can be a viable alternative to expert annotations.
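
The two aggregation methods named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how segments are defined, how ties are handled, or which threshold is used, so the following assumes binary per-segment labels (1 = call present), a strict majority threshold of 0.5, and hypothetical function names and toy data chosen only for illustration. It is not the authors' implementation.

```python
from typing import List, Sequence


def soft_labels(annotations: Sequence[Sequence[int]]) -> List[float]:
    """Return, for each segment, the fraction of annotators who marked it positive.

    `annotations` holds one row per annotator and one binary value per segment
    (an assumed layout, not taken from the paper).
    """
    n_annotators = len(annotations)
    if n_annotators == 0:
        raise ValueError("at least one annotator is required")
    n_segments = len(annotations[0])
    if any(len(row) != n_segments for row in annotations):
        raise ValueError("all annotators must label the same segments")
    return [
        sum(row[j] for row in annotations) / n_annotators
        for j in range(n_segments)
    ]


def majority_vote(
    annotations: Sequence[Sequence[int]], threshold: float = 0.5
) -> List[int]:
    """Return 1 for segments marked positive by more than `threshold` of annotators.

    The strict inequality means an exact tie yields 0 (an assumption).
    """
    return [1 if p > threshold else 0 for p in soft_labels(annotations)]


# Hypothetical toy data: 3 annotators x 4 audio segments.
votes = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]

print(soft_labels(votes))    # per-segment agreement, usable as a soft training target
print(majority_vote(votes))  # → [1, 0, 0, 0]
```

Majority voting produces hard 0/1 targets, while soft labeling keeps the level of agreement as a probability-like target, so a segment marked by one of three annotators still contributes a weak positive signal during training.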

Keywords

artificial neural network; bioacoustics; biomonitoring; cetacean; detection method; ecological modeling; underwater environment; vocalization; whale

Domains

Engineering Sciences [physics]

Main file

1-s2.0-S1574954124001845-main.pdf (4)

Origin: Publisher files allowed on an open archive

Contributor: Gabriel Dubus

https://ensta-bretagne.hal.science/hal-04614603

Submitted on: Wednesday, November 20, 2024, 15:33:39

Last modified on: Wednesday, November 27, 2024, 17:34:42

Dates and versions

hal-04614603, version 1 (20-11-2024)

Identifiers

HAL Id: hal-04614603, version 1
DOI: 10.1016/j.ecoinf.2024.102642

Cite

Gabriel Dubus, Dorian Cazau, Maëlle Torterotot, Anatole Gros-Martial, Paul Nguyen Hong Duc, et al. From citizen science to AI models: Advancing cetacean vocalization automatic detection through multi-annotator campaigns. Ecological Informatics, 2024, 81, pp. 102642. ⟨10.1016/j.ecoinf.2024.102642⟩. ⟨hal-04614603⟩

Collections

UNIV-BREST ENSTA-BRETAGNE CNRS ENSTA-BRETAGNE-STIC LAB-STICC_UBO IJLRDA ENIB LAB-STICC UNIV-ROCHELLE SORBONNE-UNIVERSITE SU-SCIENCES INRAE LAB-STICC_M3 LAB-STICC_OSE LAB-STICC_IAOCEAN LAB-STICC_SHARP RESEAU-EAU INEE-CNRS
