====== Extracting extended vocal units from two neighborhoods in the embedding plane ======

  * **ID:** 20231012155259780-1257
  * **Researcher:** Corinna Lorenz, Xinyu Hao, Tomas Tomka, Linus Rüttimann, Richard H. R. Hahnloser
  * **WP:** Other
  * **PI:** null
  * **Abstract:** Annotating and proofreading data sets of complex natural behaviors are tedious tasks because instances of a given behavior need to be correctly segmented from background noise and must be classified with a minimal false-positive error rate. Low-dimensional embeddings have proven very useful for this task because they provide a visually appealing overview of a data set in which relevant clusters appear spontaneously. However, low-dimensional embeddings introduce errors because they fail to preserve high-dimensional distances. Moreover, embeddings represent only objects of fixed dimensionality, which conflicts with natural objects such as vocalizations, whose dimensions vary with their durations. To mitigate these issues, we introduce a semi-supervised method for simultaneous segmentation and clustering of vocalizations. We define vocal units of a given type in terms of two density-based regions in low-dimensional embedding space, one associated with onsets and the other with offsets. We demonstrate our approach on the task of clustering adult zebra finch vocalizations embedded into the 2D plane with UMAP. We show that two-neighborhood (2N) extraction allows the identification of short and long vocal renditions from continuous data streams without initially committing to a particular segmentation of the data. Also, 2N vocal extraction achieves a much lower false-positive error rate than approaches based on a single defining region.
  * **Publication DOI:** [[https://doi.org/10.1101/2022.09.26.509501|https://doi.org/10.1101/2022.09.26.509501]]
  * **Publication Link:** [[https://www.biorxiv.org/content/10.1101/2022.09.26.509501v1|https://www.biorxiv.org/content/10.1101/2022.09.26.509501v1]]
  * **Data Type:** null
  * **Data Format:** null
  * **Git:** None
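
As a rough illustration of the two-neighborhood idea described in the abstract, the following Python sketch pairs each frame that falls in an onset region with the next frame that falls in an offset region. It is not the authors' implementation: the disk-shaped regions, the function names ''in_disk'' and ''extract_units'', the ''max_len'' limit, and the toy coordinates are all assumptions made for illustration, whereas the abstract speaks of density-based regions in a UMAP embedding.

```python
from math import hypot


def in_disk(point, center, radius):
    """Return True if a 2D point lies inside a disk-shaped region."""
    return hypot(point[0] - center[0], point[1] - center[1]) <= radius


def extract_units(embedding, onset_region, offset_region, max_len):
    """Pair onset-region frames with the next offset-region frame.

    embedding     -- list of (x, y) points, one per time frame, in temporal order
    onset_region  -- (center, radius) disk; hypothetical stand-in for a density-based region
    offset_region -- (center, radius) disk; hypothetical stand-in for a density-based region
    max_len       -- longest allowed unit, in frames (assumed parameter)
    Returns a list of (onset_frame, offset_frame) index pairs.
    """
    units = []
    last_offset = -1
    for t, point in enumerate(embedding):
        # Skip frames already covered by the previous unit or outside the onset region.
        if t <= last_offset or not in_disk(point, *onset_region):
            continue
        # Search forward for the first frame inside the offset region.
        last_candidate = min(t + max_len, len(embedding) - 1)
        for u in range(t + 1, last_candidate + 1):
            if in_disk(embedding[u], *offset_region):
                units.append((t, u))
                last_offset = u
                break
    return units


# Toy 2D trajectory: one short rendition (frames 1-3) and one long one (frames 5-9).
embedding = [(5, 5), (0, 0), (2, 2), (9, 9), (5, 5), (0.1, 0),
             (3, 3), (4, 4), (5, 5), (9, 9.1), (5, 5)]
onset_region = ((0, 0), 0.5)
offset_region = ((9, 9), 0.5)

print(extract_units(embedding, onset_region, offset_region, max_len=3))  # [(1, 3)]
print(extract_units(embedding, onset_region, offset_region, max_len=4))  # [(1, 3), (5, 9)]
```

With a larger ''max_len'', the same two regions also capture the longer rendition, which mirrors the abstract's point that short and long units can be found without first committing to a fixed segmentation.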