Extracting extended vocal units from two neighborhoods in the embedding plane