===== Dataset Information Page =====

Our Dataset Information Page describes the datasets available from the Swiss National Centre of Competence in Research (NCCR) Evolving Language. We offer a wide range of datasets related to the study of language, including audio recordings, transcribed speech, and linguistic annotation. The datasets are available in a variety of formats, such as wav, mp4, toolbox, elan, xml, cha, sqlite3, and RData.

Our goal is to give users the information they need to decide whether a dataset suits their work. The page is designed to be clear, concise, and well structured. For each dataset we provide an overview of its features and the data it contains, so that users can see how it fits into their research program. A search bar lets users quickly find specific datasets, and each dataset has an API that enables programmatic access to the data.


^ [[AUTOTYP|AUTOTYP]] ^


  * AUTOTYP is a large-scale research program with goals in both quantitative and qualitative typology.
  * The data type is a text database. The data formats supported include RData, JSON, csv, yaml, and CLDF.
  

^ [[ACQDIV corpus|ACQDIV corpus]] ^

  * The goal of the ACQDIV project is to identify universal cognitive processes that enable language acquisition despite the substantial cross-linguistic variation found in the world’s languages.
  * The dataset includes video/audio recordings, transcribed speech, and linguistic annotation. The data formats supported include wav, mp4, toolbox, elan, xml, cha, sqlite3, and RData.


^ [[Vocal Communication During Pair Formation|Vocal Communication During Pair Formation]] ^


  * The dataset includes 12- to 14-day-long audio, video, and wireless animal-borne accelerometer recordings of mixed-sex pairs of zebra finches.
  * The data types are audio, accelerometer, and video recordings. The data formats supported include h5 and mp4.
  

^ [[Songbird Vocal Segments|Songbird Vocal Segments]] ^

  * The dataset contains complete day-long sets of audio recordings of single male zebra finches (Taeniopygia guttata) at different developmental stages, recorded in sound-proof isolation chambers. Recording was triggered by vocalizations (or other sounds), so the recordings are unevenly spaced in time, depending on the bird's activity. Each file contains vocalizations with some silence before and after them.
  * The data is available in .wav and .lvd formats (with .txt files for metadata).
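
Because each sound-triggered file pads the vocalizations with silence, a common first step is to locate the sounding portion of a recording. The following is a minimal sketch using only the Python standard library. It assumes 16-bit mono PCM .wav files, which this page does not confirm; the function name ''active_span'', the amplitude threshold, and the frame length are hypothetical choices for illustration.

```python
import array
import sys
import wave


def active_span(path, threshold=500, frame_len=256):
    """Return (start_s, end_s) of the region whose peak amplitude exceeds
    `threshold`, or None if the whole file stays below it.

    Assumes 16-bit mono PCM audio; both defaults are illustrative guesses,
    not values taken from the dataset documentation.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        rate = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    # Start offsets of the fixed-length frames that contain a loud sample.
    loud = [
        i
        for i in range(0, len(samples), frame_len)
        if max(abs(s) for s in samples[i:i + frame_len]) > threshold
    ]
    if not loud:
        return None

    start = loud[0]
    end = min(loud[-1] + frame_len, len(samples))
    return start / rate, end / rate
```

Calling ''active_span("recording.wav")'' (a placeholder file name) returns the start and end of the sounding region in seconds, at a resolution of one frame. The sketch does not read .lvd files, and a fixed threshold will likely need tuning per recording setup.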