Guidelines for Preparing Dataset Description

Introduction

Documentation is an essential component of the data submission process and provides a comprehensive overview of the datasets that have been collected, curated, and analyzed. It should describe the data collection methods in detail, including the tools and equipment used, the geographical locations where the data was collected, and the quality control procedures that were applied. It should also describe the data processing and analysis techniques, including any statistical methods or machine learning algorithms that were employed. In addition, it should cover the data management procedures, including metadata documentation, data sharing and access policies, and data archiving and preservation. Clear and comprehensive documentation ensures that other researchers can understand and reuse the datasets in the future. This document provides guidelines for preparing documentation that meets the data submission requirements and makes the data useful to other researchers.

Data Description

This section should provide a detailed description of the data being shared, including its type, format, and structure. A thorough data description is essential for others to understand and properly use the data. First, describe the type of data being shared, such as audio recordings or metadata. Then, describe the format of the data, including any software or tools required to process or analyze it. Also give a clear and concise description of the data structure, including any naming conventions or standards used to organize the data. We recommend including a data dictionary or codebook that describes the variables and values in the data, along with any relevant units of measurement or codes. Any preprocessing steps that have been performed on the data, such as filters or other transformations, should be clearly documented. To enhance the data description, we suggest adding a table that lists the number of files, annotated files, annotated calls, and other relevant statistics.
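As an illustration of the suggested summary table, the following minimal Python sketch counts files, annotated files, and annotated calls. The file names, the call labels, and the layout of `annotations` as (file name, label) pairs are assumptions made for this example, not a required format.

```python
from collections import Counter

def summarize_annotations(all_files, annotations):
    """Return summary counts for a dataset description table.

    all_files:   recording file names (hypothetical example layout)
    annotations: (file_name, call_label) pairs, one pair per annotated call
    """
    # Count how many annotated calls each file contains.
    calls_per_file = Counter(file_name for file_name, _ in annotations)
    return {
        "files": len(set(all_files)),
        "annotated_files": len(calls_per_file),
        "annotated_calls": sum(calls_per_file.values()),
    }

def format_table(summary):
    """Render the summary as a plain-text two-column table."""
    width = max(len(key) for key in summary)
    return "\n".join(f"{key.ljust(width)}  {value}" for key, value in summary.items())

# Hypothetical example data.
files = ["rec_001.wav", "rec_002.wav", "rec_003.wav"]
annotations = [
    ("rec_001.wav", "song"),
    ("rec_001.wav", "alarm"),
    ("rec_003.wav", "song"),
]
summary = summarize_annotations(files, annotations)
print(format_table(summary))
```

Generating the table from the annotation records themselves, rather than typing the numbers by hand, helps keep the reported statistics consistent with the files that are actually shared.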

To ensure that the dataset is properly understood and used, the data description section should include the following key information:

- Type of data (e.g. audio recordings, metadata)

- Format of the data (e.g. file format, encoding)

- Sampling rate

- Number of channels

- Bit depth

- Length of recordings

- Naming conventions or standards used to organize the data

- Data dictionary or codebook that describes the variables and values in the data

- Any data transformations or preprocessing steps that have been performed
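For audio recordings, several of the properties listed above can be read directly from the files. The sketch below uses Python's standard `wave` module and assumes the recordings are uncompressed PCM WAV files; other formats would need a different reader. The function name `describe_wav` and the dictionary keys are illustrative choices.

```python
import wave

def describe_wav(path):
    """Read basic audio properties from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        n_frames = wav.getnframes()
        sample_rate = wav.getframerate()
        return {
            "sampling_rate_hz": sample_rate,
            "channels": wav.getnchannels(),
            # getsampwidth() returns bytes per sample; convert to bits.
            "bit_depth": wav.getsampwidth() * 8,
            "duration_s": n_frames / sample_rate,
        }
```

Calling `describe_wav` on every file in the dataset and reporting the distinct values (or the range, for recording length) is one way to fill in the sampling rate, channel, bit depth, and length entries without relying on memory or equipment settings.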

It may also be helpful to include relevant information about the collection and annotation process, as well as any quality control measures that were implemented. The data submitter should ensure that all relevant information is clearly documented in the data description section, and that any specific requirements or recommendations for using the dataset are clearly communicated to potential users.

Data Processing

In this section, you should describe the methods and procedures used to process the raw data into a usable form. Provide a detailed description of the data processing steps, including any cleaning, filtering, normalization, or other transformations that were performed.

Here are some guidelines to follow when writing the Data Processing section:

1. Describe the data processing steps: Provide a clear and concise description of the data processing steps that were performed. This should include details on any software, algorithms, or other tools used in the processing.

2. Explain any data cleaning or filtering steps: Describe any steps taken to clean the data, including removing noise or other artifacts, removing duplicate entries, or dealing with missing data.

3. Provide details on any normalization or transformation steps: If any normalization or transformation steps were performed on the data, provide a detailed explanation of the methods used.

4. Include any relevant code or scripts: If possible, include any code or scripts used in the data processing, so that others can reproduce the same results.

5. Be transparent: Be transparent about the data processing steps, and provide enough detail so that others can understand and replicate the process.
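To illustrate guidelines 3 to 5, here is a minimal Python sketch of a processing script that applies named steps in order and records which steps were run. The two steps shown (DC offset removal and peak normalization) are common examples chosen for illustration, not steps this guideline requires, and all function names are hypothetical.

```python
def remove_dc_offset(samples):
    """Subtract the mean so the signal is centred on zero (assumes non-empty input)."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent input: nothing to scale
    return [s * target_peak / peak for s in samples]

def run_pipeline(samples, steps):
    """Apply each (name, function) step in order and log the step names."""
    log = []
    for name, func in steps:
        samples = func(samples)
        log.append(name)
    return samples, log

# Hypothetical example: four samples processed by two documented steps.
steps = [
    ("remove_dc_offset", remove_dc_offset),
    ("peak_normalize", peak_normalize),
]
processed, log = run_pipeline([0.0, 0.5, 1.0, 0.5], steps)
print(processed)
print(log)
```

Sharing a script like this alongside the written description lets others check that the documented steps and their order match what was actually applied to the data.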

The Data Processing section should provide enough information for others to understand how the raw data was transformed into a usable form. By providing this information, you can ensure that others can replicate your work and use the data in their own research.

Results and Findings

This section should describe the results of the data analysis and any relevant findings. The results and findings should be presented clearly and concisely, with appropriate supporting visual aids, such as tables, graphs, and figures. The section should include a description of the statistical methods used to analyze the data and should present the results in a way that is easy to understand for the target audience. Any limitations or uncertainties associated with the results should be noted, along with any potential implications or applications of the findings. The section should also include a discussion of how the findings relate to the research questions or objectives, and how they contribute to the overall body of knowledge in the field.

If the results of the data analysis have already been published, the submitter can optionally include a summary of, or a link to, the publication in this section. They can briefly describe the main findings, their significance, and how they relate to the dataset being shared. Additionally, the submitter can discuss the potential of the dataset for further research or applications, including any possible collaborations or partnerships. This can help to highlight the value of the dataset and encourage others to use it in their own work.

Overall, this section gives future users ideas about how the dataset can best be used to produce useful results.

Discussion and Conclusion

In this section, you may provide an interpretation of the results and findings, placing them in the context of the animal communication research field. You may also highlight any areas for future research or potential implications for the field. If your dataset is part of a larger research project, you may discuss the overarching research questions and how this dataset fits into the larger picture.

It is important to note that this section may be optional, as it may not be applicable to all datasets. If your dataset is purely descriptive or exploratory in nature, or if the results have already been published in a separate paper, it may not be necessary to include a discussion and conclusion section. However, if you do include this section, it should be brief and to the point, highlighting the main findings and their significance in a clear and concise manner.

References

Include a list of all sources that have been cited in the text. All references should be listed in alphabetical order by the last name of the first author. The full citation should be provided, including the title of the article or book, the journal or book title, and the publication date, as well as the names of all authors and editors. If a digital object identifier (DOI) is available, it should be included in the citation. Check the formatting guidelines of the target journal or publication to ensure that the reference style is correct.

Appendices

This section should include any supplementary materials that support the main text of the write-up, but are not critical to the narrative flow. Examples of materials that may be included in the appendices are:

A. Examples of data files and tables used in the study. This can be useful for readers who want to explore the data in more detail.

B. Detailed descriptions of the methods and analysis techniques employed. This can provide readers with a more in-depth understanding of the methods used to generate the results.

C. Other relevant supplementary materials. This can include any additional materials that support the main text of the write-up, such as code or software used in the study.

D. An example of documentation can be found here.

All appendices should be clearly labeled and referenced in the main text of the write-up. It is important to note that appendices should not be used as a dumping ground for information that is critical to the narrative flow of the main text. Any information that is necessary for readers to fully understand the study should be included in the main text of the write-up.
