Røst-315M

Røst-315M is a speech recognition model based on the CoRal dataset and produced by the CoRal project. The project aims to produce comprehensive automatic speech recognition (ASR) datasets that capture the diversity of the Danish language across dialects, accents, genders, and age groups. The primary goal of the CoRal dataset is to provide a robust resource for training and evaluating ASR models that can understand and transcribe spoken Danish in all its variations.

This model is intended to be used for Danish automatic speech recognition.
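As a rough illustration of that intended use, the sketch below wraps the model in a Hugging Face `transformers` speech recognition pipeline. The model identifier is taken from the landing page URL listed in the metadata; the helper names `build_transcriber` and `transcribe` are hypothetical and exist only for this example. It assumes `transformers` and a backend such as PyTorch are installed, and that audio file paths can be decoded locally (this typically requires ffmpeg).

```python
from typing import Any, Callable, Dict

# Model identifier taken from the landing page URL in the metadata.
MODEL_ID = "alexandrainst/roest-315m"


def build_transcriber(model_id: str = MODEL_ID) -> Callable[[Any], Dict[str, Any]]:
    """Create an ASR pipeline for the model (hypothetical helper).

    The model weights are downloaded the first time this is called.
    """
    # Imported lazily so that transcribe() can be used with any callable,
    # even where transformers is not installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio: Any, transcriber: Callable[[Any], Dict[str, Any]]) -> str:
    """Run the transcriber on an audio input and return the transcribed text.

    `audio` is whatever the transcriber accepts, for example a file path.
    """
    result = transcriber(audio)
    # ASR pipelines return a dict whose "text" entry holds the transcription.
    return result["text"].strip()
```

A typical call would be `transcribe("recording.wav", build_transcriber())`, where `recording.wav` is a placeholder for a Danish speech recording. Keeping the transcriber as a parameter means the pipeline is built once and reused across files.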

Note that biometric identification is not allowed using the CoRal dataset or models derived from it.

The dataset is licensed under a custom license adapted from OpenRAIL-M, which allows commercial use with a few restrictions (on speech synthesis and biometric identification). See the license.

A research paper will be submitted soon, but until then, if you use the CoRal dataset in your research or development, please cite it as follows:

@dataset{coral2024,
  author = {Dan Saattrup Nielsen, Sif Bernstorff Lehmann, Simon Leminen Madsen, Anders Jess Pedersen, Anna Katrine van Zee and Torben Blach},
  title = {CoRal: A Diverse Danish ASR Dataset Covering Dialects, Accents, Genders, and Age Groups},
  year = {2024},
  url = {https://hf.co/datasets/alexandrainst/coral},
}

Data and Distribution(s)

Additional information

Field                   Value
Landing page            https://huggingface.co/alexandrainst/roest-315m
Author                  Dan Saattrup Nielsen
Maintained by           Dan Saattrup Nielsen
Metadata last updated   October 17, 2024, 06:20 (UTC)
Metadata created        October 16, 2024, 11:52 (UTC)
Contact email           dan.nielsen@alexandra.dk
Contact name            Dan Saattrup Nielsen
Updated                 15-10-2024
URI                     https://data.gov.dk/dataset/lang/c3cd6a9c-eba5-4c9b-9078-f8583c34b9a9
Release date            14-10-2024
Type                    Tools and technology
Documentation