3 ressourcer fundet

Tags: NB

Filtrér resultater
  • The Stortinget Speech Corpus (SSC) is a 5000+ hours speech dataset for weak supervision ASR created from audio and aligned proceedings text from Stortinget, the Norwegian...
    • JSONL
  • "The Norwegian Colossal Corpus (NCC) is a collection of multiple smaller Norwegian corpuses suitable for training large language models. We have done extensive cleaning on the...
    • JSON
  • "NB-BERT-base is a general BERT-base model built on the large digital collection at the National Library of Norway. This model is based on the same structure as BERT Cased...
    • BIN