The Norwegian Colossal Corpus
URL: https://huggingface.co/datasets/NbAiLab/NCC
Dataset description:
"The Norwegian Colossal Corpus (NCC) is a collection of multiple smaller Norwegian corpuses suitable for training large language models. We have done extensive cleaning on the datasets,...
Landing page: The Norwegian Colossal Corpus
Additional information
Field | Value |
---|---|
Data last updated | unknown |
Metadata last updated | 20 September 2022 |
Metadata created | unknown |
Format | JSON |
License | Other (Open) |
Metadata created | 2 years ago |
Has views | False |
Id | e3f80172-942d-4d6a-a511-9faf19fb6ca8 |
Package id | 5955e86d-5e0d-46fd-974e-d598b33b82a7 |
Position | 0 |
State | active |