-
EN-DA bilingual corpus built from PDF documents from the European Medicines Agency (EMEA), https://www.ema.europa.eu (February 2020). Attribution details: This dataset has...
- TMX
-
This bilingual corpus was built from a number of different corpora drawn from selected public and private sources and has been used to train NTEU (Neural Translation for the...
- TMX
-
Contents of the Nordic Co-operation website http://www.norden.org, downloaded and converted into a parallel corpus. This dataset has been created within the framework of the...
- TMX
-
Contents of https://eng.mst.dk/ and https://mst.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created...
- TMX
-
Contents of https://www.dst.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
- TMX
-
Contents of https://natmus.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
- TMX
-
Contents of https://www.odense.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework...
- TMX
-
Contents of https://slks.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of the...
- TMX
-
Contents of https://spillemyndigheden.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the...
- TMX
-
Contents of https://naturstyrelsen.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the...
- TMX
-
Contents of http://www.geus.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
- TMX
-
Contents of the http://rigsrevisionen.dk/ website, downloaded, aligned and converted into a parallel corpus. This dataset has been created within the framework of the European Language...
- TMX
-
Contents of https://www.visitdenmark.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the...
- TMX
-
Contents of http://www.aarhus2017.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the...
- TMX
-
Contents of https://www.visitvejle.com were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the...
- TMX
-
Contents of http://um.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of the...
- TMX
-
Contents of https://uk.fm.dk/ were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
- TMX
-
Contents of https://www.dma.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
- TMX
-
A library of over 60,000 Gutenberg e-books. Read more about licenses and copyright here: https://www.gutenberg.org/wiki/Category:How-To
- HTML
-
Danish monolingual corpus of 3,708,693 sentences, with content scraped from www.retsinformation.dk. The corpus is a snapshot of the content on retsinformation and has not been...
- TXT