-
Binary wordlists for the CST lemmatizer as suplement to the rules of the lemmatizer. Works with both tagged and untagged input. Use: cstlemma -d NAME-OF-WORDLIST
- HTML
-
The Copenhagen Dependency Treebanks are a set of treebanks for Danish, English, Spanish and Italian. The purpose of the Copenhagen Dependency Treebank project is to create...
- TAG
- ATAG
-
The Digital Corpus of the European Parliament (DCEP) contains the majority of the documents published on the European Parliament's official website. It comprises a variety of...
- XML
- SGML
- TXT
-
The DK-CLARIN JRC-Acquis Parallel Corpus (da, en) is a part of the JRC-Acquis mulilingual parallel corpus, containing documents from The Acquis Communautaire (AC) which is the...
- XML