CDT - The Copenhagen Danish-English Dependency Treebank

The Copenhagen Dependency Treebanks (CDT) are a set of treebanks for Danish, English, Spanish and Italian. The Copenhagen Dependency Treebank project aims to create linguistically annotated text collections (treebanks) based on the dependency-based grammar formalism Discontinuous Grammar (Buch-Kromann 2009). The treebanks can be used to train natural language parsers, syntax-based machine translation systems, and other statistical natural language applications. They use a unified dependency annotation in which each text is analyzed as a single dependency structure spanning all levels of analysis, from morphology to discourse.

License: https://www.gnu.org/licenses/gpl-3.0.html

Reference: Buch-Kromann, M. (2009). Discontinuous Grammar: A Dependency-Based Model of Human Parsing and Language Learning. Saarbrücken: VDM Verlag Dr. Müller. (https://research.cbs.dk/en/publications/discontinuous-grammar-a-dependency-based-model-of-human-parsing-a)

Data and Distribution(s)

Additional info

Landing page: https://github.com/mbkromann/copenhagen-dependency-treebank/wiki/CDT
Metadata last updated: September 9, 2020, 07:29 (UTC)
Metadata created: June 22, 2020, 13:00 (UTC)
Theme: Language and spelling; Education, culture and sport
GUID: https://data.gov.dk/dataset/lang/1d755e32-2686-43ee-9a38-eef87bb63749
Contact email: matthias@buch-kromann.dk
Contact name: Matthias Buch-Kromann
URI: https://data.gov.dk/dataset/lang/1d755e32-2686-43ee-9a38-eef87bb63749
Publication date: 2011-2012
Publisher name: Matthias Buch-Kromann
Type: Tools and technology
Documentation:
related_resource: ["https://universaldependencies.org/treebanks/da_ddt/index.html"]