Terms and conditions

General conditions

Contributors of language resources

The purpose of the portal sprogteknologi.dk is to provide easy and centralized access to metadata about Danish language resources in order to support the development of language technology and artificial intelligence in Danish, thus helping to ensure that the digital language in Denmark is Danish. The portal also offers providers the opportunity to display their language resources, to have their organization name and logo posted on the portal, and to have their resource metadata harvested by Danish Common Public Dataportal and The European Dataportal.

The portal encourages contributors of language resources to:

  • confirm the metadata description that the portal contains about your language resources
  • confirm or inform the secretariat if there are changes in the metadata description about the language resources that you have provided on the platform
  • confirm that the portal only displays metadata descriptions of those of your language resources that are not subject to payment
  • confirm or indicate under which license your language resources may be used

What kind of access rights does the user have?

The portal sprogteknologi.dk contains metadata about resources and datasets that are either public or partly public. Partly public resources and datasets are available only under certain conditions, for example for academic research or for a specific use under special conditions. The portal does not display metadata about language resources that require payment of a fee for access.

Licenses

It is the responsibility of the resource contributor to define under which conditions the data may be used. Hence, the contributor must indicate under which license the language resource may be used, so that the resource can be assigned a reference to a license document. It is recommended that the language resource is assigned a license that is as open as possible (e.g. Creative Commons).

Ethical usage of language resources

The development of language technology solutions offers great possibilities for innovation. However, as with all uses of data, a range of ethical considerations follow, regarding for instance biases in data and risks of misuse of data.

We urge both data providers and data users to actively work to avoid biases by making sure that every gender, age group, population group and so on is represented in the data, and to state clearly when biases are unavoidable. At all times, users of sprogteknologi.dk are expected to use data in an ethically responsible way that can in no way mislead or cause harm. Moreover, it should always be clear when users are interacting with a machine and not a human being.
More information about principles and tools for supporting data ethics can be found, for instance, at Dataetisk Råd (Dataetisk Råd | Nationalt Center for Etik).

FAIR principles 

Metadata descriptions on the platform must strive to live up to the FAIR principles. Registering and displaying metadata on the portal increases findability, and the metadata descriptions on the platform are detailed and readable by both humans and machines.

The principles emphasize machine-actionability (i.e., the capacity of computational systems to find, access, interoperate and reuse data with no or minimal human intervention) because humans increasingly rely on computational support to deal with data as a result of the increase in volume, complexity and creation speed of data.

FAIR principles:

  • Findable
  • Accessible
  • Interoperable
  • Reusable

Learn more at 'The GO FAIR Initiative'.

Metadata descriptions

Metadata descriptions are either added manually to the platform or harvested into it by The Danish Agency for Digitisation. The information is based on metadata descriptions of datasets or language resources made available by the different resource providers.

The metadata descriptions of language resources displayed by sprogteknologi.dk are freely available to all users, and may be reused and redistributed under the Creative Commons public domain dedication (CC0, https://creativecommons.org/publicdomain/zero/1.0/).

If metadata is harvested into the portal, we encourage the contributor to inform the secretariat of any changes in the metadata description of the language resources. If the contributor prefers a fixed update cycle instead, this can be arranged with the secretariat.

The ownership of and responsibility for datasets or other language resources remain with the organization that published the specific language resource. The organization concerned remains responsible for the validity and quality of its language resources and carries the full legal responsibility for the language resources it publishes.