Estonian Language Data Infrastructure
Source: re3data.org (indexed 2024-05-31)
Download link:
https://www.re3data.org/repository/r3d100011941
Resource description:
Estonian Language Data Infrastructure (Keeleandmete Teadustaristu - KeTa) is a research infrastructure that supports R&D activities that use language data. It provides services for collecting, preserving, making accessible, and reusing Estonian language data both as datasets and through various digital tools. KeTa's mission is to offer a comprehensive infrastructure and services that meet the demands and challenges of the fields of linguistics and language technology. KeTa brings together the resources of Estonian universities, research and development institutions, and other organizations related to linguistics to support and promote the use and development of language technology. KeTa plays a crucial role in aggregating high-quality language datasets for the development and evaluation of large language models.
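The Center of Estonian Language Resources (CELR) aims to build and manage an infrastructure that makes Estonian digital language resources (dictionaries, corpora of both text and speech, and various language databases) and language technology tools (software) available to everyone working with digital language materials. CELR coordinates and organizes the documentation and archiving of these resources, develops language technology standards, and drafts the legal contracts and licenses needed for different types of users (public, academic, commercial, and others). In addition to collecting language resources, CELR will launch a system for introducing the resources to potential users, providing information, and offering training. CELR's main users are researchers at Estonian R&D institutions and social sciences and humanities researchers worldwide who are connected through the CLARIN ERIC network of similar European centers. Data can be accessed at the following sites: public repository https://entu.keeleressursid.ee/public-document, language resources https://keeleressursid.ee/en/resources/corpora, and MetaShare CELR https://metashare.ut.ee/.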
Provider:
Keeleandmete Teadustaristu



