PESTS

Source: arXiv. Updated: 2023-09-30. Indexed: 2024-06-21.
Download link:
https://github.com/mohammadabdous/PESTS
Description:
PESTS is a Persian-English cross-lingual semantic textual similarity dataset created by a research team at Iran University of Science and Technology. It contains 5,375 Persian-English sentence pairs whose semantic similarity was annotated collaboratively by experts. The team first built a monolingual Persian semantic similarity dataset, which experts fluent in both languages then translated and annotated. The dataset is intended for applications such as machine translation, information retrieval, and question answering, and is particularly suited to tasks that require judging semantic similarity between Persian and English.
Provider:
Iran University of Science and Technology
Created:
2023-05-13