English-Persian parallel corpus

Source: catalogue.elra.info | Updated: 2017-07-03 | Indexed: 2025-03-22
Download link:
https://catalogue.elra.info/en-us/repository/browse/ELRA-W0118/
Resource description:
The English-Persian parallel corpus contains more than 200,000 aligned sentences drawn from a variety of text types in the domains of art, law, culture, science, religion, literature, medicine, idioms, politics and others. It extends the English-Persian parallel corpus already distributed by ELRA (Catalogue Reference: ELRA-W0051). This new version is distributed with a concordance program that lets users search the concordance for a word, a phrase (in continuous or discontinuous form) or simply a chunk, and that reports separately the number of records (here, sentences) and the number of items found. The corpus is available in XML and in Access format.
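
To make the distinction between records and items concrete, here is a minimal Python sketch of a concordance-style search over aligned sentence pairs. It is not the concordance program distributed with the corpus, and it does not assume anything about the corpus's XML or Access layout: the sample pairs are made up for illustration, and the names `tokenize`, `count_continuous`, `count_discontinuous` and `concordance` are hypothetical. A "discontinuous" match is interpreted here as the query words appearing in order with other words allowed in between, which is one plausible reading of the description.

```python
import re
from typing import Callable, List, Tuple

# Made-up sample pairs for illustration only; NOT taken from the corpus.
PAIRS: List[Tuple[str, str]] = [
    ("The law protects art and the law protects science.",
     "قانون از هنر حمایت می‌کند و قانون از علم حمایت می‌کند."),
    ("Science and art shape culture.", "علم و هنر فرهنگ را شکل می‌دهند."),
    ("Medicine is a science.", "پزشکی یک علم است."),
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def count_continuous(tokens: List[str], query: List[str]) -> int:
    """Count non-overlapping occurrences of query as adjacent tokens."""
    n = len(query)
    if n == 0:
        return 0
    count, i = 0, 0
    while i + n <= len(tokens):
        if tokens[i:i + n] == query:
            count += 1
            i += n  # skip past this match so matches do not overlap
        else:
            i += 1
    return count


def count_discontinuous(tokens: List[str], query: List[str]) -> int:
    """Count greedy in-order matches, allowing other words in between."""
    if not query:
        return 0
    count, pos = 0, 0
    for tok in tokens:
        if tok == query[pos]:
            pos += 1
            if pos == len(query):  # whole query seen in order
                count += 1
                pos = 0
    return count


def concordance(
    pairs: List[Tuple[str, str]], phrase: str, continuous: bool = True
) -> Tuple[int, int, List[Tuple[str, str]]]:
    """Search the English side; return (records, items, matching pairs)."""
    query = tokenize(phrase)
    counter: Callable[[List[str], List[str]], int] = (
        count_continuous if continuous else count_discontinuous
    )
    hits: List[Tuple[str, str]] = []
    items = 0
    for english, persian in pairs:
        found = counter(tokenize(english), query)
        if found:
            hits.append((english, persian))
            items += found
    return len(hits), items, hits


if __name__ == "__main__":
    records, items, _ = concordance(PAIRS, "the law")
    print(records, items)
```

In this sketch, a sentence that contains the query twice contributes one record and two items, which is why the two counts are reported separately. Searching the Persian side would need a tokenizer that handles the zero-width non-joiner, which the simple `\w+` pattern above does not attempt.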

Provider:
ELRA Catalogue of Language Resources