
billingsmoore/phonetic-tibetan-to-english-translation-pairs

Source: Hugging Face. Updated 2024-09-30. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/billingsmoore/phonetic-tibetan-to-english-translation-pairs
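Resource summary:
This dataset contains one CSV file with 10,000,000 rows. Each row is a pair of sentences or phrases: the first column holds a sentence or phrase in Classical Tibetan and the second column holds its English translation. The data comes from Lotsawa House texts and is offered under the same license as the original texts. The dataset was scraped, cleaned, and formatted programmatically; because it is difficult to assemble data from the varied structures that different translators use, its quality is not high and it is suitable only for proof-of-concept modeling. The README also mentions a smaller, cleaner version, a model trained on this dataset, and the dataset's availability on Kaggle.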

This dataset consists of a single CSV file with 10,000,000 rows, each of which is a pair of sentences or phrases. The first member of each pair is a sentence or phrase in Classical Tibetan; the second is its English translation. The pairs are drawn from texts sourced from Lotsawa House and are offered under the same license as the original texts. The data was scraped, cleaned, and formatted programmatically. Because translators use a variety of structures that are difficult to assemble into pairs in this way, the data is not of high quality and should only be used for proof-of-concept modeling.
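Since the description warns that the pairs are noisy, a reader may want to screen rows while loading them. The following is a minimal sketch, not the dataset author's pipeline. It assumes only what the description states (two columns, source text first, English second). The function name `iter_pairs`, the `has_header` flag, the presence of a header row, and the placeholder sample rows are all hypothetical.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple


def iter_pairs(handle: TextIO, has_header: bool = True) -> Iterator[Tuple[str, str]]:
    """Yield (source, english) pairs from a two-column CSV, skipping unusable rows."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # discard the first row (assumed header)
    for row in reader:
        if len(row) < 2:
            continue  # malformed row: fewer than two columns
        source, english = row[0].strip(), row[1].strip()
        if source and english:  # drop pairs with an empty side
            yield source, english


# Tiny in-memory stand-in for the real file; these rows are placeholders, not real data.
sample = io.StringIO(
    "source,target\n"
    "SOURCE PHRASE 1,english phrase 1\n"
    "SOURCE PHRASE 2,\n"
    "only one column\n"
    "  SOURCE PHRASE 3 ,  english phrase 3 \n"
)
pairs = list(iter_pairs(sample))
print(pairs)
```

For the real file, the same function could be given an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to a local copy of the CSV. Because the generator reads one row at a time, it does not need to hold all 10,000,000 rows in memory.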
Provided by:
billingsmoore