ameykaran/english-text-corpus
Source: Hugging Face | Updated: 2025-09-16 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/ameykaran/english-text-corpus
Description:
This is an English text corpus built by collecting and cleaning data from the IndicCorpV2 English corpus. The dataset is split into training, validation, and test sets, each stored in its own JSON Lines file.
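
Since the splits are described as JSON Lines files, a minimal sketch of reading one such file with the Python standard library might look like the following. The helper name `read_jsonl` and the file name `train.jsonl` mentioned afterwards are assumptions for illustration; the listing does not give the actual file names or the fields inside each record.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file.

    Hypothetical helper: the record schema of this dataset is not stated
    in the listing, so each line is returned as a plain dict.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Assuming a split has been downloaded locally as `train.jsonl` (a hypothetical name), `for record in read_jsonl(Path("train.jsonl")): ...` would iterate over its records one at a time without loading the whole file into memory.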
Provider:
ameykaran



