Chinese-English Data File
Source: Alibaba Cloud Tianchi. Updated: 2026-06-02. Indexed: 2025-05-10.
Download link:
https://tianchi.aliyun.com/dataset/203546
Resource description:
Chinese-English data file
** Info **
Check for newest version here:
http://www.manythings.org/anki/
Date of this file:
2024-04-01
This data is from the sentences_detailed.csv file from tatoeba.org.
http://tatoeba.org/files/downloads/sentences_detailed.csv
** Terms of Use **
See the terms of use.
These files have been released under the same license as the
source.
http://tatoeba.org/eng/terms_of_use
http://creativecommons.org/licenses/by/2.0
Attribution: www.manythings.org/anki and tatoeba.org
** Warnings **
The data from the Tatoeba Project contains errors.
To lower the number of errors you are likely to see, only
sentences by native speakers and proofread sentences have
been included.
For the non-English language, I made these (possibly wrong)
assumptions.
Assumption 1: Sentences written by native speakers can be
trusted.
Assumption 2: Contributors to the Tatoeba Project are honest
about what their native language is.
For English, I used the sentences that I have proofread
and thought were OK.
Of course, I may have missed a few errors.
** Downloading Anki **
See http://ankisrs.net/
** Importing into Anki **
Information is at http://ankisrs.net/docs/manual.html#importing
Of particular interest may be the section about "duplicates" at http://ankisrs.net/docs/manual.html#duplicates-and-updating.
You can choose:
1. not to allow duplicates (alternate translations) as cards.
2. to allow duplicates (alternate translations) as cards.
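
The two choices above can be sketched in Python before importing. This is a minimal sketch, not the importer's actual logic. It assumes the file is tab-separated with the English sentence in the first column and the Chinese sentence in the second, and that a "duplicate" means a repeated English sentence. The names load_pairs and build_cards and the sample rows are hypothetical, made up for illustration.

```python
import csv
import io


def load_pairs(lines):
    """Yield (english, chinese) tuples from tab-separated lines.

    Assumption: column 1 is English, column 2 is Chinese.
    Any further columns (for example attribution) are ignored.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 2:
            yield row[0], row[1]


def build_cards(pairs, allow_duplicates):
    """Return a list of cards, optionally dropping alternate translations.

    With allow_duplicates=False, only the first translation seen for each
    English sentence is kept. With allow_duplicates=True, every pair is kept.
    """
    cards = []
    seen = set()
    for english, chinese in pairs:
        if not allow_duplicates and english in seen:
            continue  # skip an alternate translation of a sentence already kept
        seen.add(english)
        cards.append((english, chinese))
    return cards


# Hypothetical sample rows, not taken from the real file.
sample = (
    "Hi.\t嗨。\tattribution\n"
    "Hi.\t你好。\tattribution\n"
    "Run.\t你用跑的。\tattribution\n"
)
pairs = list(load_pairs(io.StringIO(sample)))

no_dupes = build_cards(pairs, allow_duplicates=False)
with_dupes = build_cards(pairs, allow_duplicates=True)
```

Filtering before import keeps the choice explicit and repeatable. The same choice can instead be made in Anki's import dialog, as described in the manual section linked above.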
Provider:
Alibaba Cloud Tianchi
Created:
2025-05-08
Dataset introduction

Background overview
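This dataset, named "Chinese-English Data File" (中英数据文件), is derived from the Tatoeba Project's sentences_detailed.csv file. It provides Chinese-English bilingual study material made up of sentences written by native speakers and sentences that have been proofread, although a small number of errors may remain. The data is released under a Creative Commons license and is suitable for study tools such as Anki. The download service is temporarily unavailable because of platform maintenance.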
The summary above was collected and generated by 遇见数据集.



