
Translanguage English Database (TED) Transcripts

DataCite Commons | Updated 2021-07-01 | Indexed 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2002T03
Description:
<h3>Introduction</h3><br> <p>Translanguage English Database (TED) Transcripts consists of transcripts of presentations by 39 native and non-native English speakers at the <a href="http://dblp.uni-trier.de/db/conf/interspeech/eurospeech1993.html">Third European Conference on Speech Communication and Technology, EUROSPEECH 1993</a> in Berlin, Germany. This is a joint publication with the <a href="http://www.elra.info/" rel="nofollow">European Language Resources Association (ELRA)</a>, sponsored in part by <a href="http://www.nsf.gov/" rel="nofollow">National Science Foundation</a> Grant No. IIS-9982201. The data set is released by ELRA as the Translanguage English Database (TED) Transcripts database (<a href="http://catalog.elra.info/product_info.php?products_id=637">ELRA-S0120</a>).</p><br> <h3>Data</h3><br> <p>The transcripts in this release were developed by the Linguistic Data Consortium and cover a subset of the speech recordings in Translanguage English Database (TED) Speech <a href="http://catalog.ldc.upenn.edu/LDC2002S04" rel="nofollow">LDC2002S04</a> and ELRA publication <a href="http://catalog.elra.info/product_info.php?products_id=743">ELRA-S0031</a>.</p><br> <p>The transcripts are in Universal Transcription Format (UTF). All UTF files were validated against a utf.dtd. Tables containing speaker demographic information and cross-references of file names from the TED audio corpus are included in this release. A transcript sample is available <a href="desc/addenda/LDC2002T03.01.example" rel="nofollow">here</a>.</p><br> <h2>Updates</h2><br> <p>There are no updates at this time.</p><br> <p>Portions © 1993-2002 University of Munich, LIMSI-CNRS, ELRA, © 2002 Trustees of the University of Pennsylvania</p>

Provider:
Linguistic Data Consortium
Created:
2020-11-30