Speech Accent Archive
OpenDataLab · Updated 2026-05-24 · Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/Speech_Accent_Archive
Resource description:
This dataset contains 2,140 speech samples, each sourced from a distinct speaker reading an identical passage of text. The speakers hail from 177 countries and speak 214 distinct native languages, with all speakers using English for their recordings.
Parallel English speech samples from 177 countries.
Everyone who speaks a language speaks it with an accent, and a particular accent essentially reflects a person's linguistic background. When listeners hear someone speak with an accent different from their own, they notice the difference and may even make biased social judgments about the speaker.
The Speech Accent Archive was developed to uniformly showcase a large collection of English accents from diverse linguistic backgrounds. Both native and non-native English speakers read the same English passage, and their speech was carefully recorded. The archive is constructed as both an educational and research tool, intended for use by linguists and other individuals who wish to listen to and compare the accents of different English speakers.
This dataset enables users to compare speakers’ demographic and linguistic backgrounds to identify which variables serve as key predictors of each accent. The Speech Accent Archive demonstrates that accents are systematic, rather than merely erroneous speech.
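As a minimal sketch of the kind of comparison described above, the snippet below groups speaker metadata by one background variable and summarizes another. The field names (`native_language`, `age_of_english_onset`, `country`) and the sample records are hypothetical placeholders chosen for illustration; they are not taken from the archive's actual files, whose layout is not described here.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical speaker records. Field names and values are illustrative
# placeholders, NOT rows from the actual Speech Accent Archive.
SPEAKERS = [
    {"native_language": "spanish", "age_of_english_onset": 12, "country": "mexico"},
    {"native_language": "spanish", "age_of_english_onset": 8, "country": "spain"},
    {"native_language": "mandarin", "age_of_english_onset": 15, "country": "china"},
    {"native_language": "english", "age_of_english_onset": 0, "country": "usa"},
]


def group_counts(records, group_key):
    """Count how many speakers fall into each value of group_key."""
    return Counter(rec[group_key] for rec in records)


def group_mean(records, group_key, value_key):
    """Return the mean of value_key for each value of group_key."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[group_key]].append(rec[value_key])
    return {group: mean(values) for group, values in buckets.items()}


counts = group_counts(SPEAKERS, "native_language")
onset = group_mean(SPEAKERS, "native_language", "age_of_english_onset")
```

With real metadata, the same grouping could be repeated for each background variable to see which ones separate the accents most clearly; that would be a starting point for exploration, not a substitute for a proper statistical model.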
All linguistic analyses of the accents are available for public review, and we welcome comments on the accuracy of our transcriptions and analyses.
Provider:
OpenDataLab
Created:
2023-04-20
Collected summary
Dataset introduction

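Background and challenges
Background overview
The Speech Accent Archive dataset contains 2,140 English speech samples from 177 countries. Each sample is a different speaker, from a range of native-language backgrounds, reading the same passage, so the recordings can be used for accent comparison and research. The dataset was released by George Mason University in 2013 and is intended as a teaching and research tool.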
The content above was collected and summarized by 遇见数据集.



