---
annotations_creators:
- other
language_creators:
- other
language:
- en
license:
- odc-by
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- extended|vctk
task_categories:
- audio-classification
task_ids: []
pretty_name: asvspoof2019
tags:
- voice-anti-spoofing
---
# Dataset Card for asvspoof2019
## Table of Contents
- [Dataset Description](#dataset-description)
- [Dataset Summary](#dataset-summary)
- [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
- [Languages](#languages)
- [Dataset Structure](#dataset-structure)
- [Data Instances](#data-instances)
- [Data Fields](#data-fields)
- [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
- [Curation Rationale](#curation-rationale)
- [Source Data](#source-data)
- [Annotations](#annotations)
- [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
- [Social Impact of Dataset](#social-impact-of-dataset)
- [Discussion of Biases](#discussion-of-biases)
- [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
- [Dataset Curators](#dataset-curators)
- [Licensing Information](#licensing-information)
- [Citation Information](#citation-information)
## Dataset Description
- **Homepage:** https://datashare.ed.ac.uk/handle/10283/3336
- **Repository:** [Needs More Information]
- **Paper:** https://arxiv.org/abs/1911.01601
- **Leaderboard:** [Needs More Information]
- **Point of Contact:** [Needs More Information]
### Dataset Summary
This is the database used for the third Automatic Speaker Verification Spoofing
and Countermeasures Challenge, ASVspoof 2019 for short (http://www.asvspoof.org),
organized by Junichi Yamagishi, Massimiliano Todisco, Md Sahidullah, Héctor
Delgado, Xin Wang, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Ville Vestman,
and Andreas Nautsch in 2019.
### Supported Tasks and Leaderboards
[Needs More Information]
### Languages
English
## Dataset Structure
### Data Instances
```
{'speaker_id': 'LA_0091',
'audio_file_name': 'LA_T_8529430',
'audio': {'path': 'D:/Users/80304531/.cache/huggingface/datasets/downloads/extracted/8cabb6d5c283b0ed94b2219a8d459fea8e972ce098ef14d8e5a97b181f850502/LA/ASVspoof2019_LA_train/flac/LA_T_8529430.flac',
'array': array([-0.00201416, -0.00234985, -0.0022583 , ..., 0.01309204,
0.01339722, 0.01461792], dtype=float32),
'sampling_rate': 16000},
'system_id': 'A01',
'key': 1}
```
### Data Fields
Logical access (LA):
- `speaker_id`: `LA_****`, a 4-digit speaker ID
- `audio_file_name`: name of the audio file
- `audio`: A dictionary containing the path to the downloaded audio file, the decoded audio array, and the sampling rate. Note that when accessing the audio column: `dataset[0]["audio"]` the audio file is automatically decoded and resampled to `dataset.features["audio"].sampling_rate`. Decoding and resampling of a large number of audio files might take a significant amount of time. Thus it is important to first query the sample index before the `"audio"` column, *i.e.* `dataset[0]["audio"]` should **always** be preferred over `dataset["audio"][0]`.
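- `system_id`: ID of the speech spoofing system (A01 - A19); for bonafide speech, `system_id` is left blank ('-')
- `key`: 'bonafide' for genuine speech, or 'spoof' for spoofed speech; the data instance above shows this field as an integer class index (`1`) for a sample produced by system A01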
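The access-order advice for the `audio` field can be illustrated without downloading anything. The sketch below is a toy stand-in, not the `datasets` library itself: the class `LazyAudioTable` and its decode counter are hypothetical, and the file names are made up. It only shows why indexing the row first touches one file while indexing the column first touches all of them.

```python
class LazyAudioTable:
    """Toy stand-in (hypothetical) for a dataset that decodes audio on access."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.decode_calls = 0  # number of files decoded so far

    def _decode(self, path):
        # A real decoder would read the FLAC file; here we only count the call.
        self.decode_calls += 1
        return {"path": path, "array": [0.0], "sampling_rate": 16000}

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row-first access: decode only the requested row.
            return {"audio": self._decode(self.paths[key])}
        if key == "audio":
            # Column-first access: decode every row before indexing into it.
            return [self._decode(p) for p in self.paths]
        raise KeyError(key)


# Made-up file names, used only to give the table some rows.
paths = [f"LA_T_{i:07d}.flac" for i in range(1000)]

row_first = LazyAudioTable(paths)
_ = row_first[0]["audio"]
row_first_calls = row_first.decode_calls

column_first = LazyAudioTable(paths)
_ = column_first["audio"][0]
column_first_calls = column_first.decode_calls

print(row_first_calls, column_first_calls)  # prints: 1 1000
```

Both expressions return the same first sample; the difference is only in how much decoding work is done to get it.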
Physical access (PA):
- `speaker_id`: `PA_****`, a 4-digit speaker ID
- `audio_file_name`: name of the audio file
- `audio`: A dictionary containing the path to the downloaded audio file, the decoded audio array, and the sampling rate. Note that when accessing the audio column: `dataset[0]["audio"]` the audio file is automatically decoded and resampled to `dataset.features["audio"].sampling_rate`. Decoding and resampling of a large number of audio files might take a significant amount of time. Thus it is important to first query the sample index before the `"audio"` column, *i.e.* `dataset[0]["audio"]` should **always** be preferred over `dataset["audio"][0]`.
- `environment_id`: a triplet (S,R,D_s); each element takes one letter in the set {a,b,c} as a categorical value, defined as
| | a | b | c |
| -------------------------------- | ------ | ------- | -------- |
| S: Room size (square meters) | 2-5 | 5-10 | 10-20 |
| R: T60 (ms) | 50-200 | 200-600 | 600-1000 |
| D_s: Talker-to-ASV distance (cm) | 10-50 | 50-100 | 100-150 |
- `attack_id`: a duple (D_a,Q); each element takes one letter in the set {A,B,C} as a categorical value, defined as
| | A | B | C |
| ------------------------------------- | ------- | ------ | ----- |
| D_a: Attacker-to-talker distance (cm) | 10-50 | 50-100 | > 100 |
| Q: Replay device quality | perfect | high | low |
For bonafide speech, `attack_id` is left blank ('-').
- `key`: 'bonafide' for genuine speech, or 'spoof' for spoofed speech
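The two lookup tables above can be turned into a small decoder. This is a minimal sketch under one assumption not stated in this card: that the IDs are stored as concatenated letters, for example `"abc"` for `environment_id` and `"BA"` for `attack_id`. The function names are hypothetical, and the key `D_a` follows the duple notation used in the field description.

```python
# Category tables copied from the field descriptions above.
ENVIRONMENT_LEVELS = {
    "S": {"a": "2-5", "b": "5-10", "c": "10-20"},           # room size, m^2
    "R": {"a": "50-200", "b": "200-600", "c": "600-1000"},  # T60, ms
    "D_s": {"a": "10-50", "b": "50-100", "c": "100-150"},   # talker-to-ASV, cm
}
ATTACK_LEVELS = {
    "D_a": {"A": "10-50", "B": "50-100", "C": "> 100"},     # attacker-to-talker, cm
    "Q": {"A": "perfect", "B": "high", "C": "low"},         # replay device quality
}


def _decode(code, levels):
    """Map each letter of `code` to its range, in the order of `levels`."""
    if len(code) != len(levels):
        raise ValueError(f"expected {len(levels)} letters, got {code!r}")
    return {name: table[letter] for (name, table), letter in zip(levels.items(), code)}


def decode_environment_id(code):
    # Assumes a three-letter string such as "abc" (hypothetical storage format).
    return _decode(code, ENVIRONMENT_LEVELS)


def decode_attack_id(code):
    # Bonafide speech carries '-' instead of an attack ID.
    if code == "-":
        return None
    return _decode(code, ATTACK_LEVELS)


print(decode_environment_id("abc"))
print(decode_attack_id("BA"))
print(decode_attack_id("-"))
```

If the IDs turn out to be stored differently (for instance as tuples), only the two public wrappers would need to change.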
### Data Splits
Number of utterances in the logical access (LA) partition:
| | Training set | Development set | Evaluation set |
| -------- | ------------ | --------------- | -------------- |
| Bonafide | 2580 | 2548 | 7355 |
| Spoof | 22800 | 22296 | 63882 |
| Total | 25380 | 24844 | 71237 |
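The counts in the table show a strong class imbalance, with roughly nine spoofed utterances for every ten samples in each split. The sketch below checks the totals and derives inverse-frequency class weights from the table. The weighting scheme is one common choice brought in for illustration, not something this card prescribes, and the helper name `summarize` is hypothetical.

```python
# Utterance counts copied from the table above.
SPLITS = {
    "train": {"bonafide": 2580, "spoof": 22800},
    "dev": {"bonafide": 2548, "spoof": 22296},
    "eval": {"bonafide": 7355, "spoof": 63882},
}
TABLE_TOTALS = {"train": 25380, "dev": 24844, "eval": 71237}


def summarize(counts):
    """Return (total, spoof fraction, inverse-frequency class weights)."""
    total = sum(counts.values())
    # Inverse-frequency weighting: total / (number of classes * class count).
    weights = {label: total / (len(counts) * n) for label, n in counts.items()}
    return total, counts["spoof"] / total, weights


summary = {name: summarize(counts) for name, counts in SPLITS.items()}

for name, (total, spoof_fraction, weights) in summary.items():
    assert total == TABLE_TOTALS[name]  # rows add up to the "Total" row
    print(f"{name}: total={total}, spoof={spoof_fraction:.3f}, "
          f"weights={ {k: round(v, 2) for k, v in weights.items()} }")
```

Weights like these could be passed to a weighted loss so that the rarer bonafide class is not swamped during training.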
## Dataset Creation
### Curation Rationale
[Needs More Information]
### Source Data
#### Initial Data Collection and Normalization
[Needs More Information]
#### Who are the source language producers?
[Needs More Information]
### Annotations
#### Annotation process
[Needs More Information]
#### Who are the annotators?
[Needs More Information]
### Personal and Sensitive Information
[Needs More Information]
## Considerations for Using the Data
### Social Impact of Dataset
[Needs More Information]
### Discussion of Biases
[Needs More Information]
### Other Known Limitations
[Needs More Information]
## Additional Information
### Dataset Curators
[Needs More Information]
### Licensing Information
This ASVspoof 2019 dataset is made available under the Open Data Commons Attribution License: http://opendatacommons.org/licenses/by/1.0/
### Citation Information
```
@InProceedings{Todisco2019,
Title = {{ASV}spoof 2019: {F}uture {H}orizons in {S}poofed and {F}ake {A}udio {D}etection},
Author = {Todisco, Massimiliano and
Wang, Xin and
Sahidullah, Md and
Delgado, H{\'e}ctor and
Nautsch, Andreas and
Yamagishi, Junichi and
Evans, Nicholas and
Kinnunen, Tomi and
Lee, Kong Aik},
booktitle = {Proc. of Interspeech 2019},
Year = {2019}
}
```