TAL

ModelScope community. Updated: 2026-05-08. Indexed: 2025-03-08.
Download link:
https://modelscope.cn/datasets/pengzhendong/TAL
Overview:

## Adult Chinese Teaching Audio

This speech recognition dataset comprises teacher lecture audio from TAL Education Group's online courses, covering two subjects: Chinese and Mathematics. It includes more than 80 speakers, with exactly one speaker per audio clip. The annotations include the subject and the speaker ID. The training, validation, and test sets are split 7:1:2 (3 files, 9.03 GB in total).

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** | **Audio Duration** |
|----------|----------|----------|----------|----------|----------|----------|----------|
| 100 hours | 16 kHz | 16-bit | Ordinary microphone | 80+ | April to May 2018 | Audio: mono .wav; annotation: .txt | 1 to 60 s |

## Adult Chinese Speech Emotion Classification

This speech emotion dataset consists of teacher lecture audio from TAL Education Group, containing 4541 audio clips with a total duration of 12.5 hours. All recordings were made in quiet indoor environments, with exactly one speaker per audio clip. The annotations cover two dimensions, Pleasure (P) and Arousal (A). Each audio clip has one P value and one A value, both in the range [-3, 3], where larger values indicate higher pleasure or arousal.
(File size: 1.16 GB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Data Format** | **Audio Duration** | **Accuracy** |
|----------|----------|----------|----------|----------|----------|----------|----------|
| 12.5 hours | 16 kHz | 16-bit | Ordinary microphone | 42 (18 male, 24 female) | Audio: mono .wav; annotation: .txt | 10 s | 96% |

## Adult Bilingual (Chinese-English) Teaching Audio

This dataset comprises English-class lecture audio from TAL Education Group and includes speech that mixes Chinese and English, with exactly one speaker per audio clip. (File size: 63.36 GB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** | **Audio Duration** | **Data Type** |
|----------|----------|----------|----------|----------|----------|----------|----------|----------|
| 587 hours | 16 kHz | 16-bit | Ordinary microphone | 200+ | 2019 | Audio: mono .wav; annotation: .txt | 1 to 60 s | English-class teacher lecture audio |

## Children's Chinese Reading

This dataset features popular children's audio clips from TAL Education Group's online courses, covering subjects such as Chinese and Mathematics. It includes more than 30 speakers.
(File size: 489.3 MB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** |
|----------|----------|----------|----------|----------|----------|----------|
| 5.4 hours | 16 kHz | 16-bit | Ordinary microphone | 30+ | 2020 to 2021 | Audio: .wav |

## Children's English Reading

This dataset consists of children's English audio clips from TAL Education Group's online courses, with more than 30 speakers in total. (File size: 424.7 MB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Data Format** |
|----------|----------|----------|----------|----------|----------|
| 4.5 hours | 16 kHz | 16-bit | Ordinary microphone | 30+ | Audio: .wav |

## Adult Chinese Reading

This dataset comprises teacher audio clips from TAL Education Group's online courses, covering subjects such as Chinese and Mathematics, with more than 100 speakers in total. (File size: 146.6 GB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** |
|----------|----------|----------|----------|----------|----------|----------|
| 1750 hours | 16 kHz | 16-bit | Ordinary microphone | 100+ | 2020 to 2021 | Audio: .wav |

## Adult English Reading

This dataset consists of teacher audio clips from TAL Education Group's online English courses, with more than 100 speakers in total.
(File size: 12.68 GB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** |
|----------|----------|----------|----------|----------|----------|----------|
| 180 hours | 16 kHz | 16-bit | Ordinary microphone | 100+ | 2020 to 2021 | Audio: .wav |

## Adult English Teaching Audio

This dataset comprises teacher lecture audio from TAL Education Group's online English courses, with more than 100 speakers in total. (File size: 12.06 GB)

| **Data Scale** | **Sampling Rate** | **Bit Depth** | **Recording Equipment** | **Speakers** | **Recording Period** | **Data Format** |
|----------|----------|----------|----------|----------|----------|----------|
| 160 hours | 16 kHz | 16-bit | Ordinary microphone | 100+ | 2020 to 2021 | Audio: .wav |
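
Every subset above lists 16 kHz, 16-bit .wav audio, so a quick header check after download can catch corrupted or resampled files. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav` and its parameters are illustrative and not part of the dataset's tooling. The mono default is an assumption for the reading subsets, whose tables list only ".wav" without stating the channel count.

```python
import wave

# Expected properties, taken from the tables in this card.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples


def check_wav(path, channels=1):
    """Return (ok, duration_seconds) for the WAV file at `path`.

    `ok` is True when the header reports 16 kHz, 16-bit audio with the
    given number of channels (mono by default, which is an assumption
    for subsets whose table does not state the channel count).
    """
    with wave.open(path, "rb") as wav:
        ok = (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == channels
        )
        # Duration follows from the frame count and the file's own rate.
        duration = wav.getnframes() / wav.getframerate()
    return ok, duration
```

Calling `check_wav("clip.wav")` on a downloaded clip (the file name is hypothetical) returns whether the header matches and how long the clip is, which can also be compared against the 1 to 60 s range listed for the teaching-audio subsets.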
Provider:
maas
Created:
2025-03-05