medical-symptoms-english-audio

Name: medical-symptoms-english-audio
Creator: maas
Published: 2025-10-03 16:43:00
License: No description provided

ModelScope community. Updated 2025-10-03; indexed 2025-08-16.

Download link:

https://modelscope.cn/datasets/Kratos-AI/medical-symptoms-english-audio

Resource description:

# Medical Symptoms English Audio Dataset

*This dataset contains intentionally low-quality (“B-grade”) data. It has been curated to include noisy, imperfect, or otherwise suboptimal samples for the purpose of testing model robustness and performance under degraded input conditions.*

**Text spoken by all participants:**

"Doctor, I'm constantly tired, like a heavy fog I can't shake. Sharp headaches hit, worse at night, and sleep is tough. I get dizzy, and my stomach feels uneasy after meals. I'm really worried it’s serious. Please help me figure out what's wrong."

The dataset supports training and evaluation of models in:

- Automatic Speech Recognition (ASR)
- Emotional tone classification
- Voice synthesis and generation
- Emotion-aware conversational agents

---

## Intended Uses

### ✅ Direct Use

- Training and benchmarking ASR models with Indian-accented English
- Emotion detection and classification from voice
- Research in affective computing and empathetic AI

### ❌ Out-of-Scope Use

- Real-time or production-grade systems
- Commercial use without proper CC BY 4.0 attribution
- Clinical or diagnostic use cases

---

## Considerations and Limitations

- ❗ The dataset is small (<1,000 samples) and not fully representative of India's linguistic and emotional diversity
- 💡 Emotions are subjective; classification results may vary by listener or model
- 🔄 Future versions will aim to expand multilingual support and speaker diversity

---

## License

**CC BY 4.0**: You can use, modify, and share the dataset with appropriate credit.

---

## Contact

- For queries or collaborations related to datasets, contact:
  - anoushka@kgen.io
  - abhishek.vadapalli@kgen.io

---
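Because every participant reads the same text, ASR output on this dataset can be scored against a single reference transcript. The following is a minimal sketch of a word error rate (WER) computation under that assumption. The function names `normalize` and `word_error_rate` are illustrative and not part of the dataset, and the normalization choices (lowercasing, dropping punctuation, treating curly and straight apostrophes alike) are assumptions rather than an official evaluation protocol.

```python
import re

# Reference transcript as given in the dataset description.
REFERENCE = (
    "Doctor, I'm constantly tired, like a heavy fog I can't shake. "
    "Sharp headaches hit, worse at night, and sleep is tough. "
    "I get dizzy, and my stomach feels uneasy after meals. "
    "I'm really worried it's serious. "
    "Please help me figure out what's wrong."
)


def normalize(text):
    """Lowercase, unify apostrophes, and split into word tokens (assumed scheme)."""
    text = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z0-9']+", text)


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A hypothetical ASR transcript for one clip would be scored with `word_error_rate(REFERENCE, transcript)`. Given the intentionally degraded audio, comparing WER across clips may be more informative than reading any single absolute value.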

Provider:

maas

Created:

2025-08-01

Collected summary

Dataset introduction