MOBIO

Mendeley Data, updated 2024-05-17, indexed 2024-06-28

Download link:

https://zenodo.org/records/4269551

Resource description:

MOBIO is a dataset for mobile face and speaker recognition. It consists of bi-modal (audio and video) data taken from 150 people. The dataset has a female-male ratio of nearly 1:2 (99 males and 51 females) and was collected from August 2008 until July 2010 at six different sites in five different countries. This led to a diverse bi-modal dataset with both native and non-native English speakers.

In total, 12 sessions were captured for each client: 6 sessions for Phase I and 6 sessions for Phase II. The Phase I data consists of 21 questions, with the following question types: Short Response Questions, Short Response Free Speech, Set Speech, and Free Speech. The Phase II data consists of 11 questions, with the following question types: Short Response Questions, Set Speech, and Free Speech. A more detailed description of the questions asked of the clients is provided below.

The database was recorded using two mobile devices: a mobile phone and a laptop computer. The mobile phone was a NOKIA N93i, while the laptop was a standard 2008 MacBook. The laptop was only used to capture part of the first session; this first session consists of data captured on both the laptop and the mobile phone.

Detailed Description of Questions

Please note that the answers to the Short Response Free Speech and Free Speech questions DO NOT necessarily relate to the question, as the sole purpose is to have the subject speaking free speech. The answers to ALL of these questions are therefore assumed to be false.

1. Short Response Questions

The short response questions consisted of five pre-defined questions:

- What is your name? (the user supplies their fake name)
- What is your address? (the user supplies their fake address)
- What is your birthdate? (the user supplies their fake birthdate)
- What is your license number? (the user supplies their fake ID card number, which is the same for each person)
- What is your credit card number? (the user supplies their fake card number)

2. Short Response Free Speech

There were five random questions taken from a list of 30-40 questions. The user had to answer each question by speaking for approximately 5 seconds of recording (sometimes more and sometimes less).

3. Set Speech

The users were asked to read pre-defined text aloud. This text was designed to take longer than 10 seconds to utter, and the participants were allowed to correct themselves while reading these paragraphs. The text that was read was:

I have signed the MOBIO consent form and I understand that my biometric data is being captured for a database that might be made publicly available for research purposes. I understand that I am solely responsible for the content of my statements and my behaviour. I will ensure that when answering a question I do not provide any personal information in response to any question.

4. Free Speech

The free speech session consisted of 10 random questions from a list of approximately 30 questions. The answers to each of these questions took approximately 10 seconds (sometimes less and sometimes more).

Acknowledgements

Elie Khoury, Laurent El-Shafey, Christopher McCool, Manuel Günther, Sébastien Marcel, “Bi-modal biometric authentication on mobile phones in challenging conditions”, Image and Vision Computing, Volume 32, Issue 12, 2014. DOI: 10.1016/j.imavis.2013.10.001
https://publications.idiap.ch/index.php/publications/show/2689

Created:

2023-06-28

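Collected summary

Dataset introduction

Background and challenges

Background overview

MOBIO is a bi-modal (audio and video) dataset for face and speaker recognition on mobile devices. It contains data from 150 participants, with a female-male ratio of nearly 1:2 (99 males and 51 females), and spans six sites in five countries, covering both native and non-native English speakers. The data was captured with a Nokia N93i mobile phone and a laptop computer in two phases (12 sessions in total). Question types include short responses, free speech, and set-text reading, and the dataset is intended to support biometric recognition research under challenging conditions.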

The summary above was collected and generated by 遇见数据集.
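As a quick consistency check on the figures quoted in the description, the following minimal Python sketch tallies clients, sessions, and responses per client. The `Phase` class and the function names are hypothetical, introduced only for illustration; they are not part of any MOBIO tooling. The sketch assumes that the stated question counts (21 and 11) apply to each session and that every question yields exactly one recorded response. The real file count may differ, for example because part of the first session was also captured on the laptop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """Hypothetical container for the per-phase figures in the description."""
    name: str
    sessions: int                # sessions per client in this phase
    questions_per_session: int   # assumed to be the stated question count


# Figures taken from the dataset description.
CLIENTS = {"male": 99, "female": 51}
PHASES = [Phase("Phase I", 6, 21), Phase("Phase II", 6, 11)]


def sessions_per_client(phases):
    """Total number of sessions captured for one client."""
    return sum(p.sessions for p in phases)


def responses_per_client(phases):
    """Responses for one client, assuming one response per question per session."""
    return sum(p.sessions * p.questions_per_session for p in phases)


total_clients = sum(CLIENTS.values())
female_male_ratio = CLIENTS["female"] / CLIENTS["male"]

print(total_clients)                  # 150
print(sessions_per_client(PHASES))    # 12
print(responses_per_client(PHASES))   # 192
print(round(female_male_ratio, 2))    # 0.52, i.e. nearly 1:2
```

The Phase I count agrees with the question breakdown in the description: 5 short response questions, 5 short response free speech questions, 1 set speech text, and 10 free speech questions give 21 per session.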
