Southeast University Multimodal Lie Detection Dataset
Source: DataCite Commons | Updated: 2025-04-27 | Indexed: 2025-04-16
Download link:
https://www.scidb.cn/detail?dataSetId=c9ef45a0dd89487f904b80901c7a42a4
Description:
To address the lack of a lie detection dataset grounded in a Chinese-language context, we developed SEUMLD, the first publicly available multimodal lie detection dataset based on Chinese conversations. SEUMLD contains data in three modalities: video, audio, and electrocardiogram (ECG) signals. To effectively stimulate participants' motivation to lie, we designed a paradigm of simulated crime and simulated interrogation experiments. Multimodal signals were recorded during the simulated interrogations from 76 participants who had lived in a Chinese-language environment for a long time, yielding 3224 conversations in total. The dataset provides coarse-grained annotation indicating whether a participant lies over an entire conversation, as well as fine-grained annotation that precisely segments each conversation.
Provider:
Science Data Bank
Created:
2025-03-24
Collected summary
Dataset introduction

Background and challenges
Background overview
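The Southeast University Multimodal Lie Detection Dataset is the first publicly available multimodal lie detection dataset based on Chinese conversations. It contains three modalities (video, audio, and electrocardiogram), collected from 3224 conversations with 76 participants, and uses a simulated-crime experimental paradigm to elicit lying. The dataset provides two levels of annotation, coarse-grained (whole conversation) and fine-grained (precise segmentation), filling the gap in lie detection datasets for the Chinese-language context.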
The content above was collected and summarized by 遇见数据集.



