
SMYLE

Source: DataCite Commons. Updated: 2026-02-10. Indexed: 2026-05-04.
Download link:
https://www.ortolang.fr/market/item/smyle/v1
Description:
SMYLE is a multimodal French corpus (16 h) comprising audio-video and neurophysiological data from 60 participants engaged in face-to-face storytelling (8.2 h) and free conversation (7.8 h) tasks. The corpus covers all modalities, precisely synchronized. It is one of the first corpora of this size to offer the opportunity to investigate the cognitive characteristics of spontaneous conversation, including at the brain level. The storytelling task comprises two conditions: a storyteller talking with either a "normal" or a "distracted" listener. The goal of SMYLE is to address unexplored aspects of conversation, such as the cognitive processes involved in building common ground and achieving mutual understanding. SMYLE presents each pair of participants in both a controlled and an uncontrolled context, with predefined discursive roles for the narration and free roles for the conversation.

Sixty participants took part in the experiment (mean age = 22.77, SD = 3.29, min = 18, max = 36). Forty-three participants were female and 17 were male. Fifty-four participants were students of different levels and fields, five were employed, and one was unemployed. Participants were recruited from Aix-Marseille University and from the mailing list of Laboratoire Parole et Langage. All participants were native French speakers, right-handed, and reported no neurological or language disorders. The experiment was conducted in November at Laboratoire Parole et Langage in Aix-en-Provence, France. Participants received a compensation of 30€. The members of each dyad did not know each other before the experiment. All sixty participants were recorded in all modalities, except one participant with no EEG data and one storyteller with no video for Task 1 due to recording failure, both in the normal condition.

Several levels of annotation are performed on the corpus. To reduce manual annotation as much as possible, we used several software programs to perform automatic annotation. So far, all automatic annotations have been completed, and manual corrections are in progress. The manual annotations and corrections are performed by three expert annotators.

The annotations include:
- Gaze
- Smile (neutral face, low-intensity smile, high-intensity smile)
- Orthographic transcription (including laughter, broken words, elisions, repetitions, personal information)
- Head movements (nod, shake, tilt, other)
- Feedback (generic and specific feedback, including detailed annotations such as wince, eyebrow movements, shrug, etc.)
- Momel-INTSINT and openSMILE prosodic annotations
- Aligned tokens, part-of-speech tags, and phonetization
- Speech activity (speech, laugh, silence)

Audio and video files will be made available in September, after automatic anonymization of participants' personal data on the basis of the orthographic transcriptions, to members of the scientific community only. Please contact us for access to the corpus. You will be asked to sign a user agreement; we reserve the right to refuse or restrict access to certain types of data.
Provider:
ORTOLANG (Open Resources and TOols for LANGuage) - www.ortolang.fr
Created:
2026-02-10