
SEEDLingS 13 Month

Source: DataCite Commons. Updated: 2021-12-26. Indexed: 2025-04-16.
Download link:
http://databrary.org/volume/669
Description:
These files are part of our longitudinal study, Study of Environmental Effects on Developing Linguistic Skills (SEEDLingS). This volume only includes recordings taken at 13 months of age. [For boring logistical reasons, each month has its own Databrary volume, though we are likely to consolidate to one DOI in the future.] In the spirit of open science, we are sharing these volumes with the community as we finish cleaning them, while several papers using these data are still in the pipeline. Our request of all users is that if you create annotations or analyses (e.g. coding of some dimension of the content by hand, automated algorithmic output, etc.) using these data, you share those resources back with us and the broader community so we can continue to enrich this resource.

The broader project is described below. SEEDLingS is a project exploring how infants' early linguistic and environmental input plays a role in their learning. We focus on understanding how babies learn words between 6 and 18 months of age from the visual, social, and linguistic world around them. By looking at the complex environment that babies are exposed to, from their perspective, we can attempt to decode how the developing mind interprets and organizes the objects and words it faces.

SEEDLingS is unique in that it combines well-controlled studies in the lab that assess what words infants know with in-the-home audio and video recordings of what words infants hear and what they see when they hear these words. Video and audio recordings were generated in the home every month, from 6 to 17 months of age, for a set of 44 infants (with an additional 2 infants who enrolled but then did not end up participating). The goal of this study is to assess infants' language growth over this time period, particularly in the word learning domain. Every two months, infants came into the lab for an eye-tracking study to test their word comprehension (and, for older infants, their word production).

This volume will eventually include the audio and video recordings from the 13-month home visits; right now it contains only the videos, as we are doing a final cleaning pass on the audio (see details below). The day-long audio recordings were generated using child-perspective LENA recorders (LENA Research Foundation, Boulder, Colorado, United States) worn by the infant. Each audio file comes from a single LENA recording. The hour-long video recordings show a composite view of infants' typical lives with 1-4 camera feeds. In the standard setup, infants are equipped with 2 headcams, and a centralized camcorder captures the entire room. The precise arrangement and number of cameras varies per video, depending on whether the child would wear the hat with the cameras and whether the cameras' files became corrupt during the recordings.

Shared recordings have been scrubbed for certain personal information (e.g. full names, addresses, etc.); this leads to some silent periods on the audio track and some blacked-out periods on the video track. Only sections of the files that human listeners have verified to contain no extremely personal content (or from which such information has been scrubbed) are shared here. If you notice anything that you believe we may have missed in terms of personal information, we ask that you please let us know as soon as possible so we can rectify the issue.

Infants in this sample are from the upstate New York area. The sample is generally middle class, with a range of income and an above-average maternal education level. The sample is predominantly white. All infants heard majority English at home (>75%) and had no known vision or hearing issues at birth. These data were collected at the University of Rochester and continue to be analyzed at Duke University.

PRIVACY AND SCRUBBING INFORMATION: All video files have been completely reviewed and have had sections containing personal information removed. These scrubbed portions of the video will appear blacked-out and silenced. Selected portions of the audio files have been reviewed. Sections containing personal information and the unreviewed sections of the audio files have been silenced. (For months 8-13, ~4 hours per audio file were reviewed for annotation and personal information scrubbing; for months 14-17, ~3 hours.) Spreadsheets for each month listing timestamps for the scrubbed regions of both audio and video files are forthcoming!

This volume was most recently updated in March 2021. Please contact Elika Bergelson directly at elika.bergelson@gmail.com to discuss further aspects of the sample design, annotation, and analysis.
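
Because scrubbed regions are blacked out or silenced, analyses generally need to skip them. The timestamp spreadsheets are described as forthcoming, so their layout is not known; the sketch below simply assumes the scrubbed regions for one file are available as (start, end) pairs in seconds. The function names and the example values are hypothetical and are not part of the SEEDLingS or Databrary tooling.

```python
from typing import List, Tuple

# Hypothetical representation: (start_seconds, end_seconds) of one region.
Interval = Tuple[float, float]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def usable_segments(duration: float, scrubbed: List[Interval]) -> List[Interval]:
    """Return the parts of [0, duration] not covered by scrubbed regions."""
    usable: List[Interval] = []
    cursor = 0.0
    for start, end in merge_intervals(scrubbed):
        # Clamp each region to the bounds of the recording.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            usable.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        usable.append((cursor, duration))
    return usable


# Made-up example: a one-hour video with three scrubbed regions.
example = usable_segments(
    3600.0, [(600.0, 660.0), (650.0, 700.0), (3500.0, 3700.0)]
)
print(example)
```

Merging first keeps the second pass simple when scrubbed regions overlap, which may or may not occur in the real spreadsheets.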

Provider:
Databrary
Created:
2018-05-06