An Isolated-Signing RGBD Dataset of 100 American Sign Language Signs Produced by Fluent ASL Signers

Name: An Isolated-Signing RGBD Dataset of 100 American Sign Language Signs Produced by Fluent ASL Signers
Creator: Databrary
Published: 2021-12-26 07:11:32
License: Not specified

Source: DataCite Commons. Updated 2021-12-26; indexed 2025-04-16.

Download link:

http://databrary.org/volume/1062

Description:

The ASL-100-RGBD dataset consists of color and depth videos collected from ASL signers at the Linguistic and Assistive Technologies Laboratory under the direction of Matt Huenerfauth, as part of a collaborative research project with researchers at the Rochester Institute of Technology and the City University of New York.

Access: After becoming an authorized user of Databrary, please contact Matt Huenerfauth if you have difficulty accessing this volume.

We have collected a new dataset consisting of color and depth videos of fluent American Sign Language signers performing sequences of 100 ASL signs. This directed dataset was originally collected as part of an ongoing collaborative project, to aid in the development of a sign-recognition system for identifying occurrences of these 100 signs in video. The set of words consists of vocabulary items that would commonly be learned in a first-year ASL course offered at a university, although the specific set of signs selected for inclusion in the dataset was motivated by project-related factors. Given increasing interest among sign-recognition and other computer-vision researchers in red-green-blue-depth (RGBD) video, we release this dataset for use by the research community. In addition to the video files, we share depth data files from a Kinect v2 sensor, as well as additional motion-tracking files produced through post-processing of this data.

Organization of the Dataset: The dataset is organized into sub-folders with codenames such as "F19" or "F21". These codenames refer to the specific human signers who were recorded in this dataset. Some signers were recorded up to three times; in those cases, the signer's sub-folder contains multiple sets of recording files.

Task: During the recording session, the participant was met by a member of our research team who was a native ASL signer. No other individuals were present during the data collection session.
The participant was presented with a sequence of videos of a native ASL signer performing each of the desired 100 signs. Participants were asked to perform a sequence of the 100 individual ASL signs, without lowering their hands between signs. Signers were encouraged to hold their hands in a comfortable neutral position in the signing space between signs. Time permitting, we collected two to three videos per signer, with each video containing up to one production of each of the 100 ASL signs. This process yielded a total collection of 42 video files, each containing about 100 signs, for approximately 4,150 tokens in total.

Demographics: A total of 22 deaf and hard-of-hearing (DHH) participants were recruited on the Rochester Institute of Technology campus. All 22 participants were fluent ASL signers. As screening, we asked our participants: "Did you use ASL at home growing up, or did you attend a school as a very young child where you used ASL?" All the participants responded affirmatively to this question. Participants included 15 men and 7 women, aged 20 to 51 (median = 23). Fifteen of our participants reported that they began using ASL when they were seven years old or younger. The remaining participants reported that they had been using ASL for at least 6 years and that they regularly used ASL at work or school.

Filetypes:

*.eaf: The videos were annotated using ELAN, using the gloss labels listed below, to indicate the start-time and stop-time of each token. At times, participants in our video recordings accidentally omitted a sign that had been requested, and at other times participants intentionally did not produce one of the requested signs. Participants in our video collection session were encouraged to produce a sign only if it was a sign that they would produce themselves; if they did not use a particular sign, e.g. due to some regional/dialectal variation, they were instructed to skip that sign.
At other times in our videos, the participant accidentally performed a different sign than the specific form requested (as shown in the stimulus video). For this reason, our team needed to watch the resulting videos carefully to ensure that the signs included in the video were the specific 100 signs that had been requested. In the case of sign productions that differed from the designed token, e.g. with the signer using a different handshape or other variation, the sign was not annotated.

*.avi, *_dep.bin: The ASL-100-RGBD dataset was captured using a Kinect 2.0 RGBD camera. The camera outputs multiple channels: RGB, depth, skeleton joints (25 joints for every video frame), and HD face (1,347 points). The video resolution is 1920 x 1080 pixels for the RGB channel and 512 x 424 pixels for the depth channel. Due to limitations in the acceptable filetypes for sharing on Databrary, we were not permitted to share the binary *_dep.bin files produced directly by the Kinect v2 camera system on the Databrary platform. If your research requires the original binary *_dep.bin files, please contact Matt Huenerfauth.

*_face.txt, *_HDface.txt, *_skl.txt: To make it easier for future researchers to make use of this dataset, we have also performed some post-processing of the Kinect data. To extract the skeleton coordinates from the RGB videos, we used the OpenPose system, which is capable of detecting body, hand, facial, and foot keypoints of multiple people in single images in real time. The output of OpenPose includes estimates of 70 keypoints for the face, including the eyes, eyebrows, nose, mouth, and face contour. The software also estimates 21 keypoints for each hand (Simon et al., 2017), including 4 keypoints for each finger and 1 for the wrist, as shown in Figure 2. Additionally, 25 keypoints are estimated for the body pose (and feet) (Cao et al., 2017; Wei et al., 2016).
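For the *.eaf annotation files described under Filetypes, the following is a minimal sketch of how the token start and stop times might be read in Python. It relies only on the standard ELAN XML layout (TIME_SLOT entries in milliseconds, referenced by ALIGNABLE_ANNOTATION elements). The function name read_eaf_tokens is hypothetical, and because the tier names used in this corpus are not stated in the description, the sketch reads annotations from every tier; treat it as an illustration under those assumptions, not as the project's own tooling.

```python
import xml.etree.ElementTree as ET


def read_eaf_tokens(eaf_source):
    """Return (gloss, start_ms, end_ms) tuples from an ELAN .eaf file.

    eaf_source may be a file path or a binary file object.
    Assumption: all time-aligned annotations, on any tier, are gloss tokens.
    """
    root = ET.parse(eaf_source).getroot()

    # TIME_ORDER maps each time-slot id to an offset in milliseconds.
    slots = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        if value is not None:  # unaligned slots carry no TIME_VALUE
            slots[slot.get("TIME_SLOT_ID")] = int(value)

    tokens = []
    for ann in root.iter("ALIGNABLE_ANNOTATION"):
        start = slots.get(ann.get("TIME_SLOT_REF1"))
        end = slots.get(ann.get("TIME_SLOT_REF2"))
        gloss = (ann.findtext("ANNOTATION_VALUE") or "").strip()
        if start is None or end is None or not gloss:
            continue  # skip annotations without a usable time span or label
        tokens.append((gloss, start, end))

    # Order tokens by start time so they follow the signing sequence.
    tokens.sort(key=lambda token: token[1])
    return tokens
```

Since signs that were skipped or that differed from the requested form were not annotated, a given file may yield fewer than 100 tokens.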
Reporting Bugs or Errors: Please contact Matt Huenerfauth to report any bugs or errors that you identify in the corpus. We appreciate your help in improving the quality of the corpus over time by identifying any errors.

List of Glosses:

ALWAYS CAN'T_CANNOT DODO1 DODO2 DON'T_CARE DON'T_KNOW DON'T_LIKE DON'T_MIND DON'T_WANT EIGHT_O_CLOCK1 EIGHT_O_CLOCK2 ELEVEN_O_CLOCK EVERY_AFTERNOON EVERY_DAY EVERY_FRIDAY EVERY_MONDAY EVERY_MORNING EVERY_NIGHT EVERY_SATURDAY EVERY_SUNDAY EVERY_THURSDAY EVERY_TUESDAY EVERY_WEDNESDAY FIVE_O_CLOCK1 FIVE_O_CLOCK2 FOR_FOR FOUR_O_CLOCK1 FOUR_O_CLOCK2 FRIDAY HOW1 HOW2 I_ME IF_SUPPOSE IX_HE_SHE_IT IX_THEY_THEM LAST_WEEK LAST_YEAR MIDNIGHT1 MONDAY MONTH MORNING NEVER NEXT_WEEK1 NEXT_WEEK2 NEXT_YEAR NIGHT NINE_O_CLOCK1 NINE_O_CLOCK2 NO NO_ONE NONE NOON1 NOT NOW ONE_O_CLOCK1 ONE_O_CLOCK2 PAST_PREVIOUS QMWG QUESTION RECENT SATURDAY SEVEN_O_CLOCK1 SEVEN_O_CLOCK2 SINCE_UP_TO_NOW SIX_O_CLOCK1 SIX_O_CLOCK2 SOMETIMES SOON1 SOON2 SUNDAY TEN_O_CLOCK THREE_O_CLOCK1 THREE_O_CLOCK2 THURSDAY THURSDAY2 TIME TODAY TOMORROW TONIGHT TUESDAY TWELVE_O_CLOCK TWO_O_CLOCK1 TWO_O_CLOCK2 WAVE_NO WEDNESDAY WEEK WHAT1 WHAT2 WHEN1 WHEN2 WHERE WHICH WHO1 WHO2 WHO3 WHY1 WHY2 WILL_FUTURE YESTERDAY YOU
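For sign-recognition experiments, the gloss labels above usually need to be turned into class indices. The sketch below shows one possible way to do that in Python, and also groups labels that differ only by a trailing digit (for example WHO1, WHO2, WHO3). That grouping rests on an assumption: the description does not say what the numeric suffixes mean, so treating them as variants of one sign is a guess. The function names build_label_index and group_gloss_variants are hypothetical.

```python
import re
from collections import defaultdict


def build_label_index(glosses):
    """Map each distinct gloss label to a stable integer class index."""
    return {gloss: i for i, gloss in enumerate(sorted(set(glosses)))}


def group_gloss_variants(glosses):
    """Group labels that share a base form once trailing digits are removed.

    Assumption (not stated in the dataset description): a numeric suffix
    marks an alternative production of the same sign.
    """
    groups = defaultdict(list)
    for gloss in glosses:
        base = re.sub(r"\d+$", "", gloss)  # WHO1 -> WHO, THURSDAY2 -> THURSDAY
        groups[base].append(gloss)
    return dict(groups)


# A small excerpt of the gloss list, used for illustration only.
excerpt = ["WHO1", "WHO2", "WHO3", "THURSDAY", "THURSDAY2", "NOT"]
label_index = build_label_index(excerpt)
variant_groups = group_gloss_variants(excerpt)
```

Keeping the full label (with suffix) as the class and the base form as a coarser grouping lets a model be evaluated at either level without committing to one interpretation of the suffixes.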

Provider:

Databrary

Created:

2020-02-16
