amuvarma/dualcodec-face-tokenised-joined
Source: Hugging Face. Updated: 2025-10-03. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/amuvarma/dualcodec-face-tokenised-joined
Description:
The dataset contains video-related records. Each record includes a video ID, a unique identifier, the video segment, tokens, semantic codes, a transcription, and seven category labels (cb_0 to cb_6). A single video segment may correspond to multiple tokens, semantic codes, and category labels. The dataset has a single training split with over 4.28 million examples and a total size of approximately 27.7 GB.
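Given the size quoted above, a quick look at the data is usually easier in streaming mode than with a full download. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown in the download link. The helper names `approx_bytes_per_example` and `peek` are hypothetical, and the exact column names are not given in the description, so the sketch does not rely on them.

```python
from itertools import islice

REPO_ID = "amuvarma/dualcodec-face-tokenised-joined"
NUM_EXAMPLES = 4_280_000  # "over 4.28 million" per the description (a lower bound)
SIZE_BYTES = 27.7e9       # "approximately 27.7GB", assuming decimal gigabytes


def approx_bytes_per_example(size_bytes=SIZE_BYTES, num_examples=NUM_EXAMPLES):
    """Rough average record size derived from the two figures quoted in the listing."""
    return size_bytes / num_examples


def peek(n=2):
    """Stream the first n training examples without downloading the whole split."""
    # Imported lazily so the size estimate works even without the `datasets` package.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split="train", streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    # Only the offline estimate runs here; peek() needs network access.
    print(f"~{approx_bytes_per_example() / 1e3:.1f} kB per example")
```

Calling `peek()` and printing `sorted(row.keys())` for each returned row is one way to check the actual column names before writing any processing code.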
Provider:
amuvarma



