UCC
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/conversationai/unhealthy-conversations
Description:
This dataset supports classifying published videos as original or non-original content. The non-original content classification (UCC) task is evaluated primarily by classification F1-score and recall, with particular attention to the precision threshold. The dataset comprises 400,000 videos for training and validation and 18,000 videos for testing.
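The metrics named in the description can be illustrated with a minimal sketch. It assumes binary labels (1 = non-original, 0 = original) and per-video classifier scores, and it reads "precision threshold" as "best recall among score cutoffs whose precision meets a minimum value". That reading, and the function names `precision_recall_f1` and `recall_at_precision`, are assumptions made for illustration and are not taken from the dataset's documentation.

```python
def precision_recall_f1(labels, preds):
    """Return (precision, recall, F1) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def recall_at_precision(labels, scores, min_precision):
    """Highest recall over score cutoffs whose precision is >= min_precision.

    Assumed interpretation of the "precision threshold" in the description.
    Returns 0.0 if no cutoff reaches the requested precision.
    """
    best = 0.0
    for cutoff in sorted(set(scores)):
        # A video is predicted non-original when its score reaches the cutoff.
        preds = [1 if s >= cutoff else 0 for s in scores]
        precision, recall, _ = precision_recall_f1(labels, preds)
        if precision >= min_precision:
            best = max(best, recall)
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6]
    preds = [1 if s >= 0.5 else 0 for s in scores]
    print(precision_recall_f1(labels, preds))
    print(recall_at_precision(labels, scores, min_precision=0.9))
```

Scanning every distinct score as a cutoff is quadratic in the number of videos, which is fine for a toy example; at the scale of the 18,000-video test set, a sorted single pass would be the more practical choice.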
Provider:
TikTok



