lubailu666/CSL-News
Source: Hugging Face. Updated 2026-03-12; indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/lubailu666/CSL-News
---
language:
- zh
license: cc-by-nc-4.0
tags:
- sign-language
task_categories:
- video-text-to-text
---
# Summary
This is the dataset proposed in our paper "[**Uni-Sign: Toward Unified Sign Language Understanding at Scale**](https://arxiv.org/abs/2501.15187)".
CSL-News is a large-scale Chinese Sign Language dataset designed for developing robust sign language understanding models.
**Code**: [https://github.com/ZechengLi19/Uni-Sign](https://github.com/ZechengLi19/Uni-Sign)
# Download
Please refer to the [**download script**](https://github.com/ZechengLi19/Uni-Sign/blob/main/download_scripts/download_CSL_News.py) to download CSL-News.
You can also download individual files with ```wget```, for instance:
```
wget https://huggingface.co/datasets/ZechengLi19/CSL-News/resolve/main/archive_001.zip
wget https://huggingface.co/datasets/ZechengLi19/CSL-News/resolve/main/archive_002.zip
wget https://huggingface.co/datasets/ZechengLi19/CSL-News/resolve/main/archive_003.zip
...
```
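For readers who prefer to script the download in Python, the `wget` commands above can be sketched as follows. This is a minimal sketch, not the official download script: the helper names (`archive_name`, `archive_url`, `download_archives`) are hypothetical, and it assumes the three-digit, zero-padded naming shown above holds for every archive. The total number of archives is not stated here, so the caller supplies the indices.

```python
import urllib.request
from pathlib import Path

# Base URL copied from the wget commands above.
BASE_URL = "https://huggingface.co/datasets/ZechengLi19/CSL-News/resolve/main"


def archive_name(index: int) -> str:
    """Return the archive file name for a 1-based index, e.g. archive_001.zip.

    Assumption: all archives follow the zero-padded pattern shown above.
    """
    if index < 1:
        raise ValueError("archive indices start at 1")
    return f"archive_{index:03d}.zip"


def archive_url(index: int) -> str:
    """Return the download URL for the archive with the given index."""
    return f"{BASE_URL}/{archive_name(index)}"


def download_archives(indices, dest_dir="."):
    """Download the requested archives, skipping files that already exist.

    `indices` is supplied by the caller; this sketch does not guess how
    many archives the dataset contains.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in indices:
        target = dest / archive_name(i)
        if not target.exists():
            urllib.request.urlretrieve(archive_url(i), str(target))
        paths.append(target)
    return paths
```

For example, `download_archives([1, 2, 3], "./downloads")` would fetch the three archives listed above. Files that already exist are skipped, so an interrupted run can be restarted, although a partially written file would need to be deleted by hand first.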
# Usage
You can unzip each archive_*.zip file with ```unzip```, for instance:
```
unzip -j archive_001.zip -d ./CSL_News/rgb_format
unzip -j archive_002.zip -d ./CSL_News/rgb_format
unzip -j archive_003.zip -d ./CSL_News/rgb_format
...
```
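The `-j` flag tells `unzip` to discard the directory paths stored in the archive, so every file lands directly in `./CSL_News/rgb_format`. Where `unzip` is not available, roughly the same behaviour can be sketched with Python's standard `zipfile` module. The function name `extract_flat` is hypothetical. Unlike `unzip`, this sketch raises an error on a file name collision rather than prompting.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath


def extract_flat(archive_path, dest_dir):
    """Extract every file in the archive directly into dest_dir.

    Mirrors `unzip -j`: directory components stored in the archive are
    dropped. A name collision raises FileExistsError instead of prompting.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            # Zip member names always use forward slashes.
            name = PurePosixPath(info.filename).name
            target = dest / name
            if target.exists():
                raise FileExistsError(f"{target} already exists")
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted
```

For example, `extract_flat("archive_001.zip", "./CSL_News/rgb_format")` corresponds to the first `unzip` command above.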
``CSL_News_Labels.json`` and ``CSL_News_Labels.csv`` contain the text annotations for CSL-News. They can be read as follows:
```python
# Read CSL_News_Labels.json
import json
with open('CSL_News_Labels.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
# Read CSL_News_Labels.csv
import pandas
data = pandas.read_csv("CSL_News_Labels.csv")
```
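The structure of the label files is not described here, so it can help to inspect them before writing a data loader. The following is a minimal, dependency-free sketch using only the standard library. The function names `peek_json` and `peek_csv` are hypothetical. It assumes that the JSON top level is a list or a dict, and that the CSV is UTF-8 encoded with a header row.

```python
import csv
import json


def peek_json(path):
    """Load the JSON labels and report the top-level type and entry count.

    Assumption: the top level is a container (list or dict), so len() works.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return type(data).__name__, len(data)


def peek_csv(path, n=3):
    """Return the header row and the first n data rows of the CSV labels.

    Assumption: the file is UTF-8 encoded and its first row is a header.
    """
    with open(path, "r", encoding="utf-8", newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows
```

Calling `peek_json("CSL_News_Labels.json")` and `peek_csv("CSL_News_Labels.csv")` shows the container type, the number of entries, and the column names, which is usually enough to decide how to index the annotations.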
# Other format
We also provide the CSL-News dataset in a pose format, available [**here**](https://huggingface.co/datasets/ZechengLi19/CSL-News_pose).
# License
CSL-News is released under the CC-BY-NC-4.0 license. The video samples in this dataset are collected from publicly available web videos. Users must ensure that their use of these video samples is strictly non-commercial.
# Why Non-Commercial?
The video samples in CSL-News are sourced from web videos, and their copyright belongs to the original content creators. While this dataset is provided for research purposes under the CC-BY-NC-4.0 license, commercial use of these videos may infringe upon the rights of the original creators. To respect their rights and ensure ethical use, we strictly enforce a non-commercial usage policy for CSL-News.
# Citation
```
@article{li2025uni-sign,
title={Uni-Sign: Toward Unified Sign Language Understanding at Scale},
author={Li, Zecheng and Zhou, Wengang and Zhao, Weichao and Wu, Kepeng and Hu, Hezhen and Li, Houqiang},
journal={arXiv preprint arXiv:2501.15187},
year={2025}
}
```
Provider: lubailu666



