中文裁判文书法院观点数据集
收藏魔搭社区2026-05-20 更新2025-10-11 收录
下载链接:
https://modelscope.cn/datasets/ZhitianHou/CCVG
下载链接
链接失效反馈官方服务:
资源简介:
# CCVG: Chinese Court View Generation Dataset
[中文](README_zh.md) | 🤗 [huggingface](https://huggingface.co/datasets/TIM0927/CCVG) | 🤖 [modelscope](https://www.modelscope.cn/datasets/ZhitianHou/CCVG) | 📄 [Arxiv](https://arxiv.org/abs/2510.09297) | 💻 [GitHub](https://github.com/ZhitianHou/ShiZhi)
**Now the anonymous test data has been released.**
For privacy protection reasons, the original training data has been removed pending de-identification and anonymization. Only some examples of training data are avaiable.
**CCVG** is a curated Chinese dataset designed for **Criminal Court View Generation (CVG)** and **charge prediction** tasks.
It contains criminal case documents with **fact descriptions** and **court views**, supporting research in legal AI and natural language generation.
---
## Dataset Overview
- **Language:** Chinese
- **Domain:** Legal / Criminal law
- **Task:** Court View Generation & Charge Prediction
- **Size:** 110K+ criminal cases
- **Time Span:** 1985–2021
Each case contains:
1. **system**: The system prompt we used in Training.
2. **query:** The "Fact" section summarying the case fact in legal case documents.
3. **response:** The "Court View" section explaining the judgment in legal case documents.
4. **charge:** The criminal charge corresponding to the case.
## Citation
If you find this project helpful, please consider citing our paper:
```bibtex
@misc{hou2025shizhichineselightweightlarge,
title={ShiZhi: A Chinese Lightweight Large Language Model for Court View Generation},
author={Zhitian Hou and Kun Zeng},
year={2025},
eprint={2510.09297},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.09297},
}
```
# CCVG:中国法院裁判观点生成数据集
[中文文档](README_zh.md) | 🤗 [Hugging Face](https://huggingface.co/datasets/TIM0927/CCVG) | 🤖 [ModelScope](https://www.modelscope.cn/datasets/ZhitianHou/CCVG) | 📄 [Arxiv论文](https://arxiv.org/abs/2510.09297) | 💻 [GitHub仓库](https://github.com/ZhitianHou/ShiZhi)
**目前匿名测试数据集已正式发布。**
出于隐私保护的相关要求,原始训练数据集已暂予移除,待完成去标识化与匿名化处理;当前仅开放部分训练数据示例供参考。
**CCVG**是一款精心整理的中文数据集,专为**刑事法院裁判观点生成(Criminal Court View Generation, CVG)**与**罪名预测**两类任务打造。该数据集收录了包含**事实描述**与**法院裁判观点**的刑事案卷文档,可为法律人工智能与自然语言生成领域的相关研究提供支撑。
---
## 数据集概览
- **语言:** 中文
- **领域:** 法律 / 刑法
- **任务:** 法院裁判观点生成与罪名预测
- **规模:** 11万+ 刑事案例
- **时间跨度:** 1985–2021
每个案例包含以下内容:
1. **system**:训练过程中所使用的系统提示词。
2. **query**:刑事案卷文档中用于概括案件事实的“事实”章节内容。
3. **response**:刑事案卷文档中用于阐释裁判理由与结果的“法院裁判观点”章节内容。
4. **charge**:对应案件的刑事罪名。
---
## 引用说明
若本项目对您的研究有所帮助,请引用我们的论文:
bibtex
@misc{hou2025shizhichineselightweightlarge,
title={ShiZhi: A Chinese Lightweight Large Language Model for Court View Generation},
author={Zhitian Hou and Kun Zeng},
year={2025},
eprint={2510.09297},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.09297},
}
提供机构:
maas
创建时间:
2025-10-10
搜集汇总
数据集介绍

背景与挑战
背景概述
CCVG是一个专门用于刑事法院观点生成和罪名预测任务的中文数据集,包含超过11万条刑事案例,每个案例都提供案件事实描述和法院观点。由于隐私保护,原始训练数据已移除,仅保留部分示例。
以上内容由遇见数据集搜集并总结生成



