ChatHaruhi-54K-Role-Playing-Dialogue

Source: ModelScope community (updated 2026-04-27; indexed 2025-07-12)
Download link: https://modelscope.cn/datasets/AI-ModelScope/ChatHaruhi-54K-Role-Playing-Dialogue
Resource overview:
# ChatHaruhi

# Reviving Anime Character in Reality via Large Language Model

[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)]()
[![Data License](https://img.shields.io/badge/Data%20License-CC%20By%20NC%204.0-red.svg)]()

GitHub repo: https://github.com/LC1332/Chat-Haruhi-Suzumiya

**Chat-Haruhi-Suzumiya** is a language model that imitates the tone, personality and storylines of characters like Haruhi Suzumiya.

<details>
  <summary> The project was developed by Cheng Li, Ziang Leng, Chenxi Yan, Xiaoyang Feng, HaoSheng Wang, Junyi Shen, Hao Wang, Weishi Mi, Aria Fei, Song Yan, Linkang Zhan, Yaokai Jia, Pingyu Wu, and Haozhen Sun, etc. </summary>

This is an open source project, and its members were recruited from open source communities such as DataWhale.

Lulu Li ([Cheng Li@SenseTime](https://github.com/LC1332)) initiated the whole project and designed and implemented most of the features.

Ziang Leng ([Ziang Leng@SenseTime](https://blairleng.github.io)) designed and implemented the training, data generation and backend architecture for ChatHaruhi 1.0.

Chenxi Yan ([Chenxi Yan@Chengdu University of Information Technology](https://github.com/todochenxi)) implemented and maintained the backend for ChatHaruhi 1.0.

Junyi Shen ([Junyi Shen@Zhejiang University](https://github.com/J1shen)) implemented the training code and participated in generating the training dataset.

Hao Wang ([Hao Wang](https://github.com/wanghao07456)) collected script data for a TV series and participated in data augmentation.

Weishi Mi ([Weishi MI@Tsinghua University](https://github.com/hhhwmws0117)) participated in data augmentation.

Aria Fei ([Aria Fei@BJUT](https://ariafyy.github.io/)) implemented the ASR feature for the script tool and participated in the Openness-Aware Personality paper project.

Xiaoyang Feng ([Xiaoyang Feng@Nanjing Agricultural University](https://github.com/fengyunzaidushi)) integrated the script recognition tool and participated in the Openness-Aware Personality paper project.

Yue Leng ([Song Yan](https://github.com/zealot52099)) collected data from The Big Bang Theory and implemented script format conversion.

scixing (HaoSheng Wang) ([HaoSheng Wang](https://github.com/ssccinng)) implemented voiceprint recognition in the script tool and tts-vits speech synthesis.

Linkang Zhan ([JunityZhan@Case Western Reserve University](https://github.com/JunityZhan)) collected Genshin Impact's system prompts and story data.

Yaokai Jia ([Yaokai Jia](https://github.com/KaiJiaBrother)) implemented the Vue frontend and worked on GPU-based BERT extraction in a psychology project.

Pingyu Wu ([Pingyu Wu@Juncai Shuyun](https://github.com/wpydcr)) helped deploy the first version of the training code.

Haozhen Sun (Haozhen Sun@Tianjin University) plotted the character figures for ChatHaruhi.

</details>

## Transfer into input-target format

To convert this data into an input-target format, see https://huggingface.co/datasets/silk-road/ChatHaruhi-Expand-118K

### Citation

Please cite the repo if you use the data or code in this repo.

```
@misc{li2023chatharuhi,
      title={ChatHaruhi: Reviving Anime Character in Reality via Large Language Model},
      author={Cheng Li and Ziang Leng and Chenxi Yan and Junyi Shen and Hao Wang and Weishi MI and Yaying Fei and Xiaoyang Feng and Song Yan and HaoSheng Wang and Linkang Zhan and Yaokai Jia and Pingyu Wu and Haozhen Sun},
      year={2023},
      eprint={2308.09597},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Provider: maas
Created: 2025-07-08