CharcardCodex
Source: ModelScope community (魔搭社区). Updated 2026-01-02, indexed 2025-03-29.
Download link:
https://modelscope.cn/datasets/Gryphe/CharcardCodex
⚠️ **WARNING** ⚠️ This dataset is an attempt at creating a semblance of order in the madness that is character roleplaying cards. Many of these are highly NSFW in nature and may potentially describe disturbing scenes. **Consider yourself very thoroughly warned!**
# So, what is CharcardCodex?
The main purpose of this dataset is to serve as a source of original, human-created ideas, far outside the comfort zone a language model would stay in if prompted for inspiration. The additional metrics described below let you filter the scenarios before using them in your own dataset creation pipelines.
This collection was created by having an unrestrained Opus analyze over 50,000 roleplaying character cards from various sources, filtering out the more extreme cases and deduplicating them heavily, before finally enriching them with more metrics and a story creation prompt.
Note that there are still plenty of duplicate characters in this collection, as they may feature in wildly different scenarios. Despite the thorough review I performed, I also make no guarantee that each entry is 100% correct.
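Since the same character can legitimately appear under several scenarios, anyone who wants at most one entry per character has to collapse them downstream. The sketch below is one way to spot such repeats by grouping on a normalized `name`. It assumes each entry is available as a Python dict keyed by the field names listed under Field descriptions; the sample records and the `group_by_name` helper are hypothetical and not part of the dataset.

```python
from collections import defaultdict


def group_by_name(records):
    """Group entries by a case-insensitive, whitespace-trimmed name."""
    groups = defaultdict(list)
    for record in records:
        key = (record.get("name") or "").strip().lower()
        groups[key].append(record)
    return dict(groups)


# Hypothetical sample records, not taken from the dataset.
sample = [
    {"id": 1, "name": "Mira", "scenario": "A tavern at closing time"},
    {"id": 2, "name": "mira ", "scenario": "A derelict space station"},
    {"id": 3, "name": "Tobias", "scenario": "A rainy train platform"},
]

groups = group_by_name(sample)

# Names that occur in more than one entry, mapped to the ids involved.
repeated = {
    name: [entry["id"] for entry in entries]
    for name, entries in groups.items()
    if len(entries) > 1
}
print(repeated)
```

Exact-name matching is deliberately crude: it will miss renamed variants of the same character and will merge unrelated characters that happen to share a name, so treat the result as a list of candidates to review.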
## Field descriptions
**id** - Internal identifier
**type** - Either 'CHARACTER' (a single character) or 'GROUP' (multiple characters)
**name** - Name of the character(s)
**species** - Species of the character(s)
**gender** - Gender of the character(s)
**age** - Age of the character(s)
**appearance** - A detailed appearance of the character(s)
**personality** - A personality description of the character(s)
**setting** - A description of the setting in which the scenario takes place
**scenario** - A description of the scenario, either extracted from the card or deduced from the available information
**notes** - Important details that do not fit any of the other fields
**objective** - What is the most likely objective of the user interacting with it?
**user_role** - What role is the user playing in this scenario?
**nsfw_level** - One of NONE, LOW, MEDIUM, or HIGH:
```
NONE = This scenario has no focus on sex or actively discourages it.
LOW = This scenario might turn sexual if the user puts effort into it.
MEDIUM = The character is open to sexual advances and would not require much persuasion.
HIGH = The character and the user are already involved in sex, seconds away from it, or just had sex.
```
**fetishes** - An optional list of the three most prominent fetishes featuring in this entry
**story_prompt** - A very detailed instructional prompt telling a language model to write a story about this entry. Ambiguous user role entries are replaced with a fictional identity. There was a high variance in how explicit Opus wanted each story to be, which I kept on purpose to promote a varied dataset if one were to use the prompts directly.
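As an illustration of filtering on these fields, here is a minimal sketch that treats `nsfw_level` as an ordered scale and keeps only entries at or below a chosen ceiling. It assumes entries are plain dicts using the field names above; the `NSFW_ORDER` mapping, the `filter_by_nsfw` helper, and the sample records are hypothetical. Dropping entries with a missing or unrecognized level is a conservative choice made for this sketch, not something the dataset prescribes.

```python
# Rank the four documented levels from least to most explicit.
NSFW_ORDER = {"NONE": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3}


def filter_by_nsfw(records, max_level="LOW"):
    """Keep entries whose nsfw_level is at or below max_level.

    Entries with a missing or unrecognized level are dropped.
    """
    ceiling = NSFW_ORDER[max_level]
    kept = []
    for record in records:
        level = record.get("nsfw_level")
        rank = NSFW_ORDER.get(level.upper()) if isinstance(level, str) else None
        if rank is not None and rank <= ceiling:
            kept.append(record)
    return kept


# Hypothetical sample records, not taken from the dataset.
sample = [
    {"id": 1, "type": "CHARACTER", "nsfw_level": "NONE"},
    {"id": 2, "type": "GROUP", "nsfw_level": "HIGH"},
    {"id": 3, "type": "CHARACTER", "nsfw_level": "LOW"},
    {"id": 4, "type": "CHARACTER", "nsfw_level": None},
]

safe_ids = [entry["id"] for entry in filter_by_nsfw(sample, max_level="LOW")]
print(safe_ids)
```

The same pattern extends to the other fields, for example restricting `type` to 'CHARACTER' before passing each `story_prompt` into a generation pipeline.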
## Credits
All those wonderful roleplay character card creators for making this dataset possible. Some of you made me feel emotional, and some of you scarred me for life. I have no regrets either way.
## Feedback
If you find any strange formatting errors, please let me know! I'll see what I can do to fix them.
Provider: maas
Created: 2025-03-27