chimbiwide/NPC-Quest-Dialogue

Source: Hugging Face. Updated 2026-04-06; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/chimbiwide/NPC-Quest-Dialogue
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- roleplaying
- conversational
size_categories:
- 1K<n<10K
---

# NPC-Quest-Dialogue

A successor to NPC-Dialogue_v2

---

Finding good-quality NPC RP datasets is not easy, which is why we set out to create high-quality ones ourselves. Our previous release, NPC-Dialogue_v2, received good feedback, so we used the same strategy to create NPC-Quest-Dialogue, a high-quality dataset for NPC behavior in video games that focuses specifically on quest-related conversations.

### Dataset Statistics

- Total source quests: 1,994 RPG quests
- Successful generations: 1,980 conversations (99.3% success rate)
- Failed/excluded: 14 (removed due to non-English output, broken role alternation, or generation failure)
- Total words: 1,426,822 (~720 words per conversation)
- Model used: DeepSeek v3.2 via Novita.ai (deepseek/deepseek-v3.2)

### Conversation Structure

Each conversation in rpg-quests-dialogue.jsonl contains 13–19 alternating messages (median 17), structured as:

- Message 1 (system): NPC roleplay prompt (name, background, current location, quest destination)
- Message 2 (user): Player greeting ("Greetings")
- Messages 3–8 (opening phase): NPC greeting, rapport building, initial context
- Messages 9–14 (development phase): core quest interaction, deeper personality, main exchange
- Messages 15+ (resolution phase): natural conclusion with future hook

### Dataset Creation

We sampled 2k rows from **[dprashar/npc_dialogue_rpg_quests](https://huggingface.co/datasets/dprashar/npc_dialogue_rpg_quests)**.

The sampled rows then went through a two-stage pipeline:

1. Generate grounding information: NPC name, NPC location description, quest location description, and player description.
2. Using the grounding information along with the original quest, generate the actual conversations with DS V3.2.

Phase 1 used a temperature of `1.15`, and Phase 2 used a temperature of `1.3`.
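The layout described under Conversation Structure, together with the "broken role alternation" exclusion rule, can be checked with a short script. The following is a minimal sketch, not the authors' filtering code. It assumes each line of the JSONL file is an object with a `messages` list whose items carry `role` and `content` keys, and that the NPC turns use the role name `assistant`; neither assumption is confirmed by the card. The helper names `load_conversations`, `check_conversation`, and `phase_of` are hypothetical.

```python
import json


def load_conversations(path):
    """Yield the message list of each conversation in a JSONL file.

    Assumption: every line is a JSON object with a "messages" key.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)["messages"]


def check_conversation(messages):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    roles = [message["role"] for message in messages]
    if not 13 <= len(roles) <= 19:
        problems.append(f"unexpected length: {len(roles)}")
    if roles[:1] != ["system"]:
        problems.append("first message is not the system prompt")
    # After the system prompt, turns should alternate user, assistant, user, ...
    for index, role in enumerate(roles[1:]):
        expected = "user" if index % 2 == 0 else "assistant"
        if role != expected:
            problems.append(f"message {index + 2}: expected {expected}, got {role}")
            break  # report only the first alternation break
    return problems


def phase_of(position):
    """Map a 1-based message position to the phase named in the card."""
    if position == 1:
        return "system"
    if position == 2:
        return "greeting"
    if position <= 8:
        return "opening"
    if position <= 14:
        return "development"
    return "resolution"
```

Running `check_conversation` over every item from `load_conversations("rpg-quests-dialogue.jsonl")` should return an empty list for each conversation if the assumed schema holds.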
For the generation scripts, see our GitHub repo: **[Gemma3NPC](https://github.com/chimbiwide/Gemma3NPC/tree/main/datasets/rpg_quests)**

### Cost of Generation

As a tradition, we note our API costs for this dataset.

**Phase 1 (Description Generation)**

- Input: ~353 tokens/row → ~705K total input tokens
- Output: ~560 tokens/row → ~1.1M total output tokens

**Phase 2 (Dialogue Generation)**

- Input: ~2,057 tokens/row → ~4.1M total input tokens
- Output: ~752 tokens/row → ~1.5M total output tokens

Grand Total: ~**7.4M** tokens (~4.8M input, ~2.6M output)

Phase 1 cost approximately $1.50 (2 runs), and Phase 2 cost approximately $6.39 (3 runs), for a total generation cost of ~*$7.89*.
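The reported totals can be reproduced from the per-row figures. The sketch below is only an arithmetic check. It assumes a row count of 1,994 (the source-quest figure from the statistics; the card also describes the sample as "2k rows") and treats the per-row token counts as a single pass over the data, which the card does not state explicitly. The names `ROWS`, `PER_ROW`, and `token_totals` are hypothetical.

```python
ROWS = 1_994  # assumption: source quests reported in the dataset statistics

# Approximate tokens per row as reported in the card: (input, output)
PER_ROW = {
    "phase1": (353, 560),
    "phase2": (2_057, 752),
}


def token_totals(rows, per_row):
    """Return (total_input, total_output) tokens across all phases."""
    total_input = sum(inp for inp, _ in per_row.values()) * rows
    total_output = sum(out for _, out in per_row.values()) * rows
    return total_input, total_output


total_in, total_out = token_totals(ROWS, PER_ROW)
total_cost = round(1.50 + 6.39, 2)  # Phase 1 + Phase 2, in USD

print(f"input:  ~{total_in / 1e6:.1f}M tokens")
print(f"output: ~{total_out / 1e6:.1f}M tokens")
print(f"total:  ~{(total_in + total_out) / 1e6:.1f}M tokens, ~${total_cost}")
```

Under these assumptions the script agrees with the card's rounded figures of roughly 4.8M input tokens, 2.6M output tokens, and 7.4M tokens overall.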
Provided by:
chimbiwide