chimbiwide/NPC-Quest-Dialogue
Source: Hugging Face · Updated 2026-04-06 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/chimbiwide/NPC-Quest-Dialogue
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- roleplaying
- conversational
size_categories:
- 1K<n<10K
---
# NPC-Quest-Dialogue
A successor to NPC-Dialogue_v2
---
Hunting for good-quality NPC RP datasets is not an easy job, which is why we set out to create high-quality ones ourselves.
Our previous release, NPC-Dialogue_v2, received good feedback, so we used the same strategy to create NPC-Quest-Dialogue, a high-quality dataset for NPC behavior in video games that focuses specifically on quest-related conversations.
### Dataset Statistics
- Total source quests: 1,994 RPG quests
- Successful generations: 1,980 conversations (99.3% success rate)
- Failed/excluded: 14 (removed due to non-English output, broken role alternation, or generation failure)
- Total words: 1,426,822 (~720 words per conversation)
- Model used: DeepSeek v3.2 via Novita.ai (deepseek/deepseek-v3.2)
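As a quick sanity check, the derived figures above can be recomputed from the raw counts. This is only a minimal sketch; the variable names are illustrative and the inputs are copied from the statistics in this section.

```python
# Figures copied from the statistics above.
source_quests = 1_994
conversations = 1_980
total_words = 1_426_822

failed = source_quests - conversations
success_rate = conversations / source_quests * 100
words_per_conversation = total_words / conversations

print(f"failed/excluded: {failed}")                                        # 14
print(f"success rate: {success_rate:.1f}%")                                # 99.3%
print(f"mean words per conversation: {words_per_conversation:.1f}")        # 720.6
```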
### Conversation Structure
Each conversation in rpg-quests-dialogue.jsonl contains 13–19 alternating messages (median 17), structured as:
- Message 1: system — NPC roleplay prompt (name, background, current location, quest destination)
- Message 2: user — Player greeting ("Greetings")
- Messages 3–8: Opening phase — NPC greeting, rapport building, initial context
- Messages 9–14: Development phase — core quest interaction, deeper personality, main exchange
- Messages 15+: Resolution phase — natural conclusion with future hook
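The layout above can be checked and split programmatically. The sketch below is a rough illustration, not the authors' tooling: it assumes each JSONL line is an object with a `messages` list of `{"role", "content"}` entries and that the NPC speaks as `assistant`. Both are assumptions about the schema, so adjust the key and role names to match the actual file. The sample record is synthetic.

```python
import json

def parse_record(line):
    """Parse one JSONL line. Assumes a top-level "messages" list (hypothetical key)."""
    return json.loads(line)["messages"]

def is_well_formed(messages):
    """Check the layout described above: one system message, then user/assistant turns."""
    if not 13 <= len(messages) <= 19:
        return False
    if messages[0]["role"] != "system":
        return False
    expected = ["user", "assistant"]  # "assistant" for the NPC is an assumption
    return all(m["role"] == expected[i % 2] for i, m in enumerate(messages[1:]))

def split_phases(messages):
    """Split by the 1-based message numbers given in the list above."""
    return {
        "system": messages[0],
        "greeting": messages[1],
        "opening": messages[2:8],       # messages 3-8
        "development": messages[8:14],  # messages 9-14
        "resolution": messages[14:],    # messages 15+
    }

# Synthetic 17-message record (the median length), not real dataset content.
roles = ["system"] + ["user", "assistant"] * 8
sample_line = json.dumps(
    {"messages": [{"role": r, "content": f"msg {i + 1}"} for i, r in enumerate(roles)]}
)

messages = parse_record(sample_line)
phases = split_phases(messages)
print(is_well_formed(messages), len(phases["opening"]),
      len(phases["development"]), len(phases["resolution"]))
```

Note that a 13-message conversation would leave the `resolution` slice empty under these fixed boundaries, so the phase ranges are probably best read as approximate for shorter conversations.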
### Dataset Creation
We sampled 2k rows from **[dprashar/npc_dialogue_rpg_quests](https://huggingface.co/datasets/dprashar/npc_dialogue_rpg_quests)**.
The sampled rows then went through a two-phase pipeline:
1. Generate grounding information: the NPC name, NPC location description, quest location description, and player description.
2. Using the grounding information along with the original quest, generate the actual conversations with DeepSeek v3.2.
Phase 1 used a temperature of `1.15`, and Phase 2 used a temperature of `1.3`.
For the generation scripts, check our GitHub repo: **[Gemma3NPC](https://github.com/chimbiwide/Gemma3NPC/tree/main/datasets/rpg_quests)**
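The control flow of the two phases can be sketched as follows. This is not the code from the repository: the prompt wording, `run_pipeline`, and the `generate` callable are all hypothetical, and only the two temperatures come from the text above. The model call is injected as a plain function so the sketch runs offline with a stub.

```python
from typing import Callable

# Temperatures stated above; everything else in this sketch is hypothetical.
GROUNDING_TEMPERATURE = 1.15
DIALOGUE_TEMPERATURE = 1.3

# A completion function: (prompt, temperature) -> generated text.
Generate = Callable[[str, float], str]

def run_pipeline(quest: str, generate: Generate) -> dict:
    """Phase 1 builds grounding text; phase 2 uses it plus the quest to write the dialogue."""
    grounding_prompt = (
        "Write an NPC name, NPC location description, quest location description, "
        f"and player description for this quest:\n{quest}"
    )
    grounding = generate(grounding_prompt, GROUNDING_TEMPERATURE)

    dialogue_prompt = f"Quest:\n{quest}\n\nGrounding:\n{grounding}\n\nWrite the conversation."
    dialogue = generate(dialogue_prompt, DIALOGUE_TEMPERATURE)
    return {"quest": quest, "grounding": grounding, "dialogue": dialogue}

# Stub model so the sketch runs offline; swap in a real API client to generate text.
calls = []

def fake_generate(prompt: str, temperature: float) -> str:
    calls.append(temperature)
    return f"<output at T={temperature}>"

result = run_pipeline("Recover the stolen amulet.", fake_generate)
print(calls)  # [1.15, 1.3]
```

Passing the model call in as a parameter keeps the phase ordering testable without network access; a real client would wrap whatever API is used to reach `deepseek/deepseek-v3.2`.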
### Cost of Generation
As is our tradition, we report the API costs for this dataset.
**Phase 1 (Description Generation)**
- Input: ~353 tokens/row → ~705K total input tokens
- Output: ~560 tokens/row → ~1.1M total output tokens
**Phase 2 (Dialogue Generation)**
- Input: ~2,057 tokens/row → ~4.1M total input tokens
- Output: ~752 tokens/row → ~1.5M total output tokens
Grand Total: ~**7.4M** tokens (~4.8M input, ~2.6M output)
Phase 1 cost approximately $1.50 (2 runs) and Phase 2 cost approximately $6.39 (3 runs), for a total generation cost of ~*$7.89*.
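The totals above can be reproduced from the per-row figures. The sketch below assumes a single pass over all 1,994 source quests; that row count is an assumption, since the token totals appear to describe one pass while the dollar amounts cover multiple runs.

```python
ROWS = 1_994  # assumption: one pass over every source quest

# Approximate per-row token counts quoted above.
per_row = {
    "phase1": {"input": 353, "output": 560},
    "phase2": {"input": 2_057, "output": 752},
}

total_input = sum(p["input"] for p in per_row.values()) * ROWS
total_output = sum(p["output"] for p in per_row.values()) * ROWS
grand_total = total_input + total_output

# Reported spend in USD, covering all runs of each phase.
total_cost = round(1.50 + 6.39, 2)

print(f"input ~{total_input / 1e6:.1f}M, output ~{total_output / 1e6:.1f}M, "
      f"total ~{grand_total / 1e6:.1f}M")   # input ~4.8M, output ~2.6M, total ~7.4M
print(f"cost ${total_cost:.2f}")            # cost $7.89
```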
Provided by:
chimbiwide



