Narrative Motif Engine — corpus, generated stories, and embeddings

Name: Narrative Motif Engine — corpus, generated stories, and embeddings
Creator: Zenodo
Published: 2026-05-04 09:05:48
License: No description available

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.

Download link:

https://zenodo.org/doi/10.5281/zenodo.20010023

Description:

Companion dataset for the Narrative Motif Engine codebase (github.com/IanMcGarryUL/Narrative-Motif-Engine). It contains the original corpus (970 stories, 9,700 thematic propositions); 106,699 LLM-generated stylistic variations of those propositions, each with a 768-d embedding; 100 pipeline-generated and 100 control-generated stories with their canonical (style-erased) embeddings and 22,000 stylistic variations; three independent passes of style-erased canonical vectors for the original corpus; hierarchical and k-NN cluster assignments at k=72; Kernel-PCA scores (PC1-PC16); and geographical metadata mapping countries to regions. The dataset comprises 17 CSV files (~4.3 GB total), plus 2 optional PostgreSQL dumps (~1.4 GB combined). All embeddings are stored as JSON-encoded strings for portability. See docs/dataset_card.md in the codebase for the full file manifest, generation methodology, and reproduction steps.
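
Because the embeddings are stored as JSON-encoded strings inside CSV cells, each cell has to be decoded before it can be used numerically. The following is a minimal sketch of that step using only the Python standard library. It assumes each cell holds a JSON array of numbers; the column name `embedding` and the helper names `parse_embedding` and `iter_embeddings` are hypothetical and are not taken from the dataset card, so check docs/dataset_card.md for the actual file and column names.

```python
import csv
import io
import json

# 768 is the embedding width stated in the description for the stylistic
# variations; other files may differ, so pass expected_dim=None to skip the check.
EXPECTED_DIM = 768


def parse_embedding(cell, expected_dim=EXPECTED_DIM):
    """Decode one JSON-encoded embedding cell into a list of floats.

    Assumption: the cell is a JSON array of numbers, e.g. "[0.12, -0.5, ...]".
    """
    vector = json.loads(cell)
    if not isinstance(vector, list):
        raise ValueError("embedding cell is not a JSON array")
    if expected_dim is not None and len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(vector)}"
        )
    return [float(x) for x in vector]


def iter_embeddings(csv_file, column="embedding", expected_dim=EXPECTED_DIM):
    """Yield (row, vector) pairs from an open CSV file object.

    `column` is a hypothetical column name; rows are streamed one at a time
    so a multi-gigabyte file does not have to fit in memory.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield row, parse_embedding(row[column], expected_dim)


if __name__ == "__main__":
    # Build a tiny in-memory CSV that mimics the assumed layout.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "embedding"])
    writer.writerow(["1", json.dumps([0.5] * EXPECTED_DIM)])
    buffer.seek(0)

    for row, vector in iter_embeddings(buffer):
        print(row["id"], len(vector))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `iter_embeddings`. Streaming row by row is a deliberate choice here, since the CSV files total several gigabytes.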

Provider:

Zenodo

Created:

2026-05-04
