brandburner/startrektng-mega-narrative-kg
Source: Hugging Face. Updated 2026-03-14; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/brandburner/startrektng-mega-narrative-kg
Resource summary:
---
license: cc-by-sa-4.0
task_categories:
- graph-ml
- text-generation
tags:
- narrative
- knowledge-graph
- screenplay
- fabula
- neo4j
- graph-gravity
size_categories:
- 1K<n<10K
---
# Star Trek: The Next Generation - Narrative Knowledge Graph
A rich narrative knowledge graph extracted from *Star Trek: The Next Generation* screenplays using the
[Fabula](https://fabula.productions) pipeline. It contains characters,
locations, objects, organizations, events, themes, and conflict arcs, with full
participation semantics and Graph Gravity importance tiers.
## Dataset Overview
| Metric | Value |
|--------|-------|
| Source database | `startrektng.mega` |
| Type | Megagraph (cross-season merged) |
| Episodes | 177 |
| Seasons merged | 1, 2, 3, 4, 5, 6, 7 |
| Total nodes | 50,646 |
| Total edges | 134,915 |
| Schema version | 1.1.0 |
| Exported | 2026-03-14 |
### Entity Breakdown
| Type | Count |
|------|-------|
| Act | 935 |
| Agent | 2,102 |
| ConflictArc | 588 |
| Episode | 177 |
| Event | 9,333 |
| Location | 2,464 |
| Object | 8,364 |
| Organization | 790 |
| PlotBeat | 20,286 |
| SceneBoundary | 5,378 |
| Theme | 95 |
| Writer | 134 |
### Graph Gravity Tiers
| Tier | Count | Description |
|------|-------|-------------|
| anchor | 306 | Main characters / key locations |
| planet | 860 | Recurring entities |
| asteroid | 12,554 | Minor / one-off entities |
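
As a quick consistency check on the tables above, the entity counts sum to the stated node total, and the three tier counts sum to the combined Agent, Location, Object, and Organization counts. The second match suggests, though this card does not say so explicitly, that Graph Gravity tiers are assigned only to those four entity types. The numbers below are copied from the tables; the variable names are illustrative.

```python
# Counts copied from the Entity Breakdown and Graph Gravity Tiers tables.
entity_counts = {
    "Act": 935, "Agent": 2102, "ConflictArc": 588, "Episode": 177,
    "Event": 9333, "Location": 2464, "Object": 8364, "Organization": 790,
    "PlotBeat": 20286, "SceneBoundary": 5378, "Theme": 95, "Writer": 134,
}
tier_counts = {"anchor": 306, "planet": 860, "asteroid": 12554}

total_nodes = sum(entity_counts.values())
tiered_total = sum(tier_counts.values())

# Assumption (not stated in the card): these four types are the tiered ones.
tiered_types = ("Agent", "Location", "Object", "Organization")
tiered_type_total = sum(entity_counts[t] for t in tiered_types)

print(total_nodes)        # 50646, matching "Total nodes"
print(tiered_total)       # 13720
print(tiered_type_total)  # 13720
```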
### Relationship Types
`AFFILIATED_WITH`, `BELONGS_TO_EPISODE`, `CALLBACK`, `CAUSAL`, `CHARACTER_CONTINUITY`, `CONTAINS_BEAT`, `CREDITED_ON`, `EMOTIONAL_ECHO`, `ESCALATION`, `EXEMPLIFIES_THEME`, `FORESHADOWING`, `INVOLVED_IN_ARC`, `INVOLVED_WITH`, `IN_EVENT`, `NARRATIVELY_FOLLOWS`, `OCCURS_IN`, `OWNS`, `PARTICIPATED_AS`, `PART_OF`, `PART_OF_ACT` ... and 7 more
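
Because the list above is truncated, the full set of relationship types is easiest to recover from the data itself. The sketch below counts types over edge records. `sample_edges` is a made-up stand-in for rows of `edges.parquet` (for example `edges.to_dict("records")` with pandas), and `count_relationship_types` is a hypothetical helper name.

```python
from collections import Counter

def count_relationship_types(edge_rows):
    """Return a Counter of relationship_type values over edge records."""
    return Counter(row["relationship_type"] for row in edge_rows)

# Hypothetical rows shaped like edges.parquet (IDs shortened for readability).
sample_edges = [
    {"source_node_id": "a1", "target_node_id": "e1", "relationship_type": "PARTICIPATED_AS"},
    {"source_node_id": "a2", "target_node_id": "e1", "relationship_type": "PARTICIPATED_AS"},
    {"source_node_id": "e1", "target_node_id": "e2", "relationship_type": "CAUSAL"},
]

type_counts = count_relationship_types(sample_edges)
for rel_type, count in type_counts.most_common():
    print(rel_type, count)
```

With a pandas DataFrame, `edges["relationship_type"].value_counts()` gives the same tally in one call.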
## Megagraph vs. Season Datasets
This is a **megagraph**: a cross-season unified knowledge graph, not a simple concatenation of per-season datasets.
Key differences from individual season datasets:
- **Unified entity identities**: Cross-season entities (recurring characters, locations, organizations) are reconciled through a Global Entity Registry (GER) and assigned new canonical UUIDs. A reconciled character therefore has a *different* `node_id` here than in any individual season dataset.
- **Distilled descriptions**: Entity descriptions may be rewritten during GER reconciliation to reflect a character's full arc rather than a single season's perspective.
- **Cross-season Graph Gravity**: Tier assignments (anchor/planet/asteroid) reflect importance across all 177 episodes. An entity that is "planet" tier in one season may become "anchor" in the megagraph because it recurs across multiple seasons.
- **Season-unique entities preserved**: Entities appearing in only one season are transferred with their original UUIDs and properties.
- **Cross-season relationship topology**: The megagraph contains participation and narrative connection patterns that span season boundaries.
**For single-season analysis**, use the individual season datasets:
- [Season 1](https://huggingface.co/datasets/brandburner/startrektng-s01-narrative-kg)
- [Season 2](https://huggingface.co/datasets/brandburner/startrektng-s02-narrative-kg)
- [Season 3](https://huggingface.co/datasets/brandburner/startrektng-s03-narrative-kg)
- [Season 4](https://huggingface.co/datasets/brandburner/startrektng-s04-narrative-kg)
- [Season 5](https://huggingface.co/datasets/brandburner/startrektng-s05-narrative-kg)
- [Season 6](https://huggingface.co/datasets/brandburner/startrektng-s06-narrative-kg)
- [Season 7](https://huggingface.co/datasets/brandburner/startrektng-s07-narrative-kg)
**For cross-season analysis** (character arcs, thematic evolution, entity importance across the full series), use this megagraph.
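
One way to start a cross-season analysis is to look for entities whose appearances span a long stretch of the series. The sketch below assumes `first_episode_seq` and `last_episode_seq` are comparable integer positions in series order (this card does not define the numbering), and it skips rows where either value is null. `episode_span`, the threshold of 100, and the sample rows are all hypothetical.

```python
def episode_span(node):
    """Return last - first + 1 for a node record, or None if either is missing."""
    first = node.get("first_episode_seq")
    last = node.get("last_episode_seq")
    if first is None or last is None:
        return None
    return last - first + 1

# Hypothetical node rows; names and values are invented for illustration.
sample_nodes = [
    {"name": "Entity A", "tier": "anchor", "first_episode_seq": 1, "last_episode_seq": 177},
    {"name": "Entity B", "tier": "asteroid", "first_episode_seq": 42, "last_episode_seq": 42},
    {"name": "Entity C", "tier": None, "first_episode_seq": None, "last_episode_seq": None},
]

spans = {n["name"]: episode_span(n) for n in sample_nodes}

# Keep entities whose appearances span at least 100 episode positions.
long_running = [name for name, span in spans.items() if span is not None and span >= 100]
print(long_running)
```

When the rows come from pandas, nulls arrive as `NaN` or `pd.NA` rather than `None`, so filtering with `notna()` before calling a helper like this is the safer route.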
## Files
| File | Description |
|------|-------------|
| `nodes.parquet` | All graph nodes with properties |
| `edges.parquet` | All relationships with properties |
| `positions.parquet` | 3D layout coordinates for visualization |
| `meta.json` | Dataset metadata and entity counts |
## Schema
### Nodes (`nodes.parquet`)
| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Unique node identifier (UUID) |
| `primary_label` | string | Node type (Agent, Location, Event, etc.) |
| `name` | string | Display name |
| `description` | string | Foundational description |
| `tier` | string (nullable) | Graph Gravity tier: anchor / planet / asteroid |
| `episode_count` | int (nullable) | Number of distinct episodes the entity appears in |
| `first_episode_seq` | int (nullable) | Sequence number of the episode of first appearance |
| `last_episode_seq` | int (nullable) | Sequence number of the episode of last appearance |
| `properties_json` | string | Full node properties as JSON |
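
Since `properties_json` is stored as a string, it has to be decoded before individual properties can be read. A minimal sketch follows; the keys inside the sample JSON are invented, because this card does not list which properties the field contains, and `parse_properties` is a hypothetical helper.

```python
import json

def parse_properties(row):
    """Decode properties_json into a dict; empty or missing values become {}."""
    raw = row.get("properties_json")
    return json.loads(raw) if raw else {}

# Hypothetical node row; the keys inside properties_json are invented.
sample_node = {
    "node_id": "00000000-0000-0000-0000-000000000001",
    "primary_label": "Agent",
    "name": "Example Agent",
    "properties_json": '{"example_key": "example value", "aliases": ["Ex"]}',
}

props = parse_properties(sample_node)
print(sorted(props))
```

The same helper works for the `properties_json` column of `edges.parquet`, which is also a JSON string.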
### Edges (`edges.parquet`)
| Column | Type | Description |
|--------|------|-------------|
| `source_node_id` | string | Source node UUID |
| `target_node_id` | string | Target node UUID |
| `relationship_type` | string | Relationship type (e.g., PARTICIPATED_AS) |
| `properties_json` | string | Edge properties as JSON |
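
Edges reference nodes only by UUID, so reading them usually means joining back to the node table for names. The sketch below does this with a plain dictionary lookup over row records. `describe_edges` is a hypothetical helper, and the sample rows, including the direction chosen for the `OCCURS_IN` edge, are invented for illustration.

```python
def describe_edges(node_rows, edge_rows, relationship_type):
    """Yield (source name, relationship, target name) for edges of one type."""
    names = {n["node_id"]: n["name"] for n in node_rows}
    for e in edge_rows:
        if e["relationship_type"] == relationship_type:
            # Fall back to the raw ID if an endpoint is missing from nodes.
            src = names.get(e["source_node_id"], e["source_node_id"])
            dst = names.get(e["target_node_id"], e["target_node_id"])
            yield src, relationship_type, dst

# Hypothetical rows shaped like nodes.parquet and edges.parquet.
sample_nodes = [
    {"node_id": "ev1", "primary_label": "Event", "name": "Example Event"},
    {"node_id": "loc1", "primary_label": "Location", "name": "Example Location"},
]
sample_edges = [
    {"source_node_id": "ev1", "target_node_id": "loc1", "relationship_type": "OCCURS_IN"},
    {"source_node_id": "ev1", "target_node_id": "missing", "relationship_type": "CAUSAL"},
]

triples = list(describe_edges(sample_nodes, sample_edges, "OCCURS_IN"))
print(triples)
```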
### Positions (`positions.parquet`)
| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Node UUID |
| `x`, `y`, `z` | float | 3D coordinates |
| `size` | float | Node size (Graph Gravity weighted) |
| `r`, `g`, `b` | int | RGB color by entity type |
| `community` | int | Louvain community index |
| `tier` | string (nullable) | Graph Gravity tier |
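
For plotting, the layout columns typically need two small transformations: turning the `r`, `g`, `b` integers into a color value and grouping nodes by `community`. The sketch below assumes the color channels are in the 0-255 range, which this card does not state. Both helper names and the sample rows are hypothetical.

```python
from collections import defaultdict

def rgb_to_hex(r, g, b):
    """Format integer channels (assumed 0-255) as a '#rrggbb' string."""
    return f"#{int(r):02x}{int(g):02x}{int(b):02x}"

def group_by_community(position_rows):
    """Map community index -> list of node_ids in that community."""
    groups = defaultdict(list)
    for row in position_rows:
        groups[row["community"]].append(row["node_id"])
    return dict(groups)

# Hypothetical rows shaped like positions.parquet.
sample_positions = [
    {"node_id": "n1", "x": 0.0, "y": 1.0, "z": 2.0, "size": 3.5,
     "r": 255, "g": 128, "b": 0, "community": 0},
    {"node_id": "n2", "x": 1.0, "y": 0.5, "z": 0.0, "size": 1.0,
     "r": 0, "g": 0, "b": 255, "community": 0},
    {"node_id": "n3", "x": -1.0, "y": 2.0, "z": 0.5, "size": 1.0,
     "r": 0, "g": 0, "b": 255, "community": 1},
]

colors = {row["node_id"]: rgb_to_hex(row["r"], row["g"], row["b"])
          for row in sample_positions}
communities = group_by_community(sample_positions)
print(colors["n1"], communities)
```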
## Usage
```python
import networkx as nx
import pandas as pd
from datasets import load_dataset

# Load from Hugging Face
ds = load_dataset("brandburner/startrektng-mega-narrative-kg")

# Or load the Parquet files directly (after downloading them locally)
nodes = pd.read_parquet("nodes.parquet")
edges = pd.read_parquet("edges.parquet")

# Filter to anchor-tier characters
anchors = nodes[(nodes['primary_label'] == 'Agent') & (nodes['tier'] == 'anchor')]

# Build a NetworkX graph.
# Note: DiGraph keeps one edge per (source, target) pair, so a later
# relationship between the same two nodes overwrites an earlier one.
G = nx.DiGraph()
for _, n in nodes.iterrows():
    G.add_node(n['node_id'], label=n['primary_label'], name=n['name'])
for _, e in edges.iterrows():
    G.add_edge(e['source_node_id'], e['target_node_id'], type=e['relationship_type'])
```
## Citation
```bibtex
@misc{fabula_startrektng_mega,
title = {Star Trek The Next Generation Narrative Knowledge Graph},
author = {Fabula Pipeline},
year = {2026},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/datasets/brandburner/startrektng-mega-narrative-kg}}
}
```
## License
CC BY-SA 4.0
Provider:
brandburner



