
brandburner/startrektng-mega-narrative-kg

Source: Hugging Face. Updated 2026-03-14. Indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/brandburner/startrektng-mega-narrative-kg
Description:
---
license: cc-by-sa-4.0
task_categories:
- graph-ml
- text-generation
tags:
- narrative
- knowledge-graph
- screenplay
- fabula
- neo4j
- graph-gravity
size_categories:
- 1K<n<10K
---

# Star Trek The Next Generation - Narrative Knowledge Graph

A rich narrative knowledge graph extracted from *Star Trek The Next Generation* screenplays using the [Fabula](https://fabula.productions) pipeline. Contains characters, locations, objects, organizations, events, themes, and conflict arcs with full participation semantics and Graph Gravity importance tiers.

## Dataset Overview

| Metric | Value |
|--------|-------|
| Source database | `startrektng.mega` |
| Type | Megagraph (cross-season merged) |
| Episodes | 177 |
| Seasons merged | 1, 2, 3, 4, 5, 6, 7 |
| Total nodes | 50,646 |
| Total edges | 134,915 |
| Schema version | 1.1.0 |
| Exported | 2026-03-14 |

### Entity Breakdown

| Type | Count |
|------|-------|
| Act | 935 |
| Agent | 2,102 |
| ConflictArc | 588 |
| Episode | 177 |
| Event | 9,333 |
| Location | 2,464 |
| Object | 8,364 |
| Organization | 790 |
| PlotBeat | 20,286 |
| SceneBoundary | 5,378 |
| Theme | 95 |
| Writer | 134 |

### Graph Gravity Tiers

| Tier | Count | Description |
|------|-------|-------------|
| anchor | 306 | Main characters / key locations |
| planet | 860 | Recurring entities |
| asteroid | 12,554 | Minor / one-off entities |

### Relationship Types

`AFFILIATED_WITH`, `BELONGS_TO_EPISODE`, `CALLBACK`, `CAUSAL`, `CHARACTER_CONTINUITY`, `CONTAINS_BEAT`, `CREDITED_ON`, `EMOTIONAL_ECHO`, `ESCALATION`, `EXEMPLIFIES_THEME`, `FORESHADOWING`, `INVOLVED_IN_ARC`, `INVOLVED_WITH`, `IN_EVENT`, `NARRATIVELY_FOLLOWS`, `OCCURS_IN`, `OWNS`, `PARTICIPATED_AS`, `PART_OF`, `PART_OF_ACT` ... and 7 more

## Megagraph vs. Season Datasets

This is a **megagraph**: a cross-season unified knowledge graph, not a simple concatenation of per-season datasets.

Key differences from individual season datasets:

- **Unified entity identities**: Cross-season entities (recurring characters, locations, organizations) are reconciled through a Global Entity Registry (GER) and assigned new canonical UUIDs. The same character will have a *different* `node_id` here than in any individual season dataset.
- **Distilled descriptions**: Entity descriptions may be rewritten during GER reconciliation to reflect a character's full arc rather than a single season's perspective.
- **Cross-season Graph Gravity**: Tier assignments (anchor/planet/asteroid) reflect importance across all 177 episodes. An entity that is "planet" tier in one season may become "anchor" in the megagraph because they recur across multiple seasons.
- **Season-unique entities preserved**: Entities appearing in only one season are transferred with their original UUIDs and properties.
- **Cross-season relationship topology**: The megagraph contains participation and narrative connection patterns that span season boundaries.

**For single-season analysis**, use the individual season datasets:

- [Season 1](https://huggingface.co/datasets/brandburner/startrektng-s01-narrative-kg)
- [Season 2](https://huggingface.co/datasets/brandburner/startrektng-s02-narrative-kg)
- [Season 3](https://huggingface.co/datasets/brandburner/startrektng-s03-narrative-kg)
- [Season 4](https://huggingface.co/datasets/brandburner/startrektng-s04-narrative-kg)
- [Season 5](https://huggingface.co/datasets/brandburner/startrektng-s05-narrative-kg)
- [Season 6](https://huggingface.co/datasets/brandburner/startrektng-s06-narrative-kg)
- [Season 7](https://huggingface.co/datasets/brandburner/startrektng-s07-narrative-kg)

**For cross-season analysis** (character arcs, thematic evolution, entity importance across the full series), use this megagraph.
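
Because recurring entities receive new canonical UUIDs in the megagraph, `node_id` cannot be used to line up a season dataset with this one. The following is a minimal sketch of one possible workaround, not part of the Fabula pipeline: it joins the two `nodes` tables on `(primary_label, name)` and flags identifier and tier differences. The function name `compare_tiers` is hypothetical, the join key is an assumption (names may be rewritten during reconciliation or may collide), and the rows below are made-up toy data rather than real dataset content.

```python
import pandas as pd


def compare_tiers(season_nodes: pd.DataFrame, mega_nodes: pd.DataFrame) -> pd.DataFrame:
    """Heuristically align season and megagraph entities and compare them.

    ASSUMPTION: entities keep the same (primary_label, name) pair in both
    datasets. Treat the matches as candidates, not ground truth.
    """
    keys = ["primary_label", "name"]
    cols = keys + ["node_id", "tier"]
    merged = season_nodes[cols].merge(
        mega_nodes[cols], on=keys, how="inner", suffixes=("_season", "_mega")
    )
    # Season-unique entities are expected to keep their UUID; reconciled ones are not.
    merged["same_node_id"] = merged["node_id_season"] == merged["node_id_mega"]
    # tier is nullable, so fill missing values before comparing.
    merged["tier_changed"] = merged["tier_season"].fillna("") != merged["tier_mega"].fillna("")
    return merged


# Toy rows for illustration only (placeholder names and IDs).
season = pd.DataFrame({
    "node_id": ["s-1", "s-2", "s-3"],
    "primary_label": ["Agent", "Location", "Object"],
    "name": ["Entity A", "Place B", "Thing C"],
    "tier": ["planet", "anchor", "asteroid"],
})
mega = pd.DataFrame({
    "node_id": ["m-1", "m-2", "s-3"],
    "primary_label": ["Agent", "Location", "Object"],
    "name": ["Entity A", "Place B", "Thing C"],
    "tier": ["anchor", "anchor", "asteroid"],
})

result = compare_tiers(season, mega)
print(result[["name", "same_node_id", "tier_changed"]])
```

With real data, `season_nodes` and `mega_nodes` would come from `pd.read_parquet("nodes.parquet")` on the respective downloads. An inner join drops entities that exist in only one of the two tables, which is usually what a tier comparison wants.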

## Files

| File | Description |
|------|-------------|
| `nodes.parquet` | All graph nodes with properties |
| `edges.parquet` | All relationships with properties |
| `positions.parquet` | 3D layout coordinates for visualization |
| `meta.json` | Dataset metadata and entity counts |

## Schema

### Nodes (`nodes.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Unique node identifier (UUID) |
| `primary_label` | string | Node type (Agent, Location, Event, etc.) |
| `name` | string | Display name |
| `description` | string | Foundational description |
| `tier` | string (nullable) | Graph Gravity tier: anchor / planet / asteroid |
| `episode_count` | int (nullable) | Number of distinct episodes entity appears in |
| `first_episode_seq` | int (nullable) | First appearance episode |
| `last_episode_seq` | int (nullable) | Last appearance episode |
| `properties_json` | string | Full node properties as JSON |

### Edges (`edges.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `source_node_id` | string | Source node UUID |
| `target_node_id` | string | Target node UUID |
| `relationship_type` | string | Relationship type (e.g., PARTICIPATED_AS) |
| `properties_json` | string | Edge properties as JSON |

### Positions (`positions.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Node UUID |
| `x`, `y`, `z` | float | 3D coordinates |
| `size` | float | Node size (Graph Gravity weighted) |
| `r`, `g`, `b` | int | RGB color by entity type |
| `community` | int | Louvain community index |
| `tier` | string (nullable) | Graph Gravity tier |

## Usage

```python
from datasets import load_dataset
import networkx as nx
import pandas as pd

# Load from HuggingFace
ds = load_dataset("brandburner/startrektng-mega-narrative-kg")

# Or load parquet directly
nodes = pd.read_parquet("nodes.parquet")
edges = pd.read_parquet("edges.parquet")

# Filter to anchor characters
anchors = nodes[(nodes['primary_label'] == 'Agent') & (nodes['tier'] == 'anchor')]

# Build a NetworkX graph. MultiDiGraph keeps parallel edges, so two nodes
# linked by several relationship types retain every edge (a plain DiGraph
# would keep only the last edge added for each source/target pair).
G = nx.MultiDiGraph()
for _, n in nodes.iterrows():
    G.add_node(n['node_id'], label=n['primary_label'], name=n['name'])
for _, e in edges.iterrows():
    G.add_edge(e['source_node_id'], e['target_node_id'], type=e['relationship_type'])
```

## Citation

```bibtex
@misc{fabula_startrektng_mega,
  title = {Star Trek The Next Generation Narrative Knowledge Graph},
  author = {Fabula Pipeline},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/datasets/brandburner/startrektng-mega-narrative-kg}}
}
```

## License

CC BY-SA 4.0

Provider: brandburner