brandburner/doctorwho-s04-narrative-kg

Source: Hugging Face · Updated: 2026-04-08 · Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/brandburner/doctorwho-s04-narrative-kg
Resource description:
---
license: cc-by-sa-4.0
task_categories:
- graph-ml
- text-generation
tags:
- narrative
- knowledge-graph
- screenplay
- fabula
- neo4j
- graph-gravity
size_categories:
- 1K<n<10K
---

# Doctor Who - Narrative Knowledge Graph

A rich narrative knowledge graph extracted from *Doctor Who* screenplays using the [Fabula](https://fabula.productions) pipeline. Contains characters, locations, objects, organizations, events, themes, and conflict arcs with full participation semantics and Graph Gravity importance tiers.

## Dataset Overview

| Metric | Value |
|--------|-------|
| Source database | `doctorwho.s04` |
| Type | Season database |
| Episodes | 43 |
| Total nodes | 6,388 |
| Total edges | 22,039 |
| Schema version | 1.1.0 |
| Exported | 2026-04-08 |

### Entity Breakdown

| Type | Count |
|------|-------|
| Act | 117 |
| Agent | 308 |
| ConflictArc | 180 |
| Episode | 43 |
| Event | 1,147 |
| Location | 373 |
| Object | 1,191 |
| Organization | 133 |
| PlotBeat | 1,960 |
| SceneBoundary | 740 |
| Theme | 196 |

### Graph Gravity Tiers

| Tier | Count | Description |
|------|-------|-------------|
| anchor | 33 | Main characters / key locations |
| planet | 307 | Recurring entities |
| asteroid | 1,665 | Minor / one-off entities |

### Relationship Types

`AFFILIATED_WITH`, `BELONGS_TO_EPISODE`, `CALLBACK`, `CAUSAL`, `CHARACTER_CONTINUITY`, `CONTAINS_ACT`, `CONTAINS_BEAT`, `CONTAINS_SCENE`, `ESCALATION`, `EXEMPLIFIES_THEME`, `FORESHADOWING`, `INVOLVED_IN_ARC`, `INVOLVED_WITH`, `IN_EVENT`, `NARRATIVELY_FOLLOWS`, `OCCURS_IN`, `PARTICIPATED_AS`, `PART_OF`, `PART_OF_ACT`, `PART_OF_ARC` ... and 4 more

## Related Datasets

This is a **single-season** dataset containing entities and events as extracted from Season 4 screenplays.

- **Megagraph** (all seasons unified): [brandburner/doctorwho-mega-narrative-kg](https://huggingface.co/datasets/brandburner/doctorwho-mega-narrative-kg)

> **Note:** The megagraph is *not* a simple union of season datasets. Cross-season entities are reconciled through a Global Entity Registry (GER), receiving new canonical UUIDs and distilled descriptions. Graph Gravity tiers are recalculated across all episodes. Use individual season datasets for single-season analysis; use the megagraph for cross-season analysis.

## Files

| File | Description |
|------|-------------|
| `nodes.parquet` | All graph nodes with properties |
| `edges.parquet` | All relationships with properties |
| `positions.parquet` | 3D layout coordinates for visualization |
| `meta.json` | Dataset metadata and entity counts |

## Schema

### Nodes (`nodes.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Unique node identifier (UUID) |
| `primary_label` | string | Node type (Agent, Location, Event, etc.) |
| `name` | string | Display name |
| `description` | string | Foundational description |
| `tier` | string (nullable) | Graph Gravity tier: anchor / planet / asteroid |
| `episode_count` | int (nullable) | Number of distinct episodes entity appears in |
| `first_episode_seq` | int (nullable) | First appearance episode |
| `last_episode_seq` | int (nullable) | Last appearance episode |
| `properties_json` | string | Full node properties as JSON |

### Edges (`edges.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `source_node_id` | string | Source node UUID |
| `target_node_id` | string | Target node UUID |
| `relationship_type` | string | Relationship type (e.g., PARTICIPATED_AS) |
| `properties_json` | string | Edge properties as JSON |

### Positions (`positions.parquet`)

| Column | Type | Description |
|--------|------|-------------|
| `node_id` | string | Node UUID |
| `x`, `y`, `z` | float | 3D coordinates |
| `size` | float | Node size (Graph Gravity weighted) |
| `r`, `g`, `b` | int | RGB color by entity type |
| `community` | int | Louvain community index |
| `tier` | string (nullable) | Graph Gravity tier |

## Usage

```python
from datasets import load_dataset
import pandas as pd

# Load from HuggingFace
ds = load_dataset("brandburner/doctorwho-s04-narrative-kg")

# Or load parquet directly
nodes = pd.read_parquet("nodes.parquet")
edges = pd.read_parquet("edges.parquet")

# Filter to anchor characters
anchors = nodes[(nodes['primary_label'] == 'Agent') & (nodes['tier'] == 'anchor')]

# Build a NetworkX graph
import networkx as nx

G = nx.DiGraph()
for _, n in nodes.iterrows():
    G.add_node(n['node_id'], label=n['primary_label'], name=n['name'])
for _, e in edges.iterrows():
    G.add_edge(e['source_node_id'], e['target_node_id'], type=e['relationship_type'])
```

## Citation

```bibtex
@misc{fabula_doctorwho_s04,
  title = {Doctor Who Narrative Knowledge Graph},
  author = {Fabula Pipeline},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/datasets/brandburner/doctorwho-s04-narrative-kg}}
}
```

## License

CC BY-SA 4.0
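
## Example: Decoding `properties_json` and Labeling Edges

The schema above stores full node and edge properties as JSON strings and links edges to nodes by UUID. The following is a minimal sketch of how those two facts might be used together: it parses `properties_json` into dictionaries and joins node labels onto each edge to count (source label, relationship type, target label) patterns. The helper names `decode_properties` and `label_edges` are hypothetical, and the small DataFrames are made-up stand-ins that only mirror the documented columns; the JSON keys shown (`aliases`, `role`, `kind`) and the edge directions are assumptions, not values taken from the dataset.

```python
import json

import pandas as pd


def decode_properties(frame: pd.DataFrame) -> pd.Series:
    """Parse the `properties_json` string column into Python dicts."""
    return frame["properties_json"].apply(
        lambda raw: json.loads(raw) if isinstance(raw, str) and raw else {}
    )


def label_edges(nodes: pd.DataFrame, edges: pd.DataFrame) -> pd.DataFrame:
    """Attach source and target node labels to each edge via node_id joins."""
    labels = nodes[["node_id", "primary_label"]]
    out = edges.merge(
        labels.rename(
            columns={"node_id": "source_node_id", "primary_label": "source_label"}
        ),
        on="source_node_id",
        how="left",
    )
    out = out.merge(
        labels.rename(
            columns={"node_id": "target_node_id", "primary_label": "target_label"}
        ),
        on="target_node_id",
        how="left",
    )
    return out


# Hypothetical stand-in rows following the documented schema.
# With the real files, replace these with pd.read_parquet("nodes.parquet")
# and pd.read_parquet("edges.parquet").
nodes = pd.DataFrame(
    {
        "node_id": ["n1", "n2", "n3"],
        "primary_label": ["Agent", "Event", "Location"],
        "name": ["Example Agent", "Example Event", "Example Location"],
        "tier": ["anchor", None, "planet"],
        "properties_json": ['{"aliases": ["EA"]}', "{}", '{"kind": "interior"}'],
    }
)
edges = pd.DataFrame(
    {
        "source_node_id": ["n1", "n2"],
        "target_node_id": ["n2", "n3"],
        "relationship_type": ["PARTICIPATED_AS", "OCCURS_IN"],
        "properties_json": ['{"role": "witness"}', "{}"],
    }
)

# Decode the JSON strings once so later code can use plain dict access.
nodes["properties"] = decode_properties(nodes)

# Count how often each (source label, relationship, target label) pattern occurs.
labeled = label_edges(nodes, edges)
pattern_counts = (
    labeled.groupby(["source_label", "relationship_type", "target_label"])
    .size()
    .sort_values(ascending=False)
)
print(pattern_counts)
```

Left joins are used so that an edge whose endpoint is missing from `nodes` is kept with a null label rather than silently dropped, which makes such gaps easy to spot.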
Provider:
brandburner