
jjrussell10/storyscope

Source: Hugging Face | Updated: 2026-04-03 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/jjrussell10/storyscope
Resource description:
---
pretty_name: StoryScope
task_categories:
- text-classification
language:
- en
license: mit
size_categories:
- 10K<n<100K
configs:
- config_name: default
  data_files:
  - split: train
    path: stories_train.parquet
  - split: validation
    path: stories_val.parquet
  - split: test
    path: stories_test.parquet
  - split: dev
    path: stories_dev.parquet
---

# StoryScope

- `stories_train.parquet`, `stories_val.parquet`, `stories_test.parquet`, `stories_dev.parquet`: prompt metadata plus AI-generated stories from GPT-5.4, Claude Sonnet 4.6, DeepSeek V3.2, Kimi K2.5, and Gemini 3 Flash
- `storyscope_features.parquet`: 304 extracted narrative features for 61,575 story rows
- `taxonomy.json`: the 304-feature taxonomy spanning 10 narrative dimensions
- `models/`: trained XGBoost classifiers for binary human-vs-AI detection and 6-way authorship attribution

## Notes

- Human story text is excluded for copyright reasons.

## Story Split Schema

Columns:

- `prompt_id`
- `split`
- `title`
- `prompt`
- `human_author`
- `human_anthology`
- `human_word_count`
- `story_gpt`
- `story_deepseek`
- `story_kimi`
- `story_gemini`
- `story_claude`

Split sizes:

- train: 7,383 prompts
- validation: 1,405 prompts
- test: 1,384 prompts
- dev: 100 prompts

## Feature File Schema

`storyscope_features.parquet` contains:

- `prompt_id`
- `story_title`
- `source`
- 304 feature columns such as `REV_SUS_001`, `PER_POV_001`, and `SOC_REL_024`

Feature values are encoded as strings for categorical, ordinal, binary, and multi-select outputs, with scale values stored as numeric-style entries.
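
The split files store one row per prompt, with each model's story in its own column, while the feature file is described as one row per story. A minimal sketch of reshaping wide prompt rows into one record per story is shown below, using only the standard library. The helper names `to_long` and `feature_columns`, the example row, and the feature-name regular expression are illustrative assumptions; the pattern is inferred only from the three example column names in the card, and the derived `model` labels are not guaranteed to match the values of the `source` column.

```python
import re

# Story columns as listed under "Story Split Schema".
STORY_COLUMNS = [
    "story_gpt",
    "story_deepseek",
    "story_kimi",
    "story_gemini",
    "story_claude",
]

# Assumption: feature columns look like REV_SUS_001, PER_POV_001, SOC_REL_024.
# This pattern is inferred from those three examples, not from a documented rule.
FEATURE_NAME = re.compile(r"^[A-Z]{3}_[A-Z]{3}_\d{3}$")


def to_long(rows):
    """Turn wide prompt rows into one record per (prompt, model) story."""
    records = []
    for row in rows:
        for column in STORY_COLUMNS:
            text = row.get(column)
            if text is None:
                continue  # skip absent or null generations
            records.append(
                {
                    "prompt_id": row["prompt_id"],
                    "split": row.get("split"),
                    # Label derived from the column name, e.g. "story_gpt" -> "gpt".
                    "model": column[len("story_"):],
                    "story": text,
                }
            )
    return records


def feature_columns(columns):
    """Pick out the column names that match the assumed feature pattern."""
    return [name for name in columns if FEATURE_NAME.match(name)]


# Hypothetical row for illustration; real rows come from the parquet files.
example_rows = [
    {
        "prompt_id": "p1",
        "split": "dev",
        "story_gpt": "Once upon a time...",
        "story_claude": "The rain had stopped...",
    }
]

for record in to_long(example_rows):
    print(record["prompt_id"], record["model"], len(record["story"]))
```

With pandas installed, rows in this shape could be obtained from a split file via `pandas.read_parquet("stories_dev.parquet").to_dict("records")`, and the matching feature columns via `feature_columns(df.columns)` on the feature file. Checking the actual `source` values before joining the two tables is advisable, since the card does not specify them.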
Provider:
jjrussell10