
v3xlrm1nOwo1/KaidanNihonbunka

Hugging Face, updated 2024-04-15, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/v3xlrm1nOwo1/KaidanNihonbunka
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
- text2text-generation
language:
- ja
tags:
- art
- folklore
- Hyakumonogatari
- Nihonbunka
pretty_name: 'Kaidan Nihonbunka: A Journey Through Hyakumonogatari''s Ghostly Tales'
size_categories:
- 1K<n<10K
---

# *Kaidan Nihonbunka: A Journey Through Hyakumonogatari's Ghostly Tales*

> Welcome to the Kaidan Nihonbunka Dataset

<div align="center">
  <picture>
    <source srcset="https://cdn-uploads.huggingface.co/production/uploads/64af7c627ab7586520ed8688/VbXOBJgHwWFvJHsXTyBUQ.jpeg" media="(prefers-color-scheme: dark)" />
    <source srcset="https://cdn-uploads.huggingface.co/production/uploads/64af7c627ab7586520ed8688/VbXOBJgHwWFvJHsXTyBUQ.jpeg" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" />
    <img src="https://cdn-uploads.huggingface.co/production/uploads/64af7c627ab7586520ed8688/VbXOBJgHwWFvJHsXTyBUQ.jpeg" width="100%" height="350px" />
  </picture>
</div>

## About

The name `Kaidan Nihonbunka` is written `怪談日本文化` in Japanese:

- `怪談 (Kaidan, also romanized Kwaidan)`: Ghost story or supernatural tale.
- `日本文化 (Nihonbunka)`: Japanese culture.

## Overview

The `Kaidan Nihonbunka` dataset is a collection of Japanese ghost stories, known as "kaidan", associated with the traditional Japanese ritual of Hyakumonogatari. It contains approximately 8,000 rows (8,559 in the `train` split shown under Usage). Each row holds a story's old name, a new name generated by GPT-4, the text of the story, and a URL for additional information or sources.

The code for this dataset is available on GitHub: <a href="https://github.com/v3xlrm1nOwo1/KaidanNihonbunka">v3xlrm1nOwo1</a>.

## Data Format

### The dataset is provided in two formats: `Parquet` and `Pickle`

These formats and fields provide flexibility for different use cases, allowing researchers and data scientists to work with the dataset using their preferred tools and programming languages.

1. **Parquet File**: Contains structured data in a columnar format, suitable for data analysis and processing with tools like Apache Spark.
2. **Pickle File**: Contains a serialized Python object, allowing for easy loading and manipulation of the dataset in Python environments.

### Dataset Fields

Each entry in the dataset is represented by a row with the following fields:

| Field      | Description                                                                                               |
|------------|-----------------------------------------------------------------------------------------------------------|
| `Old Name` | The old name or previous designation of the ghost story.                                                  |
| `New Name` | Generated by GPT-4, this column contains the new name or a modernized version of the ghost story's title. |
| `Kaidan`   | The text or content of the ghost story.                                                                   |
| `URL`      | Contains URLs related to the ghost story, such as links to additional information or sources.             |

When the dataset is loaded with `datasets`, these columns appear in lowercase (`old name`, `new name`, `kaidan`, `url`), as in the output under Usage.

## Usage

Researchers, data scientists, and enthusiasts interested in Japanese folklore, ghost stories, or cultural rituals like Hyakumonogatari can use this dataset for purposes such as:

- Analyzing themes and patterns in ghost stories.
- Building machine learning models for story generation or classification.
- Exploring connections between traditional rituals and storytelling.

```py
import datasets

# Load the dataset from the Hugging Face Hub
dataset = datasets.load_dataset('v3xlrm1nOwo1/KaidanNihonbunka')

# Print the available splits, column names, and row counts
print(dataset)
```

```py
DatasetDict({
    train: Dataset({
        features: ['old name', 'new name', 'kaidan', 'url'],
        num_rows: 8559
    })
})
```

## Acknowledgments

We would like to acknowledge the creators of the original ghost stories and the individuals or sources that contributed to compiling this dataset. Without their efforts, this collection would not be possible.

## License

This dataset is distributed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0), allowing for flexible usage and modification while ensuring proper attribution and adherence to copyright laws.

> **_NOTE:_** Contributions to the project are welcome; please contribute directly. If you have any comments, advice, or job opportunities, or would like me to contribute to a project, please contact me at <a href='mailto:v3xlrm1nOwo1@gmail.com' target='blank'>v3xlrm1nOwo1@gmail.com</a>.
Provider:
v3xlrm1nOwo1
Summary of original information

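Kaidan Nihonbunka: A Journey Through Hyakumonogatari's Ghostly Tales

Dataset overview

The Kaidan Nihonbunka dataset is a collection of Japanese ghost stories associated with the traditional Japanese ritual of Hyakumonogatari. It contains approximately 8,000 ghost stories, each with its old name, a new name generated by GPT-4, the text of the story, and a URL for related information.

Data format

The dataset is provided in two formats: Parquet and Pickle.

  1. Parquet file: stores structured data in a columnar format, suitable for data analysis and processing with tools such as Apache Spark.
  2. Pickle file: contains a serialized Python object, convenient for loading and manipulating the dataset in Python environments.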
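As a rough illustration of working with the two formats, the sketch below dispatches on the file extension. The helper name `load_rows`, the file names, and the toy row are assumptions made for this example; the actual file names in the repository and the type of the pickled object are not stated in the description above. The Parquet branch assumes pandas with a Parquet engine such as pyarrow is installed.

```python
import pickle
import tempfile
from pathlib import Path


def load_rows(path):
    """Load dataset rows from a Pickle or Parquet file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in {".pkl", ".pickle"}:
        # Only unpickle files from a source you trust:
        # unpickling can execute arbitrary code.
        with path.open("rb") as fh:
            return pickle.load(fh)
    if suffix == ".parquet":
        # Imported lazily so the Pickle path works without pandas.
        import pandas as pd
        return pd.read_parquet(path)
    raise ValueError(f"unsupported file type: {suffix!r}")


# Round-trip demo with a placeholder row (not taken from the dataset).
toy_rows = [
    {
        "old name": "placeholder old title",
        "new name": "placeholder new title",
        "kaidan": "placeholder story text",
        "url": "https://example.com/story",
    }
]

with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.pkl"
    with toy_path.open("wb") as fh:
        pickle.dump(toy_rows, fh)
    loaded = load_rows(toy_path)

print(loaded == toy_rows)
```

If the published Pickle file holds a pandas object rather than plain Python containers, pandas would also need to be installed for the Pickle branch to succeed.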

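Data fields

Each entry contains the following fields:

| Field    | Description                                                                        |
|----------|------------------------------------------------------------------------------------|
| Old Name | The old name or previous designation of the ghost story.                           |
| New Name | A new name, or modernized version of the story's title, generated by GPT-4.        |
| Kaidan   | The text or content of the ghost story.                                            |
| URL      | URLs related to the ghost story, such as links to additional information or sources. |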
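The field table spells the columns `Old Name`, `New Name`, `Kaidan`, and `URL`, while the `datasets` printout in the README lists them in lowercase (`old name`, `new name`, `kaidan`, `url`). A minimal sketch of a record type that accepts either spelling is shown below; `KaidanRecord` and `to_record` are hypothetical names introduced for illustration, and the sample values are placeholders rather than real rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KaidanRecord:
    """One row of the dataset (hypothetical container for illustration)."""
    old_name: str
    new_name: str
    kaidan: str
    url: str


def to_record(row):
    """Build a KaidanRecord from a row mapping, ignoring key case and spaces."""
    # "Old Name" and "old name" both become "old_name".
    norm = {key.strip().lower().replace(" ", "_"): value for key, value in row.items()}
    return KaidanRecord(
        old_name=norm["old_name"],
        new_name=norm["new_name"],
        kaidan=norm["kaidan"],
        url=norm["url"],
    )


# Placeholder rows using the two spellings seen in the description.
table_style = {"Old Name": "a", "New Name": "b", "Kaidan": "c", "URL": "d"}
loaded_style = {"old name": "a", "new name": "b", "kaidan": "c", "url": "d"}

print(to_record(table_style) == to_record(loaded_style))
```

A frozen dataclass keeps each record immutable and hashable, which is convenient when deduplicating stories by title or URL.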

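Use cases

The dataset is intended for researchers, data scientists, and enthusiasts interested in Japanese folklore, ghost stories, or cultural rituals such as Hyakumonogatari. It can be used to:

  • Analyze themes and patterns in ghost stories.
  • Build machine learning models for story generation or classification.
  • Explore connections between traditional rituals and storytelling.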
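One simple starting point for the theme analysis mentioned above is to count how many stories mention each motif keyword. The sketch below uses plain substring matching, which sidesteps Japanese word segmentation at the cost of precision. The function name, the keyword list, and the three sample sentences are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def keyword_document_frequency(stories, keywords):
    """Count, for each keyword, the number of stories that contain it."""
    counts = Counter()
    for text in stories:
        for keyword in keywords:
            if keyword in text:
                counts[keyword] += 1
    return counts


# Invented placeholder sentences, not real dataset rows.
toy_stories = [
    "夜の寺に幽霊が出た。",
    "井戸から声が聞こえた。",
    "幽霊は井戸のそばに立っていた。",
]
toy_keywords = ["幽霊", "井戸", "狐"]

frequency = keyword_document_frequency(toy_stories, toy_keywords)
for keyword in toy_keywords:
    print(keyword, frequency[keyword])
```

On real data, the `kaidan` column would supply the story texts, and a tokenizer-based approach would likely give cleaner counts than substring matching.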

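License

The dataset is released under the Apache License 2.0, which allows flexible use and modification while ensuring proper attribution and compliance with copyright law.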

The content above was collected and summarized by 遇见数据集.