Poetry-Foundation-Poems
Source: ModelScope community | Updated 2025-11-27 | Indexed 2025-05-17
Download link:
https://modelscope.cn/datasets/suayptalha/Poetry-Foundation-Poems
Description:
From: https://www.kaggle.com/datasets/tgdivy/poetry-foundation-poems
**Poetry Foundation Poems Dataset**
**Overview**
This dataset contains approximately 13,900 poems sourced from the Poetry Foundation website. Each entry includes the poem's title, author, and associated tags (if available). It is a resource for exploring poetry, analyzing thematic trends, or building applications such as poem generators.
**Dataset Structure**
The dataset consists of the following columns:
1. Title: The title of the poem.
2. Author: The name of the poem’s author.
3. Tags: The thematic tags or categories associated with the poems.
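A minimal sketch of checking this column layout with Pandas, assuming the data is available as a CSV whose header uses exactly the column names listed above. The sample rows are invented placeholders, not records from the dataset, and `check_columns` is a hypothetical helper.

```python
import io

import pandas as pd

# Assumed header spelling; adjust if the actual file differs.
EXPECTED_COLUMNS = ["Title", "Author", "Tags"]

# Placeholder rows for illustration only (not taken from the dataset).
SAMPLE_CSV = """Title,Author,Tags
Sample Poem A,Author One,"Nature,Love"
Sample Poem B,Author Two,
"""


def check_columns(df: pd.DataFrame) -> list[str]:
    """Return the expected columns that are missing from df."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
missing = check_columns(df)
print("missing columns:", missing)
print("rows:", len(df))
# Tags are optional, so empty cells are read as NaN.
print("poems without tags:", int(df["Tags"].isna().sum()))
```

With the real file, the `io.StringIO(...)` argument would be replaced by the path to the downloaded CSV.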
**Dataset Highlights**
• Size: The dataset includes 13.9k rows, with each row representing an individual poem.
• Diversity: Poems span a wide range of topics and authors, making it a rich resource for literary and thematic exploration.
• Tags: The tags provide a structured way to categorize and filter poems by themes, enhancing the dataset’s usability for research and creative projects.
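Filtering by theme could look like the sketch below. It assumes each Tags cell holds a comma-separated string, which the description above does not confirm; `split_tags` and `filter_by_tag` are hypothetical helpers, and the rows are invented placeholders.

```python
import pandas as pd

# Placeholder rows for illustration only (not taken from the dataset).
poems = pd.DataFrame(
    {
        "Title": ["Sample Poem A", "Sample Poem B", "Sample Poem C"],
        "Author": ["Author One", "Author Two", "Author Three"],
        "Tags": ["Nature,Love", None, "love, Time"],
    }
)


def split_tags(value) -> list[str]:
    """Split a comma-separated tag string into clean, lowercase tags."""
    if pd.isna(value):
        return []  # poems without tags match nothing
    return [tag.strip().lower() for tag in str(value).split(",") if tag.strip()]


def filter_by_tag(df: pd.DataFrame, tag: str) -> pd.DataFrame:
    """Return the rows whose Tags cell contains the given tag (case-insensitive)."""
    wanted = tag.strip().lower()
    mask = df["Tags"].apply(lambda value: wanted in split_tags(value))
    return df[mask]


love_poems = filter_by_tag(poems, "Love")
print(love_poems["Title"].tolist())
```

Matching on whole, normalized tags rather than substrings avoids false hits, such as a search for "art" matching a tag like "heartache".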
**Use Cases**
1. Poem Generation:
Train models to generate poems based on user-inputted topics or tags.
2. Thematic and Sentiment Analysis:
Analyze trends in poetic themes, sentiments, or styles over time.
3. NLP Tasks:
Use the dataset for text classification, clustering, or other natural language processing tasks.
4. Educational Resources:
Develop tools or applications for poetry analysis, learning, or teaching.
5. Visualizations:
Create word clouds or charts using the tags to identify common themes in poetry.
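As a starting point for the visualization use case, the sketch below counts tag frequencies. It assumes comma-separated tag strings (an assumption, not something the dataset description states); `count_tags` is a hypothetical helper and the sample values are invented placeholders.

```python
from collections import Counter

import pandas as pd

# Placeholder tag strings for illustration only (not taken from the dataset).
tags = pd.Series(["Nature,Love", None, "Love,Time", "Nature, Love"], name="Tags")


def count_tags(tag_column: pd.Series) -> Counter:
    """Count how often each tag appears, assuming comma-separated tag strings."""
    counts = Counter()
    for value in tag_column.dropna():  # skip poems without tags
        counts.update(t.strip() for t in str(value).split(",") if t.strip())
    return counts


tag_counts = count_tags(tags)
for tag, n in tag_counts.most_common(3):
    print(f"{tag}: {n}")
```

The resulting counts can then be passed to a plotting library, for example as the heights of a Matplotlib bar chart or as the frequency mapping for a word cloud.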
**Technical Details**
• Size: Approximately 13,900 rows of data.
• Format: Typically provided in CSV or JSON format.
• Dependencies:
  • Pandas for data manipulation.
  • NLTK or spaCy for natural language processing.
  • Matplotlib or WordCloud for creating visualizations.
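Because the format may be either CSV or JSON, a loader can dispatch on the file extension. The sketch below is one way to do that with Pandas; `load_poems` and the file names are hypothetical, the JSON layout (an array of records) is an assumption, and the single row is an invented placeholder.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_poems(path: str) -> pd.DataFrame:
    """Load the dataset from a CSV or JSON file, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".json":
        # Assumes a JSON array of records; other layouts need a different orient.
        return pd.read_json(path, orient="records")
    raise ValueError(f"Unsupported file format: {suffix!r}")


# Round-trip a placeholder row (not real dataset content) through both formats.
sample = pd.DataFrame(
    {"Title": ["Sample Poem A"], "Author": ["Author One"], "Tags": ["Nature,Love"]}
)
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "poems.csv"    # hypothetical file name
    json_path = Path(tmp) / "poems.json"  # hypothetical file name
    sample.to_csv(csv_path, index=False)
    sample.to_json(json_path, orient="records")
    from_csv = load_poems(str(csv_path))
    from_json = load_poems(str(json_path))

print(from_csv.columns.tolist())
print(from_json.shape)
```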
**Licensing**
This dataset is under **GNU Affero General Public License v3.0**.
**Acknowledgments**
The dataset was compiled to provide researchers, developers, and enthusiasts with a structured collection of poetry for creative and analytical purposes. All credits go to the original authors and the Poetry Foundation for their work in making these poems accessible.
<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
Provider:
maas
Created:
2025-05-16



