Edgerunners/Phoebus-86k
Hugging Face, updated 2024-05-17, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/Edgerunners/Phoebus-86k
Resource description:
---
license: cc-by-nc-4.0
---
Raw, human-written erotica stories.
A very aggressively cleaned version of Phoebus-127k, filtered to try to remove:
1. product spambot posts (the website was spammed a few times, usually with HTML tags)
2. warning-only pages ("Warning" and nothing else)
3. edits (authors adding editorial-history footnotes)
4. Patreon and similar callouts (authors asking for donations)
5. author notes, summaries, and tagging
Some of these still got through. Classifier used: Edgerunners/Phoebus-Spam-Classifier-v2
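The removal categories above can be illustrated with a small rule-based pre-filter. This is only a sketch under stated assumptions: the dataset itself was cleaned with the classifier named above, not with these rules, and the regular expressions, the `flag_reasons` and `keep` helpers, and the reason labels are all hypothetical names chosen for illustration.

```python
import re
from typing import List

# Hypothetical heuristics, NOT the method used to build the dataset
# (that was the Edgerunners/Phoebus-Spam-Classifier-v2 classifier).
HTML_TAG = re.compile(r"</?[a-zA-Z][^>]*>")  # spam often carried HTML tags
DONATION = re.compile(r"\b(patreon|ko-fi|buy me a coffee)\b", re.IGNORECASE)
EDIT_NOTE = re.compile(r"^\s*(edit(ed)?|update)\s*[:\-]",
                       re.IGNORECASE | re.MULTILINE)


def flag_reasons(text: str) -> List[str]:
    """Return the (assumed) removal categories that a text matches."""
    reasons = []
    if HTML_TAG.search(text):
        reasons.append("html")
    # A page consisting of the word "Warning" and nothing else.
    if text.strip().lower().rstrip(".!:") == "warning":
        reasons.append("warning_only")
    if EDIT_NOTE.search(text):
        reasons.append("edit_note")
    if DONATION.search(text):
        reasons.append("donation")
    return reasons


def keep(text: str) -> bool:
    """Keep a text only if no heuristic flags it."""
    return not flag_reasons(text)
```

Rules like these are cheap but brittle, for example a story that legitimately mentions an edit would be flagged, which is one reason a trained classifier may be preferred for the final pass.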
---
1. The Dataset is provided "AS IS" and "AS AVAILABLE" without warranty of any kind, express or implied, including but not limited to warranties of merchantability, fitness for a particular purpose, title, or non-infringement.
2. The Provider disclaims all liability for any damages or losses resulting from the use or misuse of the Dataset, including but not limited to any damages or losses arising from the use of the Dataset for purposes other than those intended by the Provider.
3. The Provider does not endorse or condone the use of the Dataset for any purpose that violates applicable laws, regulations, or ethical standards.
4. The Provider does not warrant that the Dataset will meet your specific requirements or that it will be error-free or that it will function without interruption.
5. You assume all risks associated with the use of the Dataset, including but not limited to any loss of data, loss of business, or damage to your reputation.
Provider:
Edgerunners
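Summary of original information
Dataset overview
Dataset contents
- Type: raw, human-written erotica stories
- Processing: an aggressively cleaned version of Phoebus-127k, with mainly the following removed:
  - Product spam (usually containing HTML tags)
  - Warning-only pages
  - Editorial-history footnotes
  - Patreon and similar donation requests
  - Author notes, summaries, and tags
Dataset cleaning
- Cleaning tool: the Edgerunners/Phoebus-Spam-Classifier-v2 classifier
License
- License: CC-BY-NC-4.0
Disclaimer
- The dataset is provided "as is" and "as available", without warranty of any kind, express or implied.
- The provider accepts no liability for any damages or losses resulting from the use or misuse of the dataset.
- The provider does not endorse or condone any use of the dataset that violates applicable laws, regulations, or ethical standards.
- The provider does not warrant that the dataset will meet your specific requirements, be error-free, or function without interruption.
- Users assume all risks associated with using the dataset, including but not limited to loss of data, loss of business, or damage to reputation.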



