maharshipandya/synthanime-openhermes2.5
收藏Hugging Face2024-03-02 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/maharshipandya/synthanime-openhermes2.5
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
task_categories:
- text2text-generation
- text-generation
pretty_name: Instruction-Response anime synopsis data
size_categories:
- 10K<n<100K
---
# What is this dataset?
This is a hybrid **Instruction-Response** dataset (scraped + synthetic) for anime synopses.
Given around 10,000 scraped anime synopses, the user instructions for this dataset were generated using [Teknium's Openhermes 2.5](teknium/OpenHermes-2.5-Mistral-7B)
- The `assistant` column consists of synopsis for different animes (which were previously scraped)
- The `user` column consists of the instructions that "might have" generated the synopsis (synthetically generated)
The goal of this dataset is: to be used as a part of a much larger dataset in order to fine tune open-source LLMs to follow instructions better and also to have some more knowledge about anime.
**Using open-source LLMs to make open-source LLMs better** 🫶
提供机构:
maharshipandya
原始信息汇总
数据集概述
数据集名称
Instruction-Response anime synopsis data
许可证
apache-2.0
任务类别
- 文本到文本生成
- 文本生成
数据集大小
10K<n<100K
数据集描述
这是一个混合型的指令-响应数据集(爬取+合成),用于动漫剧情简介。该数据集包含约10,000条爬取的动漫剧情简介,用户指令是通过Tekniums Openhermes 2.5生成的。
assistant列包含不同动漫的剧情简介(之前爬取的)user列包含可能生成这些剧情简介的指令(合成生成)
数据集目标
该数据集旨在作为更大数据集的一部分,用于微调开源大型语言模型(LLMs),以更好地遵循指令并增加对动漫的了解。



