five

maharshipandya/synthanime-openhermes2.5

收藏
Hugging Face2024-03-02 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/maharshipandya/synthanime-openhermes2.5
下载链接
链接失效反馈
官方服务:
资源简介:
--- license: apache-2.0 task_categories: - text2text-generation - text-generation pretty_name: Instruction-Response anime synopsis data size_categories: - 10K<n<100K --- # What is this dataset? This is a hybrid **Instruction-Response** dataset (scraped + synthetic) for anime synopses. Given around 10,000 scraped anime synopses, the user instructions for this dataset were generated using [Teknium's Openhermes 2.5](teknium/OpenHermes-2.5-Mistral-7B) - The `assistant` column consists of synopsis for different animes (which were previously scraped) - The `user` column consists of the instructions that "might have" generated the synopsis (synthetically generated) The goal of this dataset is: to be used as a part of a much larger dataset in order to fine tune open-source LLMs to follow instructions better and also to have some more knowledge about anime. **Using open-source LLMs to make open-source LLMs better** 🫶
提供机构:
maharshipandya
原始信息汇总

数据集概述

数据集名称

Instruction-Response anime synopsis data

许可证

apache-2.0

任务类别

  • 文本到文本生成
  • 文本生成

数据集大小

10K<n<100K

数据集描述

这是一个混合型的指令-响应数据集(爬取+合成),用于动漫剧情简介。该数据集包含约10,000条爬取的动漫剧情简介,用户指令是通过Tekniums Openhermes 2.5生成的。

  • assistant列包含不同动漫的剧情简介(之前爬取的)
  • user列包含可能生成这些剧情简介的指令(合成生成)

数据集目标

该数据集旨在作为更大数据集的一部分,用于微调开源大型语言模型(LLMs),以更好地遵循指令并增加对动漫的了解。

5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作