
OpenHermes-2.5-zh

ModelScope community · updated 2026-05-23 · indexed 2024-06-08
Download link:
https://modelscope.cn/datasets/swift/OpenHermes-2.5-zh
Resource description:
# Dataset Card for OpenHermes-2.5-zh

This is a partial Chinese translation of the [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) dataset as well as [glaiveai/glaive-function-calling](https://huggingface.co/datasets/glaiveai/glaive-function-calling). Approximately 10% of the original dataset was translated using GPT-3.5, and low-quality translations were filtered out.

OpenHermes is a diverse, high-quality instruction-tuning dataset that primarily contains samples generated with GPT-4. This Chinese version can serve as a complement when fine-tuning LLMs, helping them handle Chinese instructions better.

## Data Structure

The dataset contains 91506 samples, each with the same structure as OpenHermes-2.5. Only the fields in `conversations` are translated; all other fields are kept the same as in the original dataset. The following is an example of a sample in the dataset:

```json
{
    "system_prompt": str,
    "id": str,
    "origin_idx": int, // the original index of the sample in OpenHermes-2.5
    "model_name": null,
    "avatarUrl": null,
    "topic": null,
    "custom_instruction": null,
    "views": null,
    "hash": null,
    "idx": null,
    "source": "glaiveai/glaive-function-calling-v2", // the OpenHermes-2.5 split the sample comes from
    "conversations": [
        {
            "from": "system",
            "value": "您是一个乐于助人的助手...",
            "weight": null
        },
        {
            "from": "human",
            "value": "使用Python编程语言编写一个函数...",
            "weight": null
        },
        {
            "from": "gpt",
            "value": "这是用于函数的Python代码...",
            "weight": null
        },
        //...
    ],
    "title": null,
    "category": null,
    "skip_prompt_formatting": null,
    "model": null,
    "language": null
}
```

## Citation

```bibtex
@misc{OpenHermes-2.5-zh,
    title = {OpenHermes 2.5-zh: A partial Chinese translation of OpenHermes-2.5},
    author = {Wenbo Pan},
    year = {2024},
    publisher = {HuggingFace},
    url = {https://huggingface.co/datasets/wenbopan/OpenHermes-2.5-zh}
}
```
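
As a rough illustration of the sample structure described under Data Structure, the sketch below flattens one sample's `conversations` list into role/content messages of the kind many chat fine-tuning pipelines expect. The `ROLE_MAP` mapping, the `to_messages` helper, and the `id`, `origin_idx`, and `system_prompt` values are assumptions made for this example and are not part of the dataset card; only the field names and the `from` tags (`system`, `human`, `gpt`) come from the sample shown above.

```python
from typing import Any

# Assumed mapping from the "from" tags shown in the card to chat roles.
# The card does not prescribe this mapping; adjust it to your training format.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def to_messages(sample: dict[str, Any]) -> list[dict[str, str]]:
    """Convert one sample's `conversations` list into role/content messages."""
    messages = []
    for turn in sample.get("conversations") or []:
        role = ROLE_MAP.get(turn["from"])
        if role is None:
            # Skip speaker tags that fall outside the assumed mapping.
            continue
        messages.append({"role": role, "content": turn["value"]})
    return messages


# Hand-written sample mirroring the structure in the card.
# "id", "origin_idx", and "system_prompt" hold hypothetical placeholder values.
sample = {
    "system_prompt": "",
    "id": "example-0",
    "origin_idx": 0,
    "source": "glaiveai/glaive-function-calling-v2",
    "conversations": [
        {"from": "system", "value": "您是一个乐于助人的助手...", "weight": None},
        {"from": "human", "value": "使用Python编程语言编写一个函数...", "weight": None},
        {"from": "gpt", "value": "这是用于函数的Python代码...", "weight": None},
    ],
}

messages = to_messages(sample)
for message in messages:
    print(f'{message["role"]}: {message["content"]}')
```

Because the untranslated fields are mostly `null` in the example, the helper reads only `conversations` and tolerates a missing or `None` list by returning an empty result.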

Provider:
maas
Created:
2024-06-06