tokyotech-llm/lmsys-chat-1m-synth

Source: Hugging Face. Updated: 2026-02-20. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/tokyotech-llm/lmsys-chat-1m-synth
Description:
LMSYS-Chat-1M-Synth-Llama3.1-Ja-and-En is a Japanese and English conversation dataset derived from the LMSYS-Chat-1M dataset, with assistant responses generated by the Llama 3.1 405B Instruct model. The Japanese portion includes 453,889 user instructions translated by DeepL and 2,722,314 assistant responses generated by Llama 3.1 405B Instruct. The English portion includes 453,737 original user instructions and the same number of assistant responses. The dataset also includes preference scores annotated by Llama 3.1 70B Instruct. The use of the dataset is governed by the LMSYS-Chat-1M Dataset License Agreement and the LLAMA 3.1 COMMUNITY LICENSE AGREEMENT.
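
The counts above imply roughly six assistant responses per Japanese instruction (2,722,314 / 453,889 ≈ 6.0), each of which may carry a preference score. A minimal sketch of how such scores might be used to pick one response per instruction follows. The record layout and the field names `instruction`, `responses`, `text`, and `preference_score` are assumptions made for illustration, not the dataset's documented schema, and the sample values are invented.

```python
from typing import Any, Dict, List


def best_response(candidates: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the candidate with the highest preference score.

    Assumes each candidate is a dict with a numeric "preference_score"
    key (a hypothetical field name, not taken from the dataset card).
    """
    if not candidates:
        raise ValueError("no candidate responses")
    return max(candidates, key=lambda c: c["preference_score"])


# Toy record in an assumed layout; the real files may be organized differently.
example = {
    "instruction": "東京の観光名所を教えてください。",
    "responses": [
        {"text": "response A", "preference_score": 6.0},
        {"text": "response B", "preference_score": 9.0},
        {"text": "response C", "preference_score": 7.5},
    ],
}

chosen = best_response(example["responses"])
print(chosen["text"])
```

Keeping only the top-scored response is one simple option. Other uses of the scores, such as building chosen/rejected pairs, would need the actual schema from the dataset card.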
Provider:
tokyotech-llm