
Llama-3-70b-battles

ModelScope community. Updated 2026-01-06. Indexed 2025-04-26.
Download link:
https://modelscope.cn/datasets/lmarena-ai/Llama-3-70b-battles
Description:
**Chatbot Arena user conversations between Llama-3-70b and GPT-4-1025, or between Llama-3-70b and Claude-3-Opus, with user preference votes.** Single turn. Excludes ties.

Used in the [Llama Data Analysis blog post](https://blog.lmarena.ai/blog/2024/llama3/) and in "VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models" ([Paper](https://arxiv.org/abs/2410.12851), [Code](https://github.com/lisadunlap/VibeCheck)).

## Citation

```
@article{dunlap_vibecheck,
  title={VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models},
  author={Lisa Dunlap and Krishna Mandal and Trevor Darrell and Jacob Steinhardt and Joseph E Gonzalez},
  journal={arXiv preprint arXiv:2410.12851},
  year={2024},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2410.12851},
}
```

```
@misc{llama3arena2024,
  title = {What’s up with Llama 3? Arena data analysis},
  url = {https://blog.lmarena.ai/blog/2024/llama3/},
  author = {Lisa Dunlap and Evan Frick and Tianle Li and Isaac Ong and Joseph E. Gonzalez and Wei-Lin Chiang},
  month = {May},
  year = {2024}
}
```

License: apache-2.0
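As a rough illustration of how pairwise preference votes like these might be summarized, the sketch below computes a per-opponent win rate for one model. The field names `model_a`, `model_b`, and `winner`, the model identifier strings, and the four toy records are all assumptions made for this example; they are not taken from the dataset, whose actual schema should be checked before adapting the code.

```python
from collections import defaultdict

# Hypothetical identifier for the model of interest (assumed, not from the dataset).
TARGET = "llama-3-70b"


def win_rates(battles, target=TARGET):
    """Return {opponent: fraction of decided battles won by `target`}.

    Each record is assumed to be a dict with the hypothetical keys
    `model_a`, `model_b`, and `winner` ("model_a" or "model_b").
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for battle in battles:
        if target not in (battle["model_a"], battle["model_b"]):
            continue  # battle does not involve the target model
        if battle["winner"] not in ("model_a", "model_b"):
            continue  # defensively skip ties or unknown labels
        if battle["model_a"] == target:
            opponent = battle["model_b"]
        else:
            opponent = battle["model_a"]
        totals[opponent] += 1
        # battle["winner"] names the key that holds the winning model.
        if battle[battle["winner"]] == target:
            wins[opponent] += 1
    return {opp: wins[opp] / totals[opp] for opp in totals}


# Toy records invented for illustration only.
sample = [
    {"model_a": "llama-3-70b", "model_b": "gpt-4-1025", "winner": "model_a"},
    {"model_a": "gpt-4-1025", "model_b": "llama-3-70b", "winner": "model_a"},
    {"model_a": "claude-3-opus", "model_b": "llama-3-70b", "winner": "model_b"},
    {"model_a": "llama-3-70b", "model_b": "claude-3-opus", "winner": "tie"},
]

rates = win_rates(sample)
print(rates)
```

The tie check is redundant if ties are already excluded, as the description states, but it keeps the function safe to reuse on unfiltered battle logs.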

Provider:
maas
Created:
2025-04-21