
andreabac3/MedQuaAD-Italian-Fauno-Baize

Source: Hugging Face. Updated 2023-04-08. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/andreabac3/MedQuaAD-Italian-Fauno-Baize
Resource description:
---
license: gpl-3.0
---

# MedQuaAD-Italian-Fauno-Baize

This dataset is an Italian translation of the MedQuaAD dataset presented by Baize's authors.

## Dataset Description

- **Paper:** https://arxiv.org/abs/2304.01196

### Languages

Italian

## Dataset Structure

### Data Instances

- Sentences: 46,867
- Average number of turns: 3.8
- Response lengths of each turn: 35.8

### Data Fields

topic, input

### Data Splits

Train

## Dataset Creation

### Source Data

#### Initial Data Collection and Normalization

https://github.com/project-baize/baize-chatbot

## Additional Information

### Dataset Curators

[Andrea Bacciu](https://andreabac3.github.io/), Dr. [Giovanni Trappolini](https://sites.google.com/view/giovannitrappolini), [Andrea Santilli](https://www.santilli.xyz/), and Professor [Fabrizio Silvestri](https://sites.google.com/diag.uniroma1.it/fabriziosilvestri/home).

### Licensing Information

This project is a derivative of Baize, and we adhere to the licensing constraints imposed by Baize's creators.

### Citation Information

```bibtex
@misc{fauno,
  author = {Andrea Bacciu, Giovanni Trappolini, Andrea Santilli, Fabrizio Silvestri},
  title = {Fauno: The Italian Large Language Model that will leave you senza parole!},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/andreabac3/Fauno-Italian-LLM}},
}
```

```bibtex
@article{xu2023baize,
  title={Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data},
  author={Xu, Canwen and Guo, Daya and Duan, Nan and McAuley, Julian},
  journal={arXiv preprint arXiv:2304.01196},
  year={2023}
}
```
Provided by:
andreabac3
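Summary of Original Information

MedQuaAD-Italian-Fauno-Baize Dataset Overview

Dataset Description

  • Language: Italian
  • Data structure:
    • Data instances: 46,867 sentences
    • Average number of turns per dialogue: 3.8
    • Average response length per turn: 35.8
  • Data fields:
    • topic
    • input
  • Data splits:
    • Train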
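
The fields and statistics listed above can be explored with a short script. The following is a minimal sketch, not the curators' own tooling. It assumes that each `input` value is a transcript using the `[|Human|]` and `[|AI|]` turn markers of the upstream Baize data, that a "turn" means one AI response, and that response length is counted in whitespace-separated words; the card itself confirms none of these. The sample record is invented for illustration, and `split_turns`, `dialogue_stats`, and `load_train_split` are hypothetical helper names.

```python
import re
from statistics import mean

# Assumed turn markers, following the upstream Baize transcript format.
# Verify against a real record before relying on them.
TURN_MARKER = re.compile(r"\[\|(Human|AI)\|\]")


def split_turns(transcript: str) -> list[tuple[str, str]]:
    """Split one `input` transcript into (speaker, text) pairs."""
    parts = TURN_MARKER.split(transcript)
    # parts = [preamble, speaker1, text1, speaker2, text2, ...]
    pairs = zip(parts[1::2], parts[2::2])
    # Drop empty utterances such as a trailing, unanswered "[|Human|] ".
    return [(speaker, text.strip()) for speaker, text in pairs if text.strip()]


def dialogue_stats(transcript: str) -> dict[str, float]:
    """Count AI responses and their mean length in words (assumed units)."""
    turns = split_turns(transcript)
    responses = [text for speaker, text in turns if speaker == "AI"]
    return {
        "turns": len(responses),
        "mean_response_words": (
            mean(len(r.split()) for r in responses) if responses else 0.0
        ),
    }


def load_train_split():
    """Fetch the Train split from the Hub (needs `pip install datasets` and network)."""
    from datasets import load_dataset

    return load_dataset("andreabac3/MedQuaAD-Italian-Fauno-Baize", split="train")


# Hypothetical record, invented for illustration only (not taken from the dataset).
sample = {
    "topic": "Che cos'è il glaucoma?",
    "input": (
        "La conversazione tra umano e assistente AI.\n"
        "[|Human|] Che cos'è il glaucoma?\n"
        "[|AI|] È una malattia che danneggia il nervo ottico.\n"
        "[|Human|] Si può curare?\n"
        "[|AI|] Si può trattare, ma non guarire.\n"
        "[|Human|] "
    ),
}

stats = dialogue_stats(sample["input"])
print(stats)
```

To run the same statistics over the real data, call `load_train_split()` and apply `dialogue_stats` to each record's `input` field, after confirming that the markers match the actual records.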

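Dataset Creation

Dataset Licensing Information

  • License: GPL-3.0
  • Licensing notice: This project is a derivative of Baize and adheres to the licensing constraints imposed by Baize's creators.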