Bonito
Source: arXiv. Updated: 2024-02-28. Indexed: 2024-06-21.
Download link:
https://github.com/BatsResearch/bonito
Description:
Bonito is a large-scale dataset created by the Department of Computer Science at Brown University. It contains 1.65 million examples for training conditional task generation models. The dataset was built by remixing existing instruction-tuning datasets into meta-templates: each training example takes unannotated text and a task attribute as input and produces an instruction and its response as output. It is intended mainly to improve the zero-shot task adaptation of large language models (LLMs), particularly on users' specialized, private data. Training on synthetic tasks generated with Bonito helps a model better understand and carry out domain-specific tasks, for example in medicine and law, which extends the benefits of LLMs to a broader range of users.
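The input/output layout described above can be illustrated with a minimal Python sketch. It is not the dataset's actual schema or the authors' code: the `SourceExample` fields, the `to_meta_example` helper, the `task type:` / `context:` labels, and the ` ||| ` separator are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceExample:
    """One example from an existing instruction-tuning dataset (hypothetical schema)."""
    context: str       # the unannotated passage
    task_type: str     # the task attribute, e.g. "question answering"
    instruction: str   # the instruction originally written for this passage
    response: str      # the reference answer to that instruction


def to_meta_example(ex: SourceExample, sep: str = " ||| ") -> dict:
    """Remix a source example into an (input, output) pair for task generation.

    Input:  task attribute + unannotated text.
    Output: instruction + response, joined by a placeholder separator.
    """
    model_input = f"task type: {ex.task_type}\ncontext: {ex.context}"
    model_output = f"{ex.instruction}{sep}{ex.response}"
    return {"input": model_input, "output": model_output}


example = SourceExample(
    context="Aspirin inhibits platelet aggregation.",
    task_type="question answering",
    instruction="What effect does aspirin have on platelets?",
    response="It inhibits their aggregation.",
)
pair = to_meta_example(example)
```

A model trained on pairs of this shape learns to produce a task (instruction plus response) from raw text and a requested task type, which is the role the description assigns to a conditional task generation model.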
Provided by:
Department of Computer Science, Brown University
Created:
2024-02-28



