
croissantllm/croissant_dataset_no_web_data

Source: Hugging Face. Updated 2024-02-15. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/croissantllm/croissant_dataset_no_web_data
Resource description:
---
task_categories:
- translation
- text-generation
- text2text-generation
- fill-mask
language:
- fr
- en
size_categories:
- 10B<n<100B
---

# CroissantLLM: A Truly Bilingual French-English Language Model

## Dataset

Resources are currently being uploaded!

https://arxiv.org/abs/2402.00786

## Licenses

Data redistributed here is subject to the original license under which it was collected. All license information is detailed in the `Data` section of the Technical report.

## Citation

```
@misc{faysse2024croissantllm,
      title={CroissantLLM: A Truly Bilingual French-English Language Model},
      author={Manuel Faysse and Patrick Fernandes and Nuno M. Guerreiro and António Loison and Duarte M. Alves and Caio Corro and Nicolas Boizard and João Alves and Ricardo Rei and Pedro H. Martins and Antoni Bigata Casademunt and François Yvon and André F. T. Martins and Gautier Viaud and Céline Hudelot and Pierre Colombo},
      year={2024},
      eprint={2402.00786},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
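
As a rough illustration of how the repository identifier in this listing might be used, the sketch below builds the dataset page URL and, optionally, streams examples with the Hugging Face `datasets` library. The helper names `dataset_url` and `stream_examples` are hypothetical, and the `train` split and default configuration are assumptions: the card does not document splits or configurations, and it notes that resources were still being uploaded.

```python
from typing import Any, Iterator, Optional

# Repository identifier taken from the listing.
REPO_ID = "croissantllm/croissant_dataset_no_web_data"


def dataset_url(repo_id: str, endpoint: str = "https://hf-mirror.com") -> str:
    """Build the dataset page URL for a given hub endpoint (hypothetical helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def stream_examples(
    repo_id: str = REPO_ID,
    config: Optional[str] = None,  # assumption: the card lists no configuration names
    split: str = "train",          # assumption: the card lists no split names
) -> Iterator[Any]:
    """Lazily iterate over examples without downloading the full dataset."""
    # Imported here so dataset_url() works even when `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True yields examples on demand, which matters for a dataset
    # in the 10B<n<100B size category.
    return iter(load_dataset(repo_id, name=config, split=split, streaming=True))


if __name__ == "__main__":
    print(dataset_url(REPO_ID))
```

Calling `stream_examples()` requires network access and the `datasets` package. If the repository exposes several configurations, a `config` value would need to be supplied after checking the dataset page.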
Provided by:
croissantllm
Summary of original information

CroissantLLM: A Truly Bilingual French-English Language Model

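Dataset overview

Task categories

  • Translation
  • Text generation
  • Text-to-text generation
  • Fill-mask

Languages

  • French
  • English

Dataset size

  • 10B<n<100B

Licenses

Data redistributed here is subject to the original license under which it was collected. All license information is detailed in the Data section of the technical report.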

