croissantllm/croissant_dataset_no_web_data
Source: Hugging Face. Updated 2024-02-15; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/croissantllm/croissant_dataset_no_web_data
Resource description:
---
task_categories:
- translation
- text-generation
- text2text-generation
- fill-mask
language:
- fr
- en
size_categories:
- 10B<n<100B
---
# CroissantLLM: A Truly Bilingual French-English Language Model
## Dataset
Resources are currently being uploaded.
Technical report: https://arxiv.org/abs/2402.00786
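
Because the dataset falls in the `10B<n<100B` size category, it may be more practical to inspect a few records than to download everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes at least one configuration with a `train` split; neither the configuration names nor the split name are stated in this card, and the helper `preview` is a hypothetical name introduced for illustration.

```python
from itertools import islice

# Repository identifier taken from the title of this card.
REPO_ID = "croissantllm/croissant_dataset_no_web_data"


def preview(repo_id=REPO_ID, config=None, split="train", n=3):
    """Return the first n records of one configuration without a full download."""
    # Imported lazily so this module can be loaded without the library installed.
    from datasets import get_dataset_config_names, load_dataset

    if config is None:
        # Assumption: fall back to the first configuration the repository lists.
        config = get_dataset_config_names(repo_id)[0]

    # streaming=True iterates over the remote files instead of downloading them all.
    # Assumption: the chosen configuration has a split named by `split`.
    stream = load_dataset(repo_id, config, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `preview(n=3)` would return a list of up to three records as dictionaries. If the upload is still in progress, the listing of configurations or the split name may differ from what this sketch assumes.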
## Licenses
Data redistributed here is subject to the original license under which it was collected. All license information is detailed in the `Data` section of the Technical report.
## Citation
```
@misc{faysse2024croissantllm,
title={CroissantLLM: A Truly Bilingual French-English Language Model},
author={Manuel Faysse and Patrick Fernandes and Nuno M. Guerreiro and António Loison and Duarte M. Alves and Caio Corro and Nicolas Boizard and João Alves and Ricardo Rei and Pedro H. Martins and Antoni Bigata Casademunt and François Yvon and André F. T. Martins and Gautier Viaud and Céline Hudelot and Pierre Colombo},
year={2024},
eprint={2402.00786},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
croissantllm



