five

French 4-grams

arXiv, indexed 2025-09-30
Download link:
https://huggingface.co/ClassCat/gpt2-base-french
Overview:
This dataset contains n-grams (primarily 4-grams) extracted from books, each associated with a probability computed by a lightweight French GPT-2 model. To keep the generated text at an acceptable quality, the n-grams were filtered by probability thresholds. Under specific conditions, 5,052 solutions were generated for 5-grams and 56,652 for 4-grams. The dataset is intended for n-gram-based text generation in constraint programming.
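
The extract-then-filter step described above can be illustrated with a minimal Python sketch. It is not the dataset authors' pipeline: the function names, the toy sentence, and the 0.2 threshold are all hypothetical, and relative frequency stands in for the probability that the French GPT-2 model would assign to each n-gram.

```python
from collections import Counter


def extract_ngrams(tokens, n=4):
    """Return every contiguous n-gram in `tokens` as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(ngrams):
    """Stand-in scorer (assumption): share of occurrences per n-gram.

    The dataset uses probabilities from a language model instead.
    """
    counts = Counter(ngrams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def filter_by_probability(scored, threshold):
    """Keep only the n-grams whose score reaches the threshold."""
    return {gram: p for gram, p in scored.items() if p >= threshold}


# Toy corpus (hypothetical), already split into tokens.
tokens = "le chat dort sur le lit le chat dort sur le canapé".split()

ngrams = extract_ngrams(tokens, n=4)
scored = relative_frequency(ngrams)
kept = filter_by_probability(scored, threshold=0.2)

for gram, p in kept.items():
    print(" ".join(gram), round(p, 3))
```

Swapping `relative_frequency` for a scorer backed by an actual language model would leave the filtering step unchanged, since it only needs a mapping from n-gram to probability.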
Provider:
Hugging Face