French 4-grams
Source: arXiv (indexed 2025-09-30)
Download link:
https://huggingface.co/ClassCat/gpt2-base-french
Description:
This dataset contains n-grams (primarily 4-grams) extracted from books, each paired with a probability computed by a lightweight French GPT-2 model. To maintain the quality of generated text, the n-grams were filtered using probability thresholds. Under specific conditions, 5,052 solutions were generated for 5-grams and 56,652 for 4-grams. The dataset is intended for text generation with n-grams in constraint programming.
Provider:
Hugging Face
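
The description mentions extracting n-grams and keeping only those whose probability passes a threshold. The following is a minimal sketch of that idea, not the dataset authors' pipeline. The function names (`extract_ngrams`, `relative_frequency_scores`, `filter_by_threshold`), the toy sentence, and the threshold value are all hypothetical. In particular, the score used here is the n-gram's relative frequency in the toy corpus, a stand-in for the probability that the French GPT-2 model would assign.

```python
from collections import Counter
from typing import Dict, List, Tuple

NGram = Tuple[str, ...]


def extract_ngrams(tokens: List[str], n: int = 4) -> List[NGram]:
    """Return every contiguous n-gram of the token list, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency_scores(ngrams: List[NGram]) -> Dict[NGram, float]:
    """Stand-in score: share of occurrences of each n-gram in the corpus.

    Assumption: the real dataset uses probabilities from a GPT-2 model;
    relative frequency is used here only to keep the sketch self-contained.
    """
    counts = Counter(ngrams)
    total = len(ngrams)
    return {gram: count / total for gram, count in counts.items()}


def filter_by_threshold(scores: Dict[NGram, float],
                        threshold: float) -> Dict[NGram, float]:
    """Keep only the n-grams whose score is at least the threshold."""
    return {gram: p for gram, p in scores.items() if p >= threshold}


# Toy corpus (hypothetical), whitespace-tokenized.
tokens = "le chat dort sur le lit et le chat dort sur le canapé".split()
four_grams = extract_ngrams(tokens, n=4)
kept = filter_by_threshold(relative_frequency_scores(four_grams),
                           threshold=0.15)

for gram, p in sorted(kept.items()):
    print(" ".join(gram), p)
```

In a constraint-programming setting, the surviving n-grams would typically serve as the allowed tuples for consecutive word variables; how the threshold is chosen is a design decision this sketch does not address.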



