Romulus-cpt-fr

Name: Romulus-cpt-fr
Creator: maas
Published: 2025-12-05 16:54:35
License: No description available

Source: ModelScope community. Updated 2025-12-05. Indexed 2025-11-03.

Download link:

https://modelscope.cn/datasets/louisbrulenaudet/Romulus-cpt-fr

Description:

<img src="assets/thumbnail.webp">

# Romulus, continually pre-trained models for French law

Romulus is a series of continually pre-trained models enriched in French law and intended to serve as the basis for fine-tuning on labeled data. Please note that these models have not been aligned to produce usable text as they stand, and will certainly need to be fine-tuned for the desired tasks in order to produce satisfactory results.

The training corpus is made up of around 34,864,949 tokens (calculated with the meta-llama/Meta-Llama-3.1-8B tokenizer).

## Citing & Authors

If you use this code in your research, please use the following BibTeX entry.

```BibTeX
@misc{louisbrulenaudet2024,
  author = {Louis Brulé Naudet},
  title = {Romulus, continually pre-trained models for French law},
  year = {2024},
  howpublished = {\url{https://huggingface.co/datasets/louisbrulenaudet/Romulus-cpt-fr}},
}
```

## Feedback

If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).
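The corpus size quoted in the description is a tokenizer-dependent count. As a rough illustration of how such a figure could be computed, the sketch below sums token counts over a collection of texts. The helper names `count_tokens` and `load_llama_encode` are hypothetical and not part of the original project. Whether the author counted special tokens is not stated, so excluding them here is an assumption. Loading the tokenizer also assumes the `transformers` package is installed and that you have access to the gated `meta-llama/Meta-Llama-3.1-8B` repository.

```python
from typing import Callable, Iterable, Sequence


def count_tokens(texts: Iterable[str], encode: Callable[[str], Sequence]) -> int:
    """Return the total number of tokens `encode` produces over all texts."""
    return sum(len(encode(text)) for text in texts)


def load_llama_encode() -> Callable[[str], Sequence]:
    """Build an encode function from the tokenizer named in the description.

    Assumption: `transformers` is installed and the gated model repository
    is accessible; the import is deferred so `count_tokens` works without it.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
    # Assumption: special tokens (e.g. BOS) are excluded from the count.
    return lambda text: tokenizer.encode(text, add_special_tokens=False)
```

To try it on your own documents, call `count_tokens(my_texts, load_llama_encode())`, where `my_texts` is any iterable of strings. Any other callable that maps a string to a sequence of tokens can be passed in place of the Llama tokenizer, which makes the counting logic easy to check offline.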

Provider:

maas

Created:

2025-10-13
