Romulus-cpt-fr
ModelScope community · Updated 2025-12-05 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/louisbrulenaudet/Romulus-cpt-fr
Resource description:
<img src="assets/thumbnail.webp">
# Romulus, continually pre-trained models for French law.
Romulus is a series of continually pre-trained models enriched in French law and intended to serve as the basis for a fine-tuning process on labeled data. Please note that these models have not been aligned for the production of usable text as they stand, and will certainly need to be fine-tuned for the desired tasks in order to produce satisfactory results.
The training corpus is made up of around 34,864,949 tokens (calculated with the meta-llama/Meta-Llama-3.1-8B tokenizer).
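A token count like the one above can be reproduced along the following lines. This is a minimal sketch, not the author's script: the helper names `count_tokens` and `llama_encoder` are hypothetical, and whether the published figure includes special tokens is not stated, so excluding them here is an assumption. The `meta-llama/Meta-Llama-3.1-8B` repository is gated on Hugging Face, so the encoder is built lazily and a whitespace stand-in is used for the offline demo.

```python
from typing import Callable, Iterable, Sequence


def count_tokens(texts: Iterable[str], encode: Callable[[str], Sequence]) -> int:
    """Sum the number of tokens produced by `encode` over all texts."""
    return sum(len(encode(text)) for text in texts)


def llama_encoder() -> Callable[[str], Sequence]:
    """Build an encoder from the tokenizer named in this card.

    Assumes `transformers` is installed and that access to the gated
    meta-llama repository has been granted to your account.
    """
    from transformers import AutoTokenizer  # imported lazily on purpose

    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")
    # Assumption: special tokens (BOS/EOS) are left out of the count.
    return lambda text: tokenizer.encode(text, add_special_tokens=False)


if __name__ == "__main__":
    # Stand-in encoder (whitespace split) so the sketch runs offline.
    sample = ["Code civil, article 1240.", "Tout fait quelconque de l'homme"]
    print(count_tokens(sample, lambda text: text.split()))
```

To count with the real tokenizer, pass `llama_encoder()` as the `encode` argument and iterate over the text field of the corpus.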
## Citing & Authors
If you use this code in your research, please use the following BibTeX entry.
```BibTeX
@misc{louisbrulenaudet2024,
author = {Louis Brulé Naudet},
title = {Romulus, continually pre-trained models for French law},
year = {2024},
howpublished = {\url{https://huggingface.co/datasets/louisbrulenaudet/Romulus-cpt-fr}},
}
```
## Feedback
If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).
Provider:
maas
Created:
2025-10-13



