Semantic Text Analyser BERT-like language model for formal language understanding
Source: DataCite Commons | Updated: 2026-03-10 | Indexed: 2026-05-04
Download link:
https://data.jrc.ec.europa.eu/dataset/addd10f9-8325-4e49-8588-6cb681c162a5
Description:
SeTABERTa is a new multilingual language model pretrained from scratch on several Open Access text repositories: EU legislation, research articles, EU public documents and US patents. Two thirds of the training data is in English; the remainder covers the EU24 languages. The model was trained on the JRC Big Data Platform and can be fine-tuned for other tasks.
The model is available on Hugging Face at https://huggingface.co/vidaud/SeTABERTa-mlm-v1 and can be loaded with the Hugging Face transformers library.
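As a minimal sketch of loading the model with transformers, the snippet below assumes the checkpoint exposes a masked-language-model head (suggested by the "-mlm-" part of the repository name, but not confirmed by this listing) and that the generic AutoTokenizer and AutoModelForMaskedLM classes can resolve it. The helper names load_setaberta, build_masked_sentence and demo, and the example sentence, are illustrative and not part of the published model.

```python
MODEL_ID = "vidaud/SeTABERTa-mlm-v1"  # repository name taken from the Hugging Face URL above


def load_setaberta(model_id: str = MODEL_ID):
    """Fetch the tokenizer and model weights (downloaded on first use, then cached)."""
    # Imported lazily so this file can be read or imported without transformers installed.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: the checkpoint ships a masked-LM head.
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model


def build_masked_sentence(template: str, mask_token: str) -> str:
    """Insert the tokenizer's own mask token where the template says {mask}."""
    return template.format(mask=mask_token)


def demo() -> None:
    """Print the three most likely fillers for one masked word (needs network access)."""
    from transformers import pipeline

    tokenizer, model = load_setaberta()
    fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
    # Hypothetical example sentence in the formal register the model targets.
    text = build_masked_sentence(
        "The European {mask} adopted the regulation.", tokenizer.mask_token
    )
    for candidate in fill(text, top_k=3):
        print(candidate["token_str"], candidate["score"])
```

Calling demo() downloads the weights and prints candidate tokens with their scores. Reading the mask token from the tokenizer avoids hard-coding a literal such as <mask> or [MASK], which this listing does not specify. For fine-tuning on a downstream task, a common route would be to load the same repository name through a task-specific class such as AutoModelForSequenceClassification, though whether that suits a given task is for the user to check.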
Provider:
European Commission, Joint Research Centre
Created:
2026-03-10



