X-SRL Dataset and mBERT Word Aligner

Name: X-SRL Dataset and mBERT Word Aligner
Creator: heiDATA
Published: 2025-01-28 12:50:14
License: No description available

Source: DataCite Commons (updated 2025-01-28, indexed 2025-04-17)

Download link:

https://heidata.uni-heidelberg.de/citation?persistentId=doi:10.11588/DATA/HVXXIJ

Description:

This code provides a method for automatically aligning words in parallel sentences using pre-trained multilingual BERT (mBERT) embeddings. The alignments can be used to project source-side annotations (for example, labeled English sentences) onto the target side (for example, a German translation of the sentence) by transferring each label to the best-aligned target word. The resulting labeled data can then be used to train multilingual state-of-the-art models and improve their performance, especially for lower-resource languages.
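
The label-transfer idea described above can be illustrated with a minimal sketch. It is not the dataset authors' implementation: the function names (`cosine`, `align`, `project_labels`), the greedy best-match rule, the hand-made toy vectors, and the role labels are all assumptions made for illustration. In practice the vectors would be contextual word embeddings taken from mBERT.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def align(src_vecs, tgt_vecs):
    """For each source word, pick the index of the most similar target word.

    Assumption: a greedy per-word argmax; the real aligner may use a
    different selection rule.
    """
    return [
        max(range(len(tgt_vecs)), key=lambda j: cosine(s, tgt_vecs[j]))
        for s in src_vecs
    ]

def project_labels(src_labels, alignment, tgt_len, default="O"):
    """Copy every non-default source label to its best-aligned target word."""
    tgt_labels = [default] * tgt_len
    for i, label in enumerate(src_labels):
        if label != default:
            tgt_labels[alignment[i]] = label
    return tgt_labels

# Toy example: hand-made vectors stand in for mBERT embeddings (hypothetical values).
src_words = ["I", "have", "seen", "him"]
tgt_words = ["Ich", "habe", "ihn", "gesehen"]
src_vecs = [[1, 0, 0, 0.1], [0, 1, 0, 0.1], [0, 0, 1, 0.1], [0.1, 0, 0, 1]]
tgt_vecs = [[0.9, 0.1, 0, 0], [0.1, 0.9, 0, 0], [0, 0, 0.1, 0.9], [0, 0.1, 0.9, 0]]
src_labels = ["A0", "O", "V", "A1"]  # illustrative role labels on the English side

alignment = align(src_vecs, tgt_vecs)
tgt_labels = project_labels(src_labels, alignment, len(tgt_words))
print(list(zip(tgt_words, tgt_labels)))
```

Because the German word order differs from the English one, the labels for "seen" and "him" land on "gesehen" and "ihn" respectively, which is the kind of reordering that alignment-based projection is meant to handle.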

Provider:

heiDATA

Created:

2021-01-22
