ConseggioLigure/lijnews-instruct-lij-ita
Dataset overview
Dataset name
LigurianNews lij-ita translation dataset (instruction-style)
License
cc-by-4.0
Task categories
- Conversational
- Translation
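Dataset information
Features
- inputs: input text, string
- targets: target text, string
- template_id: template ID, int64
- template_lang: template language, sequence of strings
Data splits
- train: training set, 288462 bytes, 153 examples
- dev: validation set, 47500 bytes, 27 examples
- test: test set, 67307 bytes, 36 examples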
Data size
- Download size: 292727 bytes
- Dataset size: 403269 bytes
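As a quick sanity check on the figures above, the per-split byte counts should add up to the reported dataset size. The sketch below copies the numbers from this card; the dictionary layout and key names (`num_bytes`, `num_examples`) are illustrative choices, not taken from the dataset itself.

```python
# Figures copied from the split and size listings in this card.
# The key names are illustrative, not read from the dataset metadata.
SPLITS = {
    "train": {"num_bytes": 288462, "num_examples": 153},
    "dev": {"num_bytes": 47500, "num_examples": 27},
    "test": {"num_bytes": 67307, "num_examples": 36},
}
DATASET_SIZE = 403269  # reported dataset size in bytes

total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())

print(total_bytes == DATASET_SIZE)  # True
print(total_examples)               # 216
```

The download size (292727 bytes) is smaller than the dataset size, which is consistent with the files being stored in a compressed format.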
Configuration
- config_name: default
- data_files:
- train: data/train-*
- dev: data/dev-*
- test: data/test-*
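The configuration above maps each split to a set of data files. A minimal sketch of loading one split follows, assuming the dataset is hosted on the Hugging Face Hub under the identifier at the top of this card and that the `datasets` library is installed. The helper name `load_split` is hypothetical, and the call needs network access.

```python
REPO_ID = "ConseggioLigure/lijnews-instruct-lij-ita"
SPLITS = ("train", "dev", "test")  # split names from the configuration above


def load_split(split: str):
    """Load one split from the Hub (assumes network access and `datasets`)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (not run here because it downloads data):
#   train = load_split("train")
#   print(train[0]["inputs"])
```

Note that the validation split is named `dev`, so requesting `validation` would fail.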
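Dataset description
This is a document-level translation dataset from Ligurian to Italian. The source data comes from the LigurianNews corpus and has been converted to an instruction format.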
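Prompt templates
The prompts are written in Ligurian and ask the model to "translate the following text into Italian". There are several prompt variants, and one is sampled at random for each sentence: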
- Traduxi in italian: <sentence>
- Traduxi da-o zeneise à l’italian: <sentence>
- Traduxi da-o ligure à l’italian: <sentence>
- Traduxi sto testo in italian: <sentence>
- Traduxi in lengua italiaña: <sentence>
- Traduxi sto testo da-o zeneise à l’italian: <sentence>
- Traduxi sto testo da-o ligure à l’italian: <sentence>
- Comm’à l’é a traduçion italiaña de sto testo? <sentence>
- Quæ a l’é a traduçion italiaña de sto testo? <sentence>
- Ti peu tradue sto testo in italian? <sentence>
The prompt template used for each dataset entry is recorded in the template_id column, with IDs ranging from 1 to 10.
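The sampling step described above can be sketched as follows. This is an illustration, not the authors' conversion script: the names `PROMPT_TEMPLATES` and `build_input` are hypothetical, uniform sampling is an assumption (the card only says the variant is chosen at random), and the template is joined to the sentence by plain concatenation because each listed template already ends with a space.

```python
import random

# Prompt templates as listed in this card, keyed by template_id (1 to 10).
PROMPT_TEMPLATES = {
    1: "Traduxi in italian: ",
    2: "Traduxi da-o zeneise à l’italian: ",
    3: "Traduxi da-o ligure à l’italian: ",
    4: "Traduxi sto testo in italian: ",
    5: "Traduxi in lengua italiaña: ",
    6: "Traduxi sto testo da-o zeneise à l’italian: ",
    7: "Traduxi sto testo da-o ligure à l’italian: ",
    8: "Comm’à l’é a traduçion italiaña de sto testo? ",
    9: "Quæ a l’é a traduçion italiaña de sto testo? ",
    10: "Ti peu tradue sto testo in italian? ",
}


def build_input(sentence: str, rng: random.Random) -> tuple[int, str]:
    """Pick a template uniformly at random (an assumption) and prepend it."""
    template_id = rng.choice(sorted(PROMPT_TEMPLATES))
    return template_id, PROMPT_TEMPLATES[template_id] + sentence


# Seeded generator so repeated runs pick the same template.
rng = random.Random(0)
template_id, inputs = build_input("<sentence>", rng)
print(template_id, inputs)
```

Returning the `template_id` alongside the text mirrors how the dataset stores the chosen template in its own column.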
Target template
The target text always takes the form "A traduçion in italian do testo a l’é: <sentence>", where the prefix means "The Italian translation of the text is:".
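Because the prefix is fixed, the bare Italian translation can be recovered by removing it from the target. A minimal sketch, with a hypothetical helper name `strip_target_prefix` and a made-up example sentence rather than real dataset content:

```python
# Fixed prefix from this card; note the typographic apostrophe (U+2019).
TARGET_PREFIX = "A traduçion in italian do testo a l’é: "


def strip_target_prefix(target: str) -> str:
    """Return the translation without the fixed Ligurian prefix."""
    if not target.startswith(TARGET_PREFIX):
        raise ValueError("target does not start with the expected prefix")
    return target[len(TARGET_PREFIX):]


# Illustrative target string, not taken from the dataset.
example = TARGET_PREFIX + "Buongiorno a tutti."
print(strip_target_prefix(example))  # Buongiorno a tutti.
```

Raising on a missing prefix, rather than returning the string unchanged, makes it easier to spot rows that do not follow the template.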
Template correspondence
[
  (1, "Traduxi in italian: ", "A traduçion in italian do testo a l’é: "),
  (2, "Traduxi da-o zeneise à l’italian: ", "A traduçion in italian do testo a l’é: "),
  (3, "Traduxi da-o ligure à l’italian: ", "A traduçion in italian do testo a l’é: "),
  (4, "Traduxi sto testo in italian: ", "A traduçion in italian do testo a l’é: "),
  (5, "Traduxi in lengua italiaña: ", "A traduçion in italian do testo a l’é: "),
  (6, "Traduxi sto testo da-o zeneise à l’italian: ", "A traduçion in italian do testo a l’é: "),
  (7, "Traduxi sto testo da-o ligure à l’italian: ", "A traduçion in italian do testo a l’é: "),
  (8, "Comm’à l’é a traduçion italiaña de sto testo? ", "A traduçion in italian do testo a l’é: "),
  (9, "Quæ a l’é a traduçion italiaña de sto testo? ", "A traduçion in italian do testo a l’é: "),
  (10, "Ti peu tradue sto testo in italian? ", "A traduçion in italian do testo a l’é: "),
]
Number of samples
- Training examples: 153
- Validation examples: 27
- Test examples: 36



