five

sapienzanlp/it-everyday-conversations-llama3.1-2k-TowerInstruct-Mistral-7B-v0.2

收藏
Hugging Face2024-12-05 更新2025-04-12 收录
下载链接:
https://hf-mirror.com/datasets/sapienzanlp/it-everyday-conversations-llama3.1-2k-TowerInstruct-Mistral-7B-v0.2
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: text dtype: string - name: id dtype: int64 - name: prediction dtype: string - name: messages list: - name: content dtype: string - name: role dtype: string splits: - name: train_sft num_bytes: 5501687 num_examples: 2260 - name: test_sft num_bytes: 284899 num_examples: 119 download_size: 2982171 dataset_size: 5786586 configs: - config_name: default data_files: - split: train_sft path: data/train_sft-* - split: test_sft path: data/test_sft-* --- # sapienzanlp/it-everyday-conversations-llama3.1-2k-TowerInstruct-Mistral-7B-v0.2 ## Overview This dataset is the **Italian translation** of the [everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k) dataset, designed specifically for **conversations** in Italian. The translation was carried out using **TowerInstruct-Mistral-7B-v0.2**. - **Languages**: Italian (translated from English) - **Purpose**: Instruction tuning in Italian for conversational AI - **Train Size**: 2260 - **Test Size**: 119 New fields added include: ```python { "messages": list # the translated conversation in Italian, presented in chat template format. "prediction": str # the Italian translation of the original conversation. } ``` ## Usage This dataset is designed to enhance instruction-following capabilities in Italian language models. It is particularly useful for fine-tuning models to follow structured prompts and respond appropriately in Italian. ## Author Alessandro Scirè (scire@babelscape.com)
提供机构:
sapienzanlp
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作