
lukasplevac/alpaca-cs

Source: Hugging Face. Updated 2025-12-14; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/lukasplevac/alpaca-cs
Description:

This dataset is a Czech version of the popular Alpaca dataset, based on [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned). It contains **instructions, inputs, and outputs** translated into Czech and is suitable for training and evaluating LLMs. Each record has three fields: `instruction` (text describing the task the model should perform), `input` (optional input text for the instruction), and `output` (the expected model output). The dataset contains approximately 50k rows, all in Czech. Translations were produced automatically with an LLM (gemma3:12b), with manual quality checks on a sample of the data. The dataset is intended for experiments with Czech LLMs and for instruction tuning.
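
The three-field record layout described above can be turned into a training prompt in a few lines. The following is a minimal sketch, not the dataset author's method: the `AlpacaRecord` type, the `build_prompt` helper, the `###` section headers, and the sample record are all hypothetical and chosen only for illustration. The only thing taken from the description is that each record has `instruction`, `input` (possibly empty), and `output`.

```python
from typing import TypedDict


class AlpacaRecord(TypedDict):
    """Hypothetical type mirroring the three fields named in the description."""
    instruction: str
    input: str
    output: str


def build_prompt(record: AlpacaRecord) -> str:
    """Join the instruction and the optional input into one prompt string.

    The section headers are an assumed layout, not something the dataset prescribes.
    """
    instruction = record["instruction"].strip()
    # `input` is optional, so treat a missing, None, or blank value as absent.
    extra = (record.get("input") or "").strip()
    if extra:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{extra}\n\n"
            "### Response:\n"
        )
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


# Made-up record for illustration; it is not taken from the dataset.
example: AlpacaRecord = {
    "instruction": "Přelož následující větu do angličtiny.",
    "input": "Dobrý den.",
    "output": "Good day.",
}

prompt = build_prompt(example)   # text the model is conditioned on
target = example["output"]       # text the model is trained to produce
print(prompt + target)
```

Records with an empty `input` simply omit the input section, so the same helper covers both cases. To apply it to the real data, the records would first need to be loaded, for example with the Hugging Face `datasets` library, assuming the repository exposes the fields under these exact names.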
Provider:
lukasplevac