nayohan/finance-alpaca-ko

Name: nayohan/finance-alpaca-ko
Creator: nayohan
Published: 2024-07-17 17:51:48
License: not specified

Hugging Face, updated 2024-07-17, indexed 2024-07-22

Download link:

https://hf-mirror.com/datasets/nayohan/finance-alpaca-ko

Resource summary:

This dataset is a raw translated dataset and contains repetitive sentences generated by the model, so it needs to be filtered. The dataset contains four features: instruction, text, input, and output, all of which are of string type. The dataset has only one training split, containing 68,912 samples with a total size of 48,029,971 bytes. The language of the dataset is Korean (ko), and the tag is finance.

Provider:

nayohan

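Summary of original information

Dataset overview

Dataset information

Features:
- instruction: string
- text: string
- input: string
- output: string
Splits:
- train: 68,912 samples, 48,029,971 bytes in total
Download size: 24,665,624 bytes
Dataset size: 48,029,971 bytes
Configurations:
- default: contains the training data files, at path data/train-*
Language: Korean
Tag: finance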
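
The metadata above can be checked against the actual data with a short script. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable; the helper names `has_expected_schema` and `load_train_split` are hypothetical and not part of the dataset card.

```python
# Minimal sketch: load the single train split and verify the four string features.
# Helper names are illustrative assumptions, not part of the dataset card.

EXPECTED_FEATURES = ("instruction", "text", "input", "output")


def has_expected_schema(record: dict) -> bool:
    """Return True if the record has exactly the four listed features, all strings."""
    return set(record) == set(EXPECTED_FEATURES) and all(
        isinstance(record[name], str) for name in EXPECTED_FEATURES
    )


def load_train_split():
    """Load the train split (needs the `datasets` package and network access)."""
    # Imported lazily so has_expected_schema can be used without the dependency.
    from datasets import load_dataset

    return load_dataset("nayohan/finance-alpaca-ko", split="train")


if __name__ == "__main__":
    ds = load_train_split()
    # The card lists 68,912 samples; compare against what was actually downloaded.
    print("rows:", ds.num_rows)
    print("first record matches schema:", has_expected_schema(ds[0]))
```

Loading is kept behind a function so that the schema check can be reused on records from any source, for example after local filtering.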

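Dataset source

This dataset was obtained by translating gbharti/finance-alpaca.
The translation was produced with the nayohan/llama3-instrucTrans-enko-8b model.

Dataset characteristics

This is a raw translated dataset; it contains repeated sentences generated by the model and needs to be filtered.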
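
The card does not say how the filtering should be done. One possible heuristic, shown here as a sketch only, is to flag a record when any single sentence occurs more than a chosen number of times within a field. The sentence-splitting rule, the threshold `max_repeats`, and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed sentence boundary: whitespace after ., !, ? or 。, or one or more newlines.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?。])\s+|\n+")

FIELDS = ("instruction", "text", "input", "output")


def max_sentence_repeats(text: str) -> int:
    """Return how many times the most frequent sentence occurs in the text."""
    sentences = [s.strip() for s in _SENTENCE_BOUNDARY.split(text) if s.strip()]
    if not sentences:
        return 0
    return max(Counter(sentences).values())


def is_repetitive(record: dict, max_repeats: int = 2) -> bool:
    """Flag a record if any field repeats one sentence more than max_repeats times."""
    return any(
        max_sentence_repeats(record.get(name, "")) > max_repeats for name in FIELDS
    )


if __name__ == "__main__":
    looped = {"instruction": "", "text": "", "input": "",
              "output": "주식은 위험합니다. 주식은 위험합니다. 주식은 위험합니다."}
    clean = {"instruction": "", "text": "", "input": "",
             "output": "금리는 중요합니다. 채권도 있습니다."}
    print(is_repetitive(looped), is_repetitive(clean))
```

With the `datasets` library, a predicate like this could be passed to `Dataset.filter`, for example `ds.filter(lambda r: not is_repetitive(r))`. Exact-match counting misses near-duplicate sentences, so the threshold and the splitting rule would need tuning on real samples.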
