kimnt93/viet-sharegpt4

Name: kimnt93/viet-sharegpt4
Creator: kimnt93
Published: 2024-04-20 08:39:10
License: 暂无描述

Hugging Face2024-04-20 更新2024-06-12 收录

下载链接：

https://hf-mirror.com/datasets/kimnt93/viet-sharegpt4

下载链接

链接失效反馈

官方服务：

资源简介：

--- dataset_info: features: - name: instruction dtype: string - name: input dtype: string - name: output dtype: string - name: history sequence: sequence: string splits: - name: train num_bytes: 13157690 num_examples: 668 download_size: 5496667 dataset_size: 13157690 configs: - config_name: default data_files: - split: train path: data/train-* --- **Dataset Name:** viet-sharegpt4 **Description:** The viet-sharegpt4 dataset is collected from the OpenChat project's openchat_sharegpt4_dataset. It contains data specifically curated for Vietnamese language tasks. **Source:** [viet-sharegpt4 on Hugging Face Datasets](https://huggingface.co/datasets/kimnt93/viet-sharegpt4) **Original Source:** [openchat_sharegpt4_dataset on Hugging Face Datasets](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset) **License:** Please refer to the license information provided by the original source. --- **Python Script to Download the Dataset:** ```python from datasets import load_dataset # Load the viet-sharegpt4 dataset dataset = load_dataset("kimnt93/viet-sharegpt4") # Print some basic information about the dataset print("Dataset Name:", dataset.name) print("Number of Samples:", len(dataset)) ``` This Python script uses the `datasets` library from Hugging Face to download and access the viet-sharegpt4 dataset. You can run this script in your Python environment to download the dataset and print some basic information about it. Make sure you have the `datasets` library installed (`pip install datasets`) before running the script. Let me know if you need further assistance!

提供机构：

kimnt93

原始信息汇总

数据集概述

数据集名称

viet-sharegpt4

数据集描述

该数据集是从OpenChat项目的openchat_sharegpt4_dataset收集而来，专门为越南语言任务定制。

数据集特征

instruction：数据类型为字符串。
input：数据类型为字符串。
output：数据类型为字符串。
history：数据类型为字符串序列。

数据集拆分

train：包含668个样本，总大小为13157690字节。

数据集大小

下载大小：5496667字节
数据集总大小：13157690字节

配置

config_name：default
data_files：
- split：train
- path：data/train-*

数据集来源

原始数据集：openchat_sharegpt4_dataset on Hugging Face Datasets
当前数据集：viet-sharegpt4 on Hugging Face Datasets

许可证

请参考原始数据集提供的许可证信息。

5,000+

优质数据集

54 个

任务类型

进入经典数据集