kimnt93/vietdish
Source: Hugging Face. Updated 2024-04-20. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/kimnt93/vietdish
Resource description:
---
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 16035063
    num_examples: 6652
  download_size: 6802487
  dataset_size: 16035063
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
**Dataset Name:** vietdish
**Description:** The vietdish dataset is generated by collecting instructions and input from Cooky (https://www.cooky.vn/) and generating output using GPT-3.5. It contains Vietnamese cooking instructions and corresponding generated responses.
**Source:** [vietdish on Hugging Face Datasets](https://huggingface.co/datasets/kimnt93/vietdish)
**Instruction and Input Source:** [Cooky](https://www.cooky.vn/)
**Method:** Instructions and input were collected from Cooky, and GPT-3.5 was used to generate responses based on the collected instructions and input.
**License:** Please refer to the license information provided by the original source.
---
**Python Script to Download the Dataset:**
```python
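from datasets import load_dataset

# Load the train split (the only split listed in the metadata).
# Without split="train", load_dataset returns a DatasetDict, which
# cannot be indexed by integer and whose len() is the number of splits.
dataset = load_dataset("kimnt93/vietdish", split="train")

# Print some basic information about the dataset
print("Columns:", dataset.column_names)
print("Number of Samples:", len(dataset))

# Example usage: accessing a sample from the dataset
sample = dataset[0]
print("Example Sample:", sample)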
```
This Python script uses the `datasets` library from Hugging Face to download and access the vietdish dataset. You can run this script in your Python environment to download the dataset and print some basic information about it.
Ensure you have the `datasets` library installed (`pip install datasets`) before running the script.
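
Because each record carries `instruction`, `input`, and `output` fields, a common next step is to join the first two into a single prompt string. The sketch below is one possible way to do that. The `build_prompt` helper, the section markers, and the sample record are illustrative assumptions; the dataset card does not prescribe any prompt template.

```python
def build_prompt(record: dict) -> str:
    """Join instruction and optional input into one prompt string.

    The "### ..." markers are an assumed template, not part of the dataset.
    """
    instruction = (record.get("instruction") or "").strip()
    extra = (record.get("input") or "").strip()
    if extra:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{extra}\n\n"
            "### Response:\n"
        )
    # Records with an empty input skip the Input section entirely
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


# Hypothetical record for illustration; not copied from the dataset
example = {
    "instruction": "Hướng dẫn cách nấu phở bò.",
    "input": "",
    "output": "...",
}
print(build_prompt(example))
```

With the dataset loaded as a single split, the same helper could be applied to a real record, for example `build_prompt(dataset[0])`.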
Provider:
kimnt93
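Summary of original information
Dataset overview
Dataset name
vietdish
Dataset description
The dataset was built by collecting cooking instructions and inputs from the Cooky website and generating the corresponding outputs with GPT-3.5. It contains Vietnamese cooking instructions and their generated responses.
Dataset features
- instruction: string
- input: string
- output: string
Dataset splits
- train:
  - Number of examples: 6652
  - Data size: 16035063 bytes
Dataset size
- Download size: 6802487 bytes
- Dataset size: 16035063 bytes
Configuration
- config_name: default
- data_files:
  - split: train
  - path: data/train-*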
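
The figures above are enough to estimate the average record size and how the download compares with the prepared dataset. The snippet below is a small sketch of that arithmetic, using only the numbers reported in the metadata. The variable names are illustrative.

```python
# Figures copied from the dataset metadata
num_examples = 6652
dataset_size = 16035063   # bytes, "dataset_size" in the metadata
download_size = 6802487   # bytes, "download_size" in the metadata

# Average footprint of one record in the prepared dataset
avg_bytes_per_example = dataset_size / num_examples

# Fraction of the prepared size that has to be downloaded
download_ratio = download_size / dataset_size

print(f"Average size per example: {avg_bytes_per_example:.0f} bytes")
print(f"Download size / dataset size: {download_ratio:.2f}")
```

This works out to roughly 2.4 KB per example, with the download at about 42% of the prepared dataset size.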



