phuong123/icd_icf_en_vi

Name: phuong123/icd_icf_en_vi
Creator: phuong123
Published: 2024-06-16 07:26:16
License: not specified

Source: Hugging Face. Updated 2024-06-16; indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/phuong123/icd_icf_en_vi

Summary:

The dataset contains data in two languages, English and Vietnamese, and is divided into three parts: icd_icf, dictionary, and adapt. The icd_icf part has 21555 examples (2317367 bytes), the dictionary part has 16893 examples (1088267 bytes), and the adapt part has 358796 examples (118504077 bytes). The total download size is 74885719 bytes, and the total dataset size is 121909711 bytes.

Provider:

phuong123

Original metadata summary

Dataset overview

Dataset features

en: string
vi: string
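
A record with these two string features could be modeled as in the sketch below. The names `TranslationPair` and `is_valid_record` are hypothetical, and the sample pair is illustrative only; it is not taken from the dataset.

```python
from typing import TypedDict


class TranslationPair(TypedDict):
    """One record: an English string and its Vietnamese counterpart."""
    en: str
    vi: str


def is_valid_record(record: dict) -> bool:
    """Check that a record has exactly the 'en' and 'vi' keys, both strings."""
    return set(record) == {"en", "vi"} and all(
        isinstance(record[key], str) for key in ("en", "vi")
    )


# Illustrative pair only (assumption, not copied from the dataset).
sample: TranslationPair = {"en": "fever", "vi": "sốt"}
```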

Dataset splits

train:
- Number of examples: 15088
- Bytes: 1624821
test:
- Number of examples: 3233
- Bytes: 342331
validation:
- Number of examples: 3234
- Bytes: 350215
others:
- Number of examples: 13806
- Bytes: 1215214
adapt_train:
- Number of examples: 287036
- Bytes: 94802997.3739172
adapt_val:
- Number of examples: 71760
- Bytes: 23701079.62608279
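
The split counts above work out to roughly 70/15/15 for train/test/validation and roughly 80/20 for adapt_train/adapt_val. The sketch below computes those fractions. It also shows that the fractional adapt_train byte count is close to the adapt total from the summary (118504077 bytes) apportioned by example count. That proportional reading is an assumption, since the source does not say how the byte figures were produced.

```python
# Example counts copied from the split listing.
icd_icf = {"train": 15088, "test": 3233, "validation": 3234}
adapt = {"adapt_train": 287036, "adapt_val": 71760}


def fractions(counts: dict) -> dict:
    """Return each split's share of the group's total example count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


icd_fracs = fractions(icd_icf)
adapt_fracs = fractions(adapt)

for name, frac in {**icd_fracs, **adapt_fracs}.items():
    print(f"{name}: {frac:.4f}")

# Assumption: bytes apportioned by example count from the adapt total.
ADAPT_TOTAL_BYTES = 118504077
estimated_adapt_train_bytes = ADAPT_TOTAL_BYTES * adapt_fracs["adapt_train"]
print(f"estimated adapt_train bytes: {estimated_adapt_train_bytes:.2f}")
```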

Dataset size

Download size: 75079145 bytes
Total dataset size: 122036658.0 bytes
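
As a quick arithmetic check, the per-split byte counts sum to the total listed here, and the summary paragraph's total matches its own per-part figures. The two totals differ because the summary's dictionary part (16893 examples, 1088267 bytes) does not match the others split (13806 examples, 1215214 bytes). The source does not explain this difference. A minimal sketch of the check:

```python
# Byte counts copied from the per-split listing.
split_bytes = {
    "train": 1624821,
    "test": 342331,
    "validation": 350215,
    "others": 1215214,
    "adapt_train": 94802997.3739172,
    "adapt_val": 23701079.62608279,
}
metadata_total = sum(split_bytes.values())

# Byte counts copied from the summary paragraph.
summary_part_bytes = {"icd_icf": 2317367, "dictionary": 1088267, "adapt": 118504077}
summary_total = sum(summary_part_bytes.values())

# Listed totals to compare against.
LISTED_METADATA_TOTAL = 122036658.0
LISTED_SUMMARY_TOTAL = 121909711

print(abs(metadata_total - LISTED_METADATA_TOTAL) < 1e-3)
print(summary_total == LISTED_SUMMARY_TOTAL)
print(LISTED_METADATA_TOTAL - LISTED_SUMMARY_TOTAL)
```

The gap between the two totals equals the difference between the others split and the summary's dictionary part (1215214 - 1088267 bytes).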

Configuration

default:
- train: file path pattern data/train-*
- test: file path pattern data/test-*
- validation: file path pattern data/validation-*
- others: file path pattern data/others-*
- adapt_train: file path pattern data/adapt_train-*
- adapt_val: file path pattern data/adapt_val-*
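
The split-to-pattern mapping above can be sketched in Python as follows. The helper names `split_for_path` and `load_split` are hypothetical, and the shard file name in the usage note is an assumed example, not a path confirmed by the source. `load_split` assumes the Hugging Face `datasets` library is installed and needs network access, so it is defined here but not called.

```python
from fnmatch import fnmatch

REPO_ID = "phuong123/icd_icf_en_vi"
SPLITS = ["train", "test", "validation", "others", "adapt_train", "adapt_val"]

# Mirrors the default configuration: each split maps to data/<split>-*
DATA_FILES = {split: f"data/{split}-*" for split in SPLITS}


def split_for_path(path: str):
    """Return the split whose file pattern matches the path, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


def load_split(split: str):
    """Load one split from the Hub (requires `datasets` and network access)."""
    if split not in DATA_FILES:
        raise ValueError(f"unknown split: {split}")
    from datasets import load_dataset  # imported lazily so the sketch runs offline
    return load_dataset(REPO_ID, split=split)
```

For example, `split_for_path("data/adapt_train-00000-of-00001.parquet")` would resolve to the adapt_train split under these patterns, and `load_split("validation")` would fetch the validation split.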
