hqfx/llama3_generate
收藏Hugging Face2024-05-07 更新2024-06-12 收录
下载链接:
https://hf-mirror.com/datasets/hqfx/llama3_generate
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: functions
dtype: string
- name: conversation
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: hqfx_tulu_v2.code_alpaca
num_bytes: 1119070
num_examples: 2001
- name: hqfx_tulu_v2.oasst1
num_bytes: 581675
num_examples: 733
- name: hqfx_tulu_v2.wizardlm
num_bytes: 5080987
num_examples: 2981
- name: bz_arc13_alpaca_gpt4_chinese.train
num_bytes: 3208554
num_examples: 4996
- name: hqfx_tulu_v2.sharegpt
num_bytes: 10657388
num_examples: 7430
- name: hqfx_tulu_v2.science.qasper_truncated_4000
num_bytes: 3527939
num_examples: 221
- name: bz_arc13_wild_chat_en_zh_dedup_v2.Chinese
num_bytes: 13439068
num_examples: 8712
- name: hqfx_tulu_v2.science.scitldr_aic
num_bytes: 1347771
num_examples: 195
- name: hqfx_tulu_v2.lima
num_bytes: 139437
num_examples: 101
- name: hqfx_tulu_v2.cot
num_bytes: 5745023
num_examples: 4974
- name: hqfx_tulu_v2.science.scierc_relation
num_bytes: 73168
num_examples: 34
- name: hqfx_tulu_v2.science.scierc_ner
num_bytes: 60962
num_examples: 34
- name: hqfx_tulu_v2.science.evidence_inference
num_bytes: 740828
num_examples: 167
- name: hqfx_tulu_v2.flan_v2
num_bytes: 11076334
num_examples: 4912
- name: hqfx_tulu_v2.science.scifact_json
num_bytes: 240174
num_examples: 91
download_size: 30974924
dataset_size: 57038378
configs:
- config_name: default
data_files:
- split: hqfx_tulu_v2.code_alpaca
path: data/hqfx_tulu_v2.code_alpaca-*
- split: hqfx_tulu_v2.oasst1
path: data/hqfx_tulu_v2.oasst1-*
- split: hqfx_tulu_v2.wizardlm
path: data/hqfx_tulu_v2.wizardlm-*
- split: bz_arc13_alpaca_gpt4_chinese.train
path: data/bz_arc13_alpaca_gpt4_chinese.train-*
- split: hqfx_tulu_v2.sharegpt
path: data/hqfx_tulu_v2.sharegpt-*
- split: hqfx_tulu_v2.science.qasper_truncated_4000
path: data/hqfx_tulu_v2.science.qasper_truncated_4000-*
- split: bz_arc13_wild_chat_en_zh_dedup_v2.Chinese
path: data/bz_arc13_wild_chat_en_zh_dedup_v2.Chinese-*
- split: hqfx_tulu_v2.science.scitldr_aic
path: data/hqfx_tulu_v2.science.scitldr_aic-*
- split: hqfx_tulu_v2.lima
path: data/hqfx_tulu_v2.lima-*
- split: hqfx_tulu_v2.cot
path: data/hqfx_tulu_v2.cot-*
- split: hqfx_tulu_v2.science.scierc_relation
path: data/hqfx_tulu_v2.science.scierc_relation-*
- split: hqfx_tulu_v2.science.scierc_ner
path: data/hqfx_tulu_v2.science.scierc_ner-*
- split: hqfx_tulu_v2.science.evidence_inference
path: data/hqfx_tulu_v2.science.evidence_inference-*
- split: hqfx_tulu_v2.flan_v2
path: data/hqfx_tulu_v2.flan_v2-*
- split: hqfx_tulu_v2.science.scifact_json
path: data/hqfx_tulu_v2.science.scifact_json-*
---
提供机构:
hqfx
原始信息汇总
数据集概述
数据集特征
- functions: 数据类型为字符串。
- conversation: 包含以下子特征:
- content: 数据类型为字符串。
- role: 数据类型为字符串。
数据集划分
- hqfx_tulu_v2.code_alpaca: 字节数为1119070,样本数为2001。
- hqfx_tulu_v2.oasst1: 字节数为581675,样本数为733。
- hqfx_tulu_v2.wizardlm: 字节数为5080987,样本数为2981。
- bz_arc13_alpaca_gpt4_chinese.train: 字节数为3208554,样本数为4996。
- hqfx_tulu_v2.sharegpt: 字节数为10657388,样本数为7430。
- hqfx_tulu_v2.science.qasper_truncated_4000: 字节数为3527939,样本数为221。
- bz_arc13_wild_chat_en_zh_dedup_v2.Chinese: 字节数为13439068,样本数为8712。
- hqfx_tulu_v2.science.scitldr_aic: 字节数为1347771,样本数为195。
- hqfx_tulu_v2.lima: 字节数为139437,样本数为101。
- hqfx_tulu_v2.cot: 字节数为5745023,样本数为4974。
- hqfx_tulu_v2.science.scierc_relation: 字节数为73168,样本数为34。
- hqfx_tulu_v2.science.scierc_ner: 字节数为60962,样本数为34。
- hqfx_tulu_v2.science.evidence_inference: 字节数为740828,样本数为167。
- hqfx_tulu_v2.flan_v2: 字节数为11076334,样本数为4912。
- hqfx_tulu_v2.science.scifact_json: 字节数为240174,样本数为91。
数据集大小
- 下载大小: 30974924字节
- 数据集大小: 57038378字节
配置信息
- 配置名称: default
- 数据文件路径:
- hqfx_tulu_v2.code_alpaca: data/hqfx_tulu_v2.code_alpaca-*
- hqfx_tulu_v2.oasst1: data/hqfx_tulu_v2.oasst1-*
- hqfx_tulu_v2.wizardlm: data/hqfx_tulu_v2.wizardlm-*
- bz_arc13_alpaca_gpt4_chinese.train: data/bz_arc13_alpaca_gpt4_chinese.train-*
- hqfx_tulu_v2.sharegpt: data/hqfx_tulu_v2.sharegpt-*
- hqfx_tulu_v2.science.qasper_truncated_4000: data/hqfx_tulu_v2.science.qasper_truncated_4000-*
- bz_arc13_wild_chat_en_zh_dedup_v2.Chinese: data/bz_arc13_wild_chat_en_zh_dedup_v2.Chinese-*
- hqfx_tulu_v2.science.scitldr_aic: data/hqfx_tulu_v2.science.scitldr_aic-*
- hqfx_tulu_v2.lima: data/hqfx_tulu_v2.lima-*
- hqfx_tulu_v2.cot: data/hqfx_tulu_v2.cot-*
- hqfx_tulu_v2.science.scierc_relation: data/hqfx_tulu_v2.science.scierc_relation-*
- hqfx_tulu_v2.science.scierc_ner: data/hqfx_tulu_v2.science.scierc_ner-*
- hqfx_tulu_v2.science.evidence_inference: data/hqfx_tulu_v2.science.evidence_inference-*
- hqfx_tulu_v2.flan_v2: data/hqfx_tulu_v2.flan_v2-*
- hqfx_tulu_v2.science.scifact_json: data/hqfx_tulu_v2.science.scifact_json-*
- 数据文件路径:



