reubenjohn/stackoverflow-unified-text-open-status-classification-sample
Source: Hugging Face. Updated 2022-11-25. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/reubenjohn/stackoverflow-unified-text-open-status-classification-sample
Resource description:
---
dataset_info:
  features:
  - name: PostId
    dtype: int64
  - name: PostCreationDate
    dtype: string
  - name: OwnerUserId
    dtype: int64
  - name: OwnerCreationDate
    dtype: string
  - name: ReputationAtPostCreation
    dtype: int64
  - name: OwnerUndeletedAnswerCountAtPostTime
    dtype: int64
  - name: Title
    dtype: string
  - name: BodyMarkdown
    dtype: string
  - name: Tag1
    dtype: string
  - name: Tag2
    dtype: string
  - name: Tag3
    dtype: string
  - name: Tag4
    dtype: string
  - name: Tag5
    dtype: string
  - name: PostClosedDate
    dtype: string
  - name: OpenStatus
    dtype: string
  - name: unified_texts
    dtype: string
  - name: OpenStatus_id
    dtype: int64
  splits:
  - name: train
    num_bytes: 216256197
    num_examples: 112217
  - name: valid
    num_bytes: 43398940
    num_examples: 22443
  - name: test
    num_bytes: 43398940
    num_examples: 22443
  download_size: 163036345
  dataset_size: 303054077
---
# Dataset Card for "stackoverflow-open-status-classification"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
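
The card itself gives no usage instructions, so the following is only a minimal sketch of loading the data with the Hugging Face `datasets` library. It assumes the library is installed and that the repository is still reachable under the ID from the download link. The helper names `load_splits` and `row_count_mismatches` are illustrative and not part of the card; the expected row counts are copied from the `dataset_info` metadata.

```python
REPO_ID = "reubenjohn/stackoverflow-unified-text-open-status-classification-sample"

# Row counts per split, copied from the dataset_info metadata.
EXPECTED_ROWS = {"train": 112217, "valid": 22443, "test": 22443}


def load_splits():
    """Download all splits (needs network access and `pip install datasets`)."""
    # Imported lazily so this module can be loaded without the library present.
    from datasets import load_dataset

    return load_dataset(REPO_ID)


def row_count_mismatches(actual_rows):
    """Return {split: (expected, actual)} for every split whose size differs."""
    return {
        split: (expected, actual_rows.get(split))
        for split, expected in EXPECTED_ROWS.items()
        if actual_rows.get(split) != expected
    }


if __name__ == "__main__":
    splits = load_splits()
    actual = {name: ds.num_rows for name, ds in splits.items()}
    print(row_count_mismatches(actual) or "row counts match the card")
```

Comparing the downloaded row counts against the card is a cheap way to notice a partial download or a later revision of the repository.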
Provided by:
reubenjohn
Summary of original information
Dataset overview
Dataset name
"stackoverflow-open-status-classification"
Dataset features
- PostId: integer (int64)
- PostCreationDate: string
- OwnerUserId: integer (int64)
- OwnerCreationDate: string
- ReputationAtPostCreation: integer (int64)
- OwnerUndeletedAnswerCountAtPostTime: integer (int64)
- Title: string
- BodyMarkdown: string
- Tag1 to Tag5: string
- PostClosedDate: string
- OpenStatus: string
- unified_texts: string
- OpenStatus_id: integer (int64)
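
The 17 columns can be mirrored as a small schema check. This is a sketch, not something the dataset ships: `SCHEMA` and `schema_errors` are hypothetical names, and accepting `None` for any column is an assumption, since the card does not say whether fields such as the later tags or `PostClosedDate` can be empty.

```python
# Column name -> Python type, mirroring the dtypes on the card
# (int64 -> int, string -> str).
SCHEMA = {
    "PostId": int,
    "PostCreationDate": str,
    "OwnerUserId": int,
    "OwnerCreationDate": str,
    "ReputationAtPostCreation": int,
    "OwnerUndeletedAnswerCountAtPostTime": int,
    "Title": str,
    "BodyMarkdown": str,
    "Tag1": str,
    "Tag2": str,
    "Tag3": str,
    "Tag4": str,
    "Tag5": str,
    "PostClosedDate": str,
    "OpenStatus": str,
    "unified_texts": str,
    "OpenStatus_id": int,
}


def schema_errors(record, allow_none=True):
    """Return a list of human-readable problems found in one record (a dict)."""
    errors = []
    for column, expected in SCHEMA.items():
        if column not in record:
            errors.append(f"missing column: {column}")
            continue
        value = record[column]
        # Assumption: missing values are represented as None and are acceptable.
        if value is None and allow_none:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


if __name__ == "__main__":
    # Placeholder record with empty values, used only to exercise the check.
    placeholder = {col: (0 if typ is int else "") for col, typ in SCHEMA.items()}
    print(schema_errors(placeholder))
```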
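Dataset splits
- Training set (train):
  - Size: 216256197 bytes
  - Number of examples: 112217
- Validation set (valid):
  - Size: 43398940 bytes
  - Number of examples: 22443
- Test set (test):
  - Size: 43398940 bytes
  - Number of examples: 22443
Dataset size
- Download size: 163036345 bytes
- Total dataset size: 303054077 bytes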
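
The split figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed on the card; the variable and function names are illustrative. It shows that the three splits add up to 157103 examples, that their byte sizes sum to the stated total dataset size, and that the split is roughly 71% / 14% / 14%.

```python
# Figures copied from the split and size listings on the card.
SPLITS = {
    "train": {"num_bytes": 216256197, "num_examples": 112217},
    "valid": {"num_bytes": 43398940, "num_examples": 22443},
    "test": {"num_bytes": 43398940, "num_examples": 22443},
}
DOWNLOAD_SIZE = 163036345
DATASET_SIZE = 303054077

total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())


def split_fractions():
    """Share of all examples that falls into each split."""
    return {name: s["num_examples"] / total_examples for name, s in SPLITS.items()}


if __name__ == "__main__":
    print("total examples:", total_examples)
    print("bytes match dataset_size:", total_bytes == DATASET_SIZE)
    for name, fraction in split_fractions().items():
        print(f"{name}: {fraction:.1%}")
    # The download is smaller than the dataset size; this ratio shows by how much.
    print(f"download / dataset size: {DOWNLOAD_SIZE / DATASET_SIZE:.2f}")
```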



