j-chim/pii-pile-chunk3-200000-250000-tagged
Source: Hugging Face | Updated: 2023-01-22 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/j-chim/pii-pile-chunk3-200000-250000-tagged
Resource description:
---
dataset_info:
  features:
  - name: texts
    sequence: string
  - name: meta
    struct:
    - name: pile_set_name
      dtype: string
  - name: scores
    sequence: float64
  - name: avg_score
    dtype: float64
  - name: num_sents
    dtype: int64
  - name: tagged_pii_results
    list:
    - name: analysis_explanation
      dtype: 'null'
    - name: end
      dtype: int64
    - name: entity_type
      dtype: string
    - name: recognition_metadata
      struct:
      - name: recognizer_identifier
        dtype: string
      - name: recognizer_name
        dtype: string
    - name: score
      dtype: float64
    - name: start
      dtype: int64
  splits:
  - name: train
    num_bytes: 508200278
    num_examples: 49999
  download_size: 194434096
  dataset_size: 508200278
---
# Dataset Card for "pii-pile-chunk3-200000-250000-tagged"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
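
Since the card itself gives no usage instructions, here is a minimal sketch of how the train split might be read with the Hugging Face `datasets` library. The repository ID is taken from the card title; the helper name `first_record` is hypothetical, and the sketch assumes the repository is still publicly reachable and that `datasets` is installed.

```python
REPO_ID = "j-chim/pii-pile-chunk3-200000-250000-tagged"  # from the card title


def first_record(repo_id: str = REPO_ID) -> dict:
    """Stream the train split and return its first record (hypothetical helper)."""
    # Imported lazily so this module still loads when `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True iterates over the remote files instead of
    # downloading the whole split before the first record is available.
    stream = load_dataset(repo_id, split="train", streaming=True)
    return next(iter(stream))


if __name__ == "__main__":
    record = first_record()
    # Expect the top-level feature names listed in the metadata above.
    print(sorted(record.keys()))
```

Streaming is used here only to keep the first read cheap; dropping `streaming=True` would download and cache the full split instead.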
Provided by:
j-chim
Summary of original information
Dataset overview
Dataset name
- pii-pile-chunk3-200000-250000-tagged
Dataset features
- texts: sequence of strings
- meta: struct containing:
  - pile_set_name: string
- scores: sequence of floats (float64)
- avg_score: float (float64)
- num_sents: integer (int64)
- tagged_pii_results: list whose items contain:
  - analysis_explanation: null
  - end: integer (int64)
  - entity_type: string
  - recognition_metadata: struct containing:
    - recognizer_identifier: string
    - recognizer_name: string
  - score: float (float64)
  - start: integer (int64)
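
To illustrate how the `tagged_pii_results` fields fit together, the following sketch pulls the tagged substrings out of a piece of text. It rests on assumptions the card does not confirm: that `start` and `end` are character offsets with an exclusive `end`, and that they index into a single string. The card does not say which string the offsets refer to (one element of `texts`, or the elements joined together), so the text is passed in explicitly. The function name `pii_spans` and the sample record are hypothetical and are not taken from the dataset.

```python
from typing import Any


def pii_spans(
    text: str,
    results: list[dict[str, Any]],
    min_score: float = 0.0,
) -> list[tuple[str, str, float]]:
    """Return (entity_type, matched_text, score) for each tagged span.

    Assumes `start`/`end` are character offsets into `text`, end-exclusive.
    """
    spans = []
    for result in results:
        if result["score"] < min_score:
            continue  # skip low-confidence detections
        matched = text[result["start"]:result["end"]]
        spans.append((result["entity_type"], matched, result["score"]))
    return spans


# Hypothetical record fragment, shaped like the feature list above.
sample_text = "Contact Jane Doe at jane@example.com."
sample_results = [
    {
        "analysis_explanation": None,
        "end": 16,
        "entity_type": "PERSON",
        "recognition_metadata": {
            "recognizer_identifier": "demo-1",
            "recognizer_name": "DemoRecognizer",
        },
        "score": 0.85,
        "start": 8,
    },
    {
        "analysis_explanation": None,
        "end": 36,
        "entity_type": "EMAIL_ADDRESS",
        "recognition_metadata": {
            "recognizer_identifier": "demo-2",
            "recognizer_name": "DemoRecognizer",
        },
        "score": 1.0,
        "start": 20,
    },
]

for entity_type, matched, score in pii_spans(sample_text, sample_results):
    print(f"{entity_type}: {matched!r} (score={score})")
```

Passing `min_score=0.9` to the same call would keep only the second span, which is one way to filter on the per-result `score` field.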
Dataset splits
- train:
  - num_bytes: 508200278 bytes
  - num_examples: 49999 examples
Dataset size
- download_size: 194434096 bytes
- dataset_size: 508200278 bytes
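
The size figures above can be put in more readable terms with a little arithmetic. The sketch below uses only the numbers stated in the metadata; the variable names are illustrative, and reading the download-to-dataset ratio as a compression ratio is an assumption about how the files are stored.

```python
# Figures copied from the dataset metadata.
NUM_BYTES = 508_200_278       # train split size in bytes
NUM_EXAMPLES = 49_999         # train split row count
DOWNLOAD_SIZE = 194_434_096   # bytes transferred on download
DATASET_SIZE = 508_200_278    # bytes once loaded

MIB = 1024 ** 2  # bytes per mebibyte

# Average uncompressed size of one example.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the loaded size that actually has to be downloaded.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"dataset size:  {DATASET_SIZE / MIB:.1f} MiB")
print(f"download size: {DOWNLOAD_SIZE / MIB:.1f} MiB")
print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"download / dataset ratio:  {download_ratio:.2f}")
```

Because the train split is the only split listed, its `num_bytes` equals `dataset_size`.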



