Nachiket-S/train_dataset

Name: Nachiket-S/train_dataset
Creator: Nachiket-S
Published: 2024-11-30 19:32:28
License: No description available

Source: Hugging Face. Updated 2024-11-30. Indexed 2024-12-14.

Download link:

https://hf-mirror.com/datasets/Nachiket-S/train_dataset


Description:

The dataset contains many feature fields: text data, gender descriptions, preference descriptions, nouns and their plural forms, noun phrases, input IDs, attention masks, text content, templates, a flag for whether an item is first-turn only, a flag for whether an item must be a noun, unnamed fields, "more" sentences, "less" sentences, stereotypes and anti-stereotypes, bias types, annotations, anonymized authors, anonymized annotators, biased text, biased profane words, debiased text, context, and sentences, among others. The dataset provides a training split of 463,843 samples, with a total size of 221,609,906 bytes and a download size of 42,622,588 bytes.
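To put the size figures above in perspective, here is a minimal Python sketch that derives the average sample size and the download-to-dataset size ratio from the three numbers stated in the description. The function name `summarize_split` and its parameter names are hypothetical, chosen only for illustration; nothing here queries Hugging Face or inspects the actual data.

```python
def summarize_split(num_examples: int, dataset_bytes: int, download_bytes: int) -> dict:
    """Derive simple size statistics from a split's reported figures.

    All three arguments are taken as given (here, from the listing text);
    this helper is illustrative and not part of any library.
    """
    if num_examples <= 0 or dataset_bytes <= 0:
        raise ValueError("num_examples and dataset_bytes must be positive")
    return {
        # Mean size of one sample once the data is materialized on disk.
        "avg_bytes_per_example": dataset_bytes / num_examples,
        # Fraction of the full size that actually has to be downloaded.
        "download_ratio": download_bytes / dataset_bytes,
        # Sizes in mebibytes (1 MiB = 1024 * 1024 bytes) for readability.
        "dataset_mib": dataset_bytes / (1024 * 1024),
        "download_mib": download_bytes / (1024 * 1024),
    }


# Figures quoted in the description of the training split.
stats = summarize_split(
    num_examples=463_843,
    dataset_bytes=221_609_906,
    download_bytes=42_622_588,
)

for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the download is a little under one fifth of the full dataset size, and the average sample is just under 500 bytes, which suggests that most samples are short texts with most of the listed fields empty or small.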

Provider:

Nachiket-S
