marcbishara/sarcasm-on-reddit
Source: Hugging Face. Updated 2025-11-30. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/marcbishara/sarcasm-on-reddit
Resource description:
---
dataset_info:
  features:
  - name: label
    dtype:
      class_label:
        names:
          '0': not_sarcastic
          '1': sarcastic
  - name: comment
    dtype: string
  - name: author
    dtype: string
  - name: subreddit
    dtype: string
  - name: score
    dtype: int64
  - name: ups
    dtype: int64
  - name: downs
    dtype: int64
  - name: date
    dtype: string
  - name: created_utc
    dtype: string
  - name: parent_comment
    dtype: string
  splits:
  - name: holdout
    num_bytes: 29524499.432699595
    num_examples: 101083
  - name: sft_train
    num_bytes: 79715535.09661603
    num_examples: 272922
  - name: sft_validation
    num_bytes: 8857379.037984777
    num_examples: 30325
  - name: reward_train
    num_bytes: 79715535.09661603
    num_examples: 272922
  - name: reward_validation
    num_bytes: 8857379.037984777
    num_examples: 30325
  - name: ppo_train
    num_bytes: 79716119.260114
    num_examples: 272924
  - name: ppo_validation
    num_bytes: 8857379.037984777
    num_examples: 30325
  download_size: 182555619
  dataset_size: 295243826.0
configs:
- config_name: default
  data_files:
  - split: holdout
    path: data/holdout-*
  - split: sft_train
    path: data/sft_train-*
  - split: sft_validation
    path: data/sft_validation-*
  - split: reward_train
    path: data/reward_train-*
  - split: reward_validation
    path: data/reward_validation-*
  - split: ppo_train
    path: data/ppo_train-*
  - split: ppo_validation
    path: data/ppo_validation-*
---
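As a rough illustration of how the metadata above might be used, the sketch below tallies the split sizes and wraps a call to `load_dataset` from the Hugging Face `datasets` library. The helper names (`split_fractions`, `load_split`) and the two constants are hypothetical and not part of the dataset. The repository id is taken from the page title, and loading assumes `datasets` is installed and the Hub is reachable.

```python
# Split sizes copied from the dataset_info metadata above.
SPLIT_EXAMPLES = {
    "holdout": 101083,
    "sft_train": 272922,
    "sft_validation": 30325,
    "reward_train": 272922,
    "reward_validation": 30325,
    "ppo_train": 272924,
    "ppo_validation": 30325,
}

# Label ids and names copied from the class_label feature above.
LABEL_NAMES = {0: "not_sarcastic", 1: "sarcastic"}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_split(split="sft_train"):
    """Load one split from the Hub (hypothetical helper; needs `datasets` and network access)."""
    if split not in SPLIT_EXAMPLES:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the helpers above still work without the library installed.
    from datasets import load_dataset
    return load_dataset("marcbishara/sarcasm-on-reddit", split=split)


if __name__ == "__main__":
    total = sum(SPLIT_EXAMPLES.values())
    print(f"total examples: {total}")
    for name, share in split_fractions(SPLIT_EXAMPLES).items():
        print(f"{name:>18}: {share:.1%}")
```

Each training split is about nine times the size of its validation split, and the seven splits together account for 1,010,826 examples. Calling `load_split("holdout")` would be one way to fetch only the held-out portion.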
Copied from: Sarcasm on Reddit, https://www.kaggle.com/datasets/danofer/sarcasm
That dataset was in turn derived from:
@unpublished{SARC,
  author={Mikhail Khodak and Nikunj Saunshi and Kiran Vodrahalli},
  title={A Large Self-Annotated Corpus for Sarcasm},
  url={https://arxiv.org/abs/1704.05579},
  year=2017
}
---
license: mit
language:
- en
---
Provider:
marcbishara



