sohomghosh/Generator-Guided-Crowd-Reaction-Assessment
Source: Hugging Face. Updated 2024-06-05; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/sohomghosh/Generator-Guided-Crowd-Reaction-Assessment
Resource description:
---
license: cc-by-nc-sa-4.0
task_categories:
- text-classification
language:
- en
tags:
- finance
size_categories:
- 1K<n<10K
---
About Dataset
In the realm of social media, understanding and predicting post reach is a significant challenge. Our paper presents a Crowd Reaction AssessMent (CReAM) task designed to estimate whether a given social media post will receive more reactions than another, a task that is particularly important for digital marketers and content writers. We introduce the Crowd Reaction Estimation Dataset (CRED), consisting of pairs of tweets from The White House with comparative measures of retweet count.
Column Description
URL_x - URL for the first tweet
Tweet_id_x - Tweet id of the first tweet
Datetime_x - Timestamp when the first tweet was posted
cleaned_tweet_text_x - cleaned text of the first tweet; the content itself has been removed to comply with Twitter's terms & conditions
retweet_count_x - retweet count of the first tweet
claude_x_cleaned - response when Claude is prompted with the first tweet
chatgpt_x_cleaned - response when ChatGPT is prompted with the first tweet
flanul2_x - response when FLAN-UL2 is prompted with the first tweet
URL_y - URL for the second tweet
Tweet_id_y - Tweet id of the second tweet
Datetime_y - Timestamp when the second tweet was posted
cleaned_tweet_text_y - cleaned text of the second tweet; the content itself has been removed to comply with Twitter's terms & conditions
retweet_count_y - retweet count of the second tweet
claude_y_cleaned - response when Claude is prompted with the second tweet
chatgpt_y_cleaned - response when ChatGPT is prompted with the second tweet
flanul2_y - response when FLAN-UL2 is prompted with the second tweet
category - category of the tweets
retweet_count_x_more_y - 1 if the retweet count of the first tweet is greater than that of the second tweet; otherwise 0
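The relationship between the two retweet counts and the `retweet_count_x_more_y` label can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The helper names `retweet_label` and `mismatched_rows` are hypothetical, the sample rows are invented for the example and not taken from CRED, and mapping ties to 0 is an assumption based on the "otherwise 0" wording of the column description.

```python
from typing import Iterable, List, Mapping


def retweet_label(row: Mapping[str, int]) -> int:
    """Return 1 if the first tweet has strictly more retweets than the second, else 0."""
    return 1 if row["retweet_count_x"] > row["retweet_count_y"] else 0


def mismatched_rows(rows: Iterable[Mapping[str, int]]) -> List[int]:
    """Return the indices of rows whose stored label differs from the recomputed one."""
    return [
        i
        for i, row in enumerate(rows)
        if int(row["retweet_count_x_more_y"]) != retweet_label(row)
    ]


# Hypothetical rows for illustration only; these values are not from CRED.
sample_rows = [
    {"retweet_count_x": 120, "retweet_count_y": 45, "retweet_count_x_more_y": 1},
    {"retweet_count_x": 30, "retweet_count_y": 300, "retweet_count_x_more_y": 0},
    # Assumption: a tie maps to 0, following the "otherwise 0" rule.
    {"retweet_count_x": 50, "retweet_count_y": 50, "retweet_count_x_more_y": 0},
]

labels = [retweet_label(row) for row in sample_rows]
bad = mismatched_rows(sample_rows)
print(labels, bad)
```

A check like this can be run after loading the data into dictionaries or a data frame, to confirm that the stored label agrees with the raw counts before training a classifier on it.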
```bibtex
@inproceedings{10.1145/3589335.3651512,
author = {Ghosh, Sohom and Chen, Chung-Chi and Naskar, Sudip Kumar},
title = {Generator-Guided Crowd Reaction Assessment},
year = {2024},
isbn = {9798400701726},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3589335.3651512},
doi = {10.1145/3589335.3651512},
abstract = {In the realm of social media, understanding and predicting post reach is a significant challenge. This paper presents a Crowd Reaction AssessMent (CReAM) task designed to estimate if a given social media post will receive more reaction than another, a particularly essential task for digital marketers and content writers. We introduce the Crowd Reaction Estimation Dataset (CRED), consisting of pairs of tweets from The White House with comparative measures of retweet count. The proposed Generator-Guided Estimation Approach (GGEA) leverages generative Large Language Models (LLMs), such as ChatGPT, FLAN-UL2, and Claude, to guide classification models for making better predictions. Our results reveal that a fine-tuned FLANG-RoBERTa model, utilizing a cross-encoder architecture with tweet content and responses generated by Claude, performs optimally. We further use a T5-based paraphraser to generate paraphrases of a given post and demonstrate GGEA's ability to predict which post will elicit the most reactions. We believe this novel application of LLMs provides a significant advancement in predicting social media post reach.},
booktitle = {Companion Proceedings of the ACM on Web Conference 2024},
pages = {597–600},
numpages = {4},
keywords = {crowd reaction assessment, large language models, natural language processing, social media},
location = {Singapore, Singapore},
series = {WWW '24}
}
```
Provided by:
sohomghosh
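Summary of original information
Dataset overview
Basic information
- License: cc-by-nc-sa-4.0
- Task category: text classification
- Language: English
- Tags: finance
- Dataset size: 1K<n<10K
Dataset description
- Name: Crowd Reaction Estimation Dataset (CRED)
- Purpose: assess and predict the volume of reactions to social media posts (tweets in particular) by comparing which of two posts will receive more reactions.
- Content: pairs of tweets from The White House, each pair with a comparative measure of retweet count.
Dataset structure
- Column descriptions:
  - URL_x, Tweet_id_x, Datetime_x: identifying information for the first tweet
  - cleaned_tweet_text_x, retweet_count_x: text content and retweet count of the first tweet
  - claude_x_cleaned, chatgpt_x_cleaned, flanul2_x: responses of the different models to the first tweet
  - URL_y, Tweet_id_y, Datetime_y: identifying information for the second tweet
  - cleaned_tweet_text_y, retweet_count_y: text content and retweet count of the second tweet
  - claude_y_cleaned, chatgpt_y_cleaned, flanul2_y: responses of the different models to the second tweet
  - category: category of the tweets
  - retweet_count_x_more_y: indicates whether the first tweet's retweet count is greater than the second's (1 for yes, 0 for no)
Citation information
- Authors: Ghosh, Sohom and Chen, Chung-Chi and Naskar, Sudip Kumar
- Title: Generator-Guided Crowd Reaction Assessment
- Year: 2024
- Publisher: Association for Computing Machinery
- Venue: Companion Proceedings of the ACM on Web Conference 2024
- Pages: 597–600
- Keywords: crowd reaction assessment, large language models, natural language processing, social media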



