sohomghosh/Generator-Guided-Crowd-Reaction-Assessment

Name: sohomghosh/Generator-Guided-Crowd-Reaction-Assessment
Creator: sohomghosh
Published: 2024-06-05 15:28:41
License: cc-by-nc-sa-4.0

Hugging Face | Updated 2024-06-05 | Indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/sohomghosh/Generator-Guided-Crowd-Reaction-Assessment
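A minimal sketch of loading the dataset from this link with the Hugging Face `datasets` library. The repository id is read off the URL above; the helper name `load_cred` is hypothetical, and the split names and file layout of the repository are not stated on this page, so the call may need extra arguments. Setting `HF_ENDPOINT` is the usual way to route `huggingface_hub` traffic through a mirror, and it has to be set before the library is first imported.

```python
import os

# Assumption: repository id taken from the download URL on this page.
REPO_ID = "sohomghosh/Generator-Guided-Crowd-Reaction-Assessment"
MIRROR_ENDPOINT = "https://hf-mirror.com"


def load_cred(use_mirror: bool = False):
    """Load the dataset with `datasets.load_dataset` (hypothetical helper).

    Returns whatever splits the repository defines; none are assumed here.
    """
    if use_mirror:
        # Must be set before huggingface_hub is imported for it to take effect.
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    # Imported lazily so the endpoint setting above is picked up.
    from datasets import load_dataset
    return load_dataset(REPO_ID)
```

Calling `load_cred(use_mirror=True)` requires network access and the `datasets` package; printing the result shows the available splits and columns.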

Resource description:

---
license: cc-by-nc-sa-4.0
task_categories:
- text-classification
language:
- en
tags:
- finance
size_categories:
- 1K<n<10K
---

About Dataset

In the realm of social media, understanding and predicting post reach is a significant challenge. Our paper presents a Crowd Reaction AssessMent (CReAM) task designed to estimate whether a given social media post will receive more reactions than another, a particularly essential task for digital marketers and content writers. We introduce the Crowd Reaction Estimation Dataset (CRED), consisting of pairs of tweets from The White House with comparative measures of retweet count.

Column Description

- URL_x: URL of the first tweet
- Tweet_id_x: tweet id of the first tweet
- Datetime_x: timestamp when the first tweet was posted
- cleaned_tweet_text_x: cleaned version of the first tweet. The content has been removed to comply with Twitter's terms & conditions
- retweet_count_x: retweet count of the first tweet
- claude_x_cleaned: response when Claude is prompted with the first tweet
- chatgpt_x_cleaned: response when ChatGPT is prompted with the first tweet
- flanul2_x: response when FLAN-UL2 is prompted with the first tweet
- URL_y: URL of the second tweet
- Tweet_id_y: tweet id of the second tweet
- Datetime_y: timestamp when the second tweet was posted
- cleaned_tweet_text_y: cleaned version of the second tweet. The content has been removed to comply with Twitter's terms & conditions
- retweet_count_y: retweet count of the second tweet
- claude_y_cleaned: response when Claude is prompted with the second tweet
- chatgpt_y_cleaned: response when ChatGPT is prompted with the second tweet
- flanul2_y: response when FLAN-UL2 is prompted with the second tweet
- category: category of the tweets
- retweet_count_x_more_y: 1 if the retweet count of the first tweet is greater than that of the second; otherwise 0

```bibtex
@inproceedings{10.1145/3589335.3651512,
  author = {Ghosh, Sohom and Chen, Chung-Chi and Naskar, Sudip Kumar},
  title = {Generator-Guided Crowd Reaction Assessment},
  year = {2024},
  isbn = {9798400701726},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3589335.3651512},
  doi = {10.1145/3589335.3651512},
  abstract = {In the realm of social media, understanding and predicting post reach is a significant challenge. This paper presents a Crowd Reaction AssessMent (CReAM) task designed to estimate if a given social media post will receive more reaction than another, a particularly essential task for digital marketers and content writers. We introduce the Crowd Reaction Estimation Dataset (CRED), consisting of pairs of tweets from The White House with comparative measures of retweet count. The proposed Generator-Guided Estimation Approach (GGEA) leverages generative Large Language Models (LLMs), such as ChatGPT, FLAN-UL2, and Claude, to guide classification models for making better predictions. Our results reveal that a fine-tuned FLANG-RoBERTa model, utilizing a cross-encoder architecture with tweet content and responses generated by Claude, performs optimally. We further use a T5-based paraphraser to generate paraphrases of a given post and demonstrate GGEA's ability to predict which post will elicit the most reactions. We believe this novel application of LLMs provides a significant advancement in predicting social media post reach.},
  booktitle = {Companion Proceedings of the ACM on Web Conference 2024},
  pages = {597--600},
  numpages = {4},
  keywords = {crowd reaction assessment, large language models, natural language processing, social media},
  location = {Singapore, Singapore},
  series = {WWW '24}
}
```
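The abstract mentions a cross-encoder that sees tweet content together with generator responses. The sketch below shows one way such a text pair might be assembled from a row, using the column names from the column description. The separator string, the helper names, and the way text and response are joined are all assumptions for illustration; the paper's actual input format is not given on this page. Since the cleaned tweet text has been removed from the released data, the sketch tolerates a missing or empty tweet field.

```python
from typing import Mapping, Tuple

# Hypothetical separator between tweet text and generator response.
SEP = " [RESPONSE] "

# Column name patterns as listed in the column description.
RESPONSE_COLUMNS = {
    "claude": "claude_{side}_cleaned",
    "chatgpt": "chatgpt_{side}_cleaned",
    "flanul2": "flanul2_{side}",
}


def side_text(row: Mapping[str, str], side: str, generator: str = "claude") -> str:
    """Join one tweet's text (if present) with a generator's response to it."""
    response = row[RESPONSE_COLUMNS[generator].format(side=side)]
    tweet = row.get(f"cleaned_tweet_text_{side}") or ""
    return f"{tweet}{SEP}{response}".strip()


def cross_encoder_pair(row: Mapping[str, str], generator: str = "claude") -> Tuple[str, str]:
    """Return (first-tweet text, second-tweet text) for a sentence-pair encoder."""
    return side_text(row, "x", generator), side_text(row, "y", generator)


# Made-up row for illustration only; not a record from CRED.
toy_row = {
    "cleaned_tweet_text_x": "tweet A",
    "claude_x_cleaned": "reply A",
    "cleaned_tweet_text_y": "tweet B",
    "claude_y_cleaned": "reply B",
}
text_a, text_b = cross_encoder_pair(toy_row)
```

The two strings could then be passed to a sentence-pair tokenizer call such as `tokenizer(text_a, text_b, truncation=True)` in `transformers`, with `retweet_count_x_more_y` as the binary target.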

Provided by:

sohomghosh

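Summary of original information

Dataset overview

Basic information

License: cc-by-nc-sa-4.0
Task category: text classification
Language: English
Tags: finance
Dataset size: 1K<n<10K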

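Dataset description

Name: Crowd Reaction Estimation Dataset (CRED)
Purpose: Assess and predict the volume of reactions to social media posts (tweets in particular) by comparing which of two posts will receive more reactions.
Content: Pairs of tweets from The White House, each pair with a comparative measure of retweet count.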

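Dataset structure

Column description:
- URL_x, Tweet_id_x, Datetime_x: URL, id, and posting timestamp of the first tweet
- cleaned_tweet_text_x, retweet_count_x: text content and retweet count of the first tweet
- claude_x_cleaned, chatgpt_x_cleaned, flanul2_x: responses of the different models to the first tweet
- URL_y, Tweet_id_y, Datetime_y: URL, id, and posting timestamp of the second tweet
- cleaned_tweet_text_y, retweet_count_y: text content and retweet count of the second tweet
- claude_y_cleaned, chatgpt_y_cleaned, flanul2_y: responses of the different models to the second tweet
- category: category of the tweets
- retweet_count_x_more_y: whether the first tweet's retweet count is greater than the second's (1 = yes, 0 = no)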
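The label column can be recomputed from the two retweet counts, which gives a quick sanity check after loading. The sketch below works on plain dictionaries keyed by the column names listed above; the helper names and the toy rows are hypothetical. Following the column description, a tie counts as 0 because the first tweet does not have strictly more retweets.

```python
from typing import Any, Mapping


def expected_label(row: Mapping[str, Any]) -> int:
    """Return 1 if the first tweet has strictly more retweets than the second, else 0."""
    return int(int(row["retweet_count_x"]) > int(row["retweet_count_y"]))


def label_is_consistent(row: Mapping[str, Any]) -> bool:
    """Check the stored label against the one derived from the retweet counts."""
    return int(row["retweet_count_x_more_y"]) == expected_label(row)


# Made-up rows for illustration only; these are not records from CRED.
toy_rows = [
    {"retweet_count_x": 120, "retweet_count_y": 45, "retweet_count_x_more_y": 1},
    {"retweet_count_x": 10, "retweet_count_y": 300, "retweet_count_x_more_y": 0},
]
print([label_is_consistent(r) for r in toy_rows])  # prints [True, True]
```

The same two functions can be applied row by row to a loaded split to flag any record whose label disagrees with its counts.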

Citation information

Authors: Ghosh, Sohom and Chen, Chung-Chi and Naskar, Sudip Kumar
Title: Generator-Guided Crowd Reaction Assessment
Year: 2024
Publisher: Association for Computing Machinery
Venue: Companion Proceedings of the ACM on Web Conference 2024
Pages: 597–600
Keywords: crowd reaction assessment, large language models, natural language processing, social media
