Rewritten Media Bias News Headlines
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11162893
Description:
This dataset was used in the paper "Rewriting Bias: Mitigating Media Bias in News Recommender Systems through Automated Rewriting" by Qin Ruan, Jin Xu, Susan Leavy, Brian Mac Namee, and Ruihai Dong, presented at the 32nd ACM Conference on User Modeling, Adaptation and Personalization (UMAP'24), July 1-4, 2024, in Cagliari, Italy (ACM, New York, NY, USA; DOI: 10.1145/3627043.3659541).
The dataset includes seven distinct rewritten versions of news headlines generated using the following methods proposed in the paper: RADJ, RADV, RNOUN, RVERB, RALL, RG3.5, and RG4.0.
The rewriting approaches fall into two main categories: Word Replacement approaches and Large Language Model approaches.
1. Sentence Rewriting Using Word Replacement:
Replace Adjectives (RADJ): This approach replaces adjectives identified as contributing to bias with neutral or opposite ones.
Replace Adverbs (RADV): This approach replaces adverbs identified as contributing to bias with neutral or opposite ones.
Replace Nouns (RNOUN): This approach replaces nouns identified as contributing to bias with neutral or more general ones.
Replace Verbs (RVERB): This approach replaces verbs identified as contributing to bias with neutral or more factual ones.
Replace All (RALL): This is a combination of all the above approaches.
2. Sentence Rewriting Using Large Language Models:
Sentence Rewriting using GPT-3.5 (RG3.5): This method rephrases sentences using GPT-3.5 based on a specific prompt.
Sentence Rewriting using GPT-4.0 (RG4.0): This method rephrases sentences using GPT-4.0 based on a specific prompt.
The dataset consists of original news headlines and their corresponding rewritten versions produced by each of the seven approaches mentioned above. Each rewritten version aims to reduce bias while maintaining the original meaning.
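To make the word replacement idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the lexicon below is hypothetical and hand-written for illustration, and the sketch does no part-of-speech tagging or bias detection, which the RADJ, RADV, RNOUN, and RVERB approaches would need in order to target one word class at a time.

```python
import re

# Hypothetical toy lexicon, invented for this example.
# It is NOT taken from the paper or from the dataset.
TOY_LEXICON = {
    "slams": "criticizes",
    "shocking": "unexpected",
    "brutally": "strongly",
}


def replace_words(headline, lexicon):
    """Swap each whole word found in `lexicon` for its more neutral counterpart."""

    def swap(match):
        word = match.group(0)
        neutral = lexicon.get(word.lower())
        if neutral is None:
            return word  # word is not in the lexicon, keep it unchanged
        # Preserve a leading capital letter from the original word.
        return neutral.capitalize() if word[0].isupper() else neutral

    return re.sub(r"[A-Za-z']+", swap, headline)
```

For example, `replace_words("Senator slams shocking proposal", TOY_LEXICON)` yields "Senator criticizes unexpected proposal". A Replace All (RALL) style rewrite would correspond to applying every word-class lexicon in one pass.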
Dataset Structure:
The dataset consists of a single file with the following columns:
NewsId: A unique identifier for each news article.
Original: The original headline of the news article.
RG3.5: The headline rewritten using the GPT-3.5 model.
RG4.0: The headline rewritten using the GPT-4.0 model.
RADJ: The headline with adjectives replaced to reduce bias.
RADV: The headline with adverbs replaced to reduce bias.
RVERB: The headline with verbs replaced to reduce bias.
RNOUN: The headline with nouns replaced to reduce bias.
RALL: The headline with adjectives, adverbs, verbs, and nouns replaced to reduce bias.
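The column layout above can be checked when loading the file. The following sketch assumes the file is a comma-separated text file with a header row; the listing does not state the file format or name, so both are assumptions, and `read_headlines` is a helper name introduced here for illustration.

```python
import csv

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "NewsId", "Original", "RG3.5", "RG4.0",
    "RADJ", "RADV", "RVERB", "RNOUN", "RALL",
]


def read_headlines(handle):
    """Read all rows from an open CSV handle and check the documented columns exist."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Usage (the file name is hypothetical; substitute the file downloaded from Zenodo):
# with open("rewritten_headlines.csv", newline="", encoding="utf-8") as f:
#     rows = read_headlines(f)
```

Each returned row is a dictionary keyed by column name, so `rows[0]["RG4.0"]` would give the GPT-4.0 rewrite of the first headline.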
Usage:
This dataset can be used to study the impact of different sentence rewriting approaches on reducing bias in news headlines. It is also suitable for researchers interested in natural language processing, media bias reduction, and the application of large language models in generative AI.
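As one possible starting point for such a study, the sketch below compares how strongly each approach alters the original headline. It assumes rows are dictionaries keyed by the column names listed under Dataset Structure; the function name `rewrite_stats` and both metrics are illustrative choices, not measures from the paper. It quantifies surface change only and says nothing about bias itself.

```python
from difflib import SequenceMatcher

# The seven rewriting approaches, named as in the dataset columns.
METHODS = ["RADJ", "RADV", "RNOUN", "RVERB", "RALL", "RG3.5", "RG4.0"]


def rewrite_stats(rows):
    """Per method: share of headlines that changed and mean token-level similarity."""
    stats = {}
    total = len(rows)
    for method in METHODS:
        changed = 0
        similarity_sum = 0.0
        for row in rows:
            original = row["Original"].split()
            rewritten = row[method].split()
            if original != rewritten:
                changed += 1
            # ratio() is 1.0 for identical token sequences, lower as they diverge.
            similarity_sum += SequenceMatcher(None, original, rewritten).ratio()
        stats[method] = {
            "changed_share": changed / total if total else 0.0,
            "mean_similarity": similarity_sum / total if total else 0.0,
        }
    return stats
```

A low `changed_share` for a word replacement column would suggest that few headlines contained words of that class flagged for replacement, while the similarity score gives a rough sense of how far the LLM rewrites depart from the original wording.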
For more information, please visit https://github.com/ruanqin0706/MediaBiasinNewsRec
Contact Information: qin.ruan@ucdconnect.ie
Created:
2024-07-02



