news-media-bias.data.json
DataCite Commons | Updated 2023-10-24 | Indexed 2024-08-18
Download link:
https://figshare.com/articles/dataset/news-media-bias_data_json/24422122/1
Description:
The prevalence of bias in the news media has become a critical issue, affecting public perception on a range of important topics such as political views, health, insurance, resource distribution, religion, race, age, gender, occupation, and climate change. The media has a moral responsibility to disseminate accurate information and to raise awareness of important issues and the risks associated with them. This highlights the need for a solution that can help mitigate the spread of false or misleading information and restore public trust in the media.

Data description
This is a dataset for news media bias covering several dimensions of bias: political, hate speech, toxicity, sexism, ageism, gender identity, gender discrimination, race/ethnicity, climate change, occupation, and spirituality, which makes it a unique contribution. The dataset does not contain any personally identifiable information (PII).

Data format:
- ID: Numeric unique identifier.
- Text: Main content.
- Dimension: Categorical descriptor of the text.
- Biased_Words: List of words considered biased.
- Aspect: Specific topic within the text.
- Label: Bias True/False value.
- Aggregate Label: Calculated through multiple weighted formulae.

Annotation scheme:
1. Bias Label: Indicate the presence/absence of bias (e.g., no bias, mild, strong).
2. Words/Phrases Level Biases: Identify specific biased words/phrases.
3. Subjective Bias (Aspect): Capture biases related to content aspects.

Annotation process:
Manual labeling -> LLM-based labeling -> semi-supervised learning -> human verification (iterative process)

The scheme combines manual labeling, GPT-based labeling, human verification, and semi-supervised learning to refine the annotations and improve their accuracy. We want to offer open and free access to the dataset, ensuring a wide reach to researchers and AI practitioners across the world. The dataset should be easy to use, and uploading and accessing the data should be straightforward.
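
As a rough illustration of the data format listed in the description, the sketch below parses a few records and counts the biased ones per dimension. It rests on several assumptions that the description does not confirm: that the file is a JSON array of objects, that the keys are spelled exactly as in the field list, and that Label is a boolean. The three sample records are hypothetical placeholders and are not taken from the dataset. Aggregate Label is left out because its formulae are not given.

```python
import json
from collections import Counter

# Hypothetical records that follow the field list in the description.
# The real key names and value types in news-media-bias.data.json may differ.
SAMPLE_JSON = """
[
  {"ID": 1, "Text": "Example sentence one.", "Dimension": "political",
   "Biased_Words": ["example"], "Aspect": "elections", "Label": true},
  {"ID": 2, "Text": "Example sentence two.", "Dimension": "ageism",
   "Biased_Words": [], "Aspect": "employment", "Label": false},
  {"ID": 3, "Text": "Example sentence three.", "Dimension": "political",
   "Biased_Words": ["sample"], "Aspect": "policy", "Label": true}
]
"""

# Keys assumed from the "Data format" list (Aggregate Label omitted).
REQUIRED_KEYS = {"ID", "Text", "Dimension", "Biased_Words", "Aspect", "Label"}


def load_records(raw):
    """Parse a JSON array of records and check that the assumed keys exist."""
    records = json.loads(raw)
    for rec in records:
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            raise ValueError(f"record {rec.get('ID')} is missing {sorted(missing)}")
    return records


def biased_counts_by_dimension(records):
    """Count records whose Label is truthy, grouped by the Dimension field."""
    return Counter(rec["Dimension"] for rec in records if rec["Label"])


records = load_records(SAMPLE_JSON)
counts = biased_counts_by_dimension(records)
print(dict(counts))
```

To try this on the downloaded file, the sample string could be replaced with the file's contents, for example `load_records(open("news-media-bias.data.json", encoding="utf-8").read())`. If the real file uses different key names or a graded label (no bias, mild, strong), as the annotation scheme suggests it might, `REQUIRED_KEYS` and the filter condition would need to be adjusted.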
Provider:
figshare
Created:
2023-10-23



