Bias detection in Polish journalism

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

https://doi.org/10.7910/DVN/WNR9KO

Description:

The dataset contains a list of articles gathered from various Polish media outlets on many topics. Each article has been judged for subjectivity by four LLMs, listed here by the key used in the project configuration, with the model repository, GGUF filename, and the author's description:

- bielik: repo speakleash/Bielik-11B-v3.0-Instruct-GGUF, file Bielik-11B-v3.0-Instruct.Q4_K_M.gguf. "Bielik 11B v3.0 - Specialized for Polish."
- llama3: repo QuantFactory/Meta-Llama-3-8B-Instruct-GGUF, file Meta-Llama-3-8B-Instruct.Q4_K_M.gguf. "Meta Llama 3 8B Instruct - Strong general capability."
- mistral: repo TheBloke/OpenHermes-2.5-Mistral-7B-GGUF, file openhermes-2.5-mistral-7b.Q4_K_M.gguf. "OpenHermes 2.5 (Mistral 7B) - Excellent instruction following."
- deepseek: repo bartowski/DeepSeek-R1-Distill-Llama-8B-GGUF, file DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf. "DeepSeek R1 Distill Llama 8B - Strong reasoning model."

The average review file contains every model's review (reasoning and score) for each article, together with a field indicating which model's score was closest to the average score. The individual files named in the modelname_trim pattern are the result of using that model as an LLM-as-a-judge to determine which model gave the best response.

A README and a regenerator.py script are provided to recreate the text used in the articles. Other scripts (training, the LLM server, and so on) can be found in the project's repo: https://github.com/JakubLegutko/SentimentDetector
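The "closest to the average" choice described above can be illustrated with a minimal Python sketch. It is not taken from the project's code: the function name, the score values, and the tie-breaking rule (first model in dictionary order wins) are all assumptions made for illustration, and the dataset's real field names and score scale may differ.

```python
from statistics import mean


def closest_to_average(scores: dict[str, float]) -> tuple[float, str]:
    """Return the mean score and the name of the model whose score is nearest to it.

    `scores` maps a model key (e.g. "bielik") to its subjectivity score.
    On a tie, the model that appears first in the dict is returned.
    """
    avg = mean(scores.values())
    closest = min(scores, key=lambda name: abs(scores[name] - avg))
    return avg, closest


# Hypothetical scores for one article; these values are invented for illustration.
example = {"bielik": 7.0, "llama3": 5.0, "mistral": 6.0, "deepseek": 2.0}

avg, model = closest_to_average(example)
print(f"average={avg}, closest model={model}")
```

With these invented scores the mean is 5.0, so llama3 is selected because its score matches the mean exactly.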

Created:

2026-02-23
