bermaneh/codeswitching-sentiment-bias-exp3-perception-v1

Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/bermaneh/codeswitching-sentiment-bias-exp3-perception-v1
Description:
This dataset is the output of Experiment 3, which tests how swapping the highest-SHAP word in a tweet between English and Spanish changes a large language model's (LLM's) perception of the tweet's author. The research question is whether this language swap shifts the LLM's perception of the author, and whether any such shift correlates with the NLP bias signals from the previous experiments. The dataset contains 3,221 complete rows and uses the meta-llama/Llama-3.3-70B-Instruct model. It includes the original and perturbed tweet sentences, the swapped word and its translation, the SHAP rank, the sentiment change, the LLM's ratings of the author (e.g., warmth, professionalism, credibility, aggression) and their changes, and codebook classifications of the original and perturbed descriptions (e.g., professional, educated, emotional, aggressive, cultural/ethnic reference, language reference).
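The rating changes and their relationship to sentiment change, as described above, could be examined along the following lines. This is only a minimal sketch: the listing names the kinds of fields but not their exact identifiers, so every column name here (`warmth_original`, `warmth_perturbed`, `sentiment_change`, and so on) is a hypothetical placeholder, and the toy rows are invented for illustration rather than taken from the dataset.

```python
from math import sqrt

# Hypothetical dimension and column names: the listing describes the fields
# but does not give their identifiers, so adapt these to the real schema.
RATING_DIMENSIONS = ["warmth", "professionalism", "credibility", "aggression"]


def rating_deltas(row):
    """Return the perturbed-minus-original rating for each dimension of one row."""
    return {
        dim: row[f"{dim}_perturbed"] - row[f"{dim}_original"]
        for dim in RATING_DIMENSIONS
    }


def mean_deltas(rows):
    """Average the per-row rating changes across all rows."""
    totals = {dim: 0.0 for dim in RATING_DIMENSIONS}
    for row in rows:
        for dim, delta in rating_deltas(row).items():
            totals[dim] += delta
    return {dim: total / len(rows) for dim, total in totals.items()}


def pearson(xs, ys):
    """Pearson correlation; returns None when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Toy rows invented for illustration only; they are NOT real dataset rows.
toy_rows = [
    {"sentiment_change": 0.2,
     "warmth_original": 3, "warmth_perturbed": 4,
     "professionalism_original": 4, "professionalism_perturbed": 4,
     "credibility_original": 3, "credibility_perturbed": 3,
     "aggression_original": 2, "aggression_perturbed": 1},
    {"sentiment_change": -0.1,
     "warmth_original": 4, "warmth_perturbed": 3,
     "professionalism_original": 3, "professionalism_perturbed": 2,
     "credibility_original": 4, "credibility_perturbed": 4,
     "aggression_original": 1, "aggression_perturbed": 2},
    {"sentiment_change": 0.0,
     "warmth_original": 2, "warmth_perturbed": 2,
     "professionalism_original": 5, "professionalism_perturbed": 5,
     "credibility_original": 2, "credibility_perturbed": 3,
     "aggression_original": 3, "aggression_perturbed": 3},
]

# Mean shift per rating dimension after the word swap.
average_shift = mean_deltas(toy_rows)

# Does the warmth shift move together with the sentiment change?
warmth_shift = [rating_deltas(row)["warmth"] for row in toy_rows]
sentiment_shift = [row["sentiment_change"] for row in toy_rows]
warmth_vs_sentiment = pearson(sentiment_shift, warmth_shift)

print(average_shift)
print(warmth_vs_sentiment)
```

To run this on the real data, the rows would first need to be loaded, for example with the Hugging Face `datasets` library via `load_dataset("bermaneh/codeswitching-sentiment-bias-exp3-perception-v1")`, and the placeholder column names replaced with the ones in the actual schema. The dataset may also already store the rating changes as their own columns, in which case `rating_deltas` would be unnecessary.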
Provider:
bermaneh