Generative AI Content Moderation Policies & Public Discussion Dataset
DataCite Commons | Updated 2025-06-06 | Indexed 2025-09-08
Download link:
https://figshare.com/articles/dataset/Generative_AI_Content_Moderation_Policies_Public_Discussion_Dataset/29257187
Description:
Introduction
This is the dataset for the USENIX Security '25 paper "I Cannot Write This Because It Violates Our Content Policy": Understanding Content Moderation Policies and User Experiences in Generative AI Products.

Structure and file overview
This dataset includes two parts:

[Policy Dataset] Folder /Study1_Policy_Analysis:
Folder /Policy_Dataset contains a collection of screenshots of content moderation policy pages from Generative AI online tools. All screenshots are in .pdf format and are sorted into three folders under this directory by page type. info.pdf lists the Generative AI online tools and the page URLs of the screenshots.
Folder /Coded_Policy_Segments contains the analysis of the policy dataset. coded_policy.xlsx contains the annotated policy segments. codebook_study1.pdf contains the codebook used for data analysis.

[Reddit Dataset] Folder /Study2_Reddit_Analysis:
Folder /Reddit_Dataset contains a collection of Reddit posts discussing content moderation in Generative AI online tools. All collected posts are in all_cleaned_posts.csv. In Folder ./By_Subreddits, the dataset is divided into seven .csv files, one for each subreddit from which posts were collected.
codebook_study2.pdf contains the codebook for the data analysis presented in the paper.
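
The per-subreddit split described above could be read with something like the following minimal sketch. It assumes the files are UTF-8 encoded CSVs with a header row; the function names `load_posts_by_subreddit` and `count_posts` are hypothetical, and no column names are assumed because the listing does not document them.

```python
import csv
from pathlib import Path


def load_posts_by_subreddit(by_subreddits_dir):
    """Read every .csv file in a directory into a dict keyed by file name stem.

    Assumption: each file has a header row and is UTF-8 encoded.
    """
    posts = {}
    for csv_path in sorted(Path(by_subreddits_dir).glob("*.csv")):
        # utf-8-sig tolerates an optional byte-order mark at the start of the file.
        with open(csv_path, newline="", encoding="utf-8-sig") as handle:
            posts[csv_path.stem] = list(csv.DictReader(handle))
    return posts


def count_posts(posts):
    """Return the number of rows loaded for each file."""
    return {name: len(rows) for name, rows in posts.items()}
```

The directory to pass in depends on how the archive unpacks; a path such as Study2_Reddit_Analysis/Reddit_Dataset/By_Subreddits is a guess at the nesting, not something the listing confirms. Comparing the summed counts against the row count of all_cleaned_posts.csv is one way to check that the split files were read completely.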
Provider:
figshare
Created:
2025-06-06



