Dataset used for ethical discussions detection
Source: DataCite Commons. Updated: 2026-05-03. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19651108
Description:
This database contains the input and output files used for the analysis of ethical discussions in GenAI open-source repositories hosted on GitHub. The files are organized into seven groups:
Lexicons (lexico*.csv)
Anonymized datasets by repository after the final lexicon expansion (*_crescimento.csv)
Anonymized lists of terms by repository throughout the lexicon expansion stages (*_base.csv, *_literatura-adhoc.csv, *_rsl-etico.csv)
Anonymized input files for BERTopic analysis before the final lexicon expansion (*-bertopic.csv)
Anonymized input files for BERTopic analysis after the final lexicon expansion (*_pre-expansao.csv)
BERTopic outputs before the final expansion (*_expansao*.csv)
BERTopic outputs after the final expansion (*_rsl*.csv)
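As a minimal sketch of how the file-name patterns above could be used to sort a downloaded copy of the archive, the Python snippet below maps a file name to a coarse group. The group labels, the `classify` helper, and the example file names in the usage note are hypothetical and are not part of the dataset; only the glob patterns come from the list above. For brevity the sketch merges the "before" and "after" BERTopic groups into one input label and one output label. Order matters, because a name ending in `_rsl-etico.csv` also matches `*_rsl*.csv`, so the more specific term-list patterns are checked first.

```python
from fnmatch import fnmatchcase

# Patterns are copied from the group list; labels are illustrative shorthand.
# Term-list patterns come before the BERTopic output patterns because
# "*_rsl-etico.csv" would otherwise also be caught by "*_rsl*.csv".
GROUPS = [
    ("lexicon", ["lexico*.csv"]),
    ("expanded_dataset", ["*_crescimento.csv"]),
    ("term_list", ["*_base.csv", "*_literatura-adhoc.csv", "*_rsl-etico.csv"]),
    ("bertopic_input", ["*-bertopic.csv", "*_pre-expansao.csv"]),
    ("bertopic_output", ["*_expansao*.csv", "*_rsl*.csv"]),
]


def classify(filename):
    """Return the label of the first group whose pattern matches, else None."""
    for label, patterns in GROUPS:
        if any(fnmatchcase(filename, pattern) for pattern in patterns):
            return label
    return None
```

For example, under these assumptions `classify("repoA_rsl-etico.csv")` yields `"term_list"`, while a name such as `"repoA_rsl_topics.csv"` falls through to `"bertopic_output"`. Files that match no pattern return `None`, which makes unexpected names easy to spot.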
Provider:
Zenodo
Created:
2026-05-03



