
ptaszynski/PolishCyberbullyingDataset

Source: Hugging Face. Updated: 2023-12-25. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ptaszynski/PolishCyberbullyingDataset
Resource description:
---
license: cc-by-4.0
language:
- pl
tags:
- cyberbullying
- hate-speech
pretty_name: PolishCyberbullyingDataset
---

# Expert-annotated dataset to study cyberbullying in Polish language

This is the first publicly available expert-annotated dataset containing annotations of cyberbullying and hate speech in the Polish language. Please read [the paper](https://www.mdpi.com/2306-5729/9/1/1) about the dataset for all necessary details.

## Model

The classification model that achieved the best classification results on the dataset is also released at the following URL.

[Polbert-CB - Polish BERT trained for Automatic Cyberbullying Detection](https://huggingface.co/ptaszynski/bert-base-polish-cyberbullying)

## Citations

Whenever you use the dataset, please cite [the paper](https://www.mdpi.com/2306-5729/9/1/1) using the following citation.

```
@article{ptaszynski2023expert,
  title={Expert-Annotated Dataset to Study Cyberbullying in Polish Language},
  author={Ptaszynski, Michal and Pieciukiewicz, Agata and Dybala, Pawel and Skrzek, Pawel and Soliwoda, Kamil and Fortuna, Marcin and Leliwa, Gniewosz and Wroczynski, Michal},
  journal={Data},
  volume={9},
  number={1},
  pages={1},
  year={2023},
  publisher={MDPI}
}
```

## Licences

The dataset is licensed under [CC BY 4.0](http://creativecommons.org/licenses/by/4.0/), the Creative Commons Attribution 4.0 International License.

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>

## Bundle

The whole bundle, containing (1) the old version of the dataset, (2) the current version of the dataset, and (3) the model trained on this dataset, can be found on [Zenodo](https://zenodo.org/records/7188178).

## Author

Michal Ptaszynski - contact me on:
- Twitter: [@mich_ptaszynski](https://twitter.com/mich_ptaszynski)
- GitHub: [ptaszynski](https://github.com/ptaszynski)
- LinkedIn: [michalptaszynski](https://jp.linkedin.com/in/michalptaszynski)
- HuggingFace: [ptaszynski](https://huggingface.co/ptaszynski)
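
## Usage sketch

The dataset and the released model described above can probably be loaded with the Hugging Face `datasets` and `transformers` libraries. The following is a minimal sketch, not the author's documented procedure. It assumes that both libraries are installed, that the dataset repository can be read by the generic `load_dataset` loader, and that the model repository exposes a standard text-classification head. The helper names `load_cyberbullying_dataset` and `build_classifier` are hypothetical. The split names, column names, and label names are not stated in this card, so inspect them before relying on them.

```python
# Repository identifiers taken from the URLs in this card.
DATASET_ID = "ptaszynski/PolishCyberbullyingDataset"
MODEL_ID = "ptaszynski/bert-base-polish-cyberbullying"


def load_cyberbullying_dataset():
    """Download the dataset from the Hugging Face Hub (hypothetical helper).

    Assumption: the repository layout is readable by the generic loader.
    Splits and columns are not documented here, so print the result to
    inspect them.
    """
    from datasets import load_dataset  # imported lazily; needs `datasets`

    return load_dataset(DATASET_ID)


def build_classifier():
    """Create a text-classification pipeline for the released model
    (hypothetical helper).

    Assumption: the model repository ships a sequence-classification head.
    The meaning of the returned labels must be checked against the model card.
    """
    from transformers import pipeline  # imported lazily; needs `transformers`

    return pipeline("text-classification", model=MODEL_ID)
```

Calling `build_classifier()("To jest przykładowy tekst.")` would download the model weights on first use and return a list of label/score dictionaries. Neither helper touches the network until it is called.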
Provider:
ptaszynski
Summary of original information

PolishCyberbullyingDataset

Overview

PolishCyberbullyingDataset is a publicly available expert-annotated dataset for studying cyberbullying and hate speech in the Polish language.

Language

  • Polish

Tags

  • Cyberbullying
  • Hate speech

License

The dataset is licensed under CC BY 4.0, the Creative Commons Attribution 4.0 International License.

Citation

When using this dataset, please cite the following paper:

@article{ptaszynski2023expert, title={Expert-Annotated Dataset to Study Cyberbullying in Polish Language}, author={Ptaszynski, Michal and Pieciukiewicz, Agata and Dybala, Pawel and Skrzek, Pawel and Soliwoda, Kamil and Fortuna, Marcin and Leliwa, Gniewosz and Wroczynski, Michal}, journal={Data}, volume={9}, number={1}, pages={1}, year={2023}, publisher={MDPI} }

Author

Michal Ptaszynski
