Replication Data for: Extractive versus Generative Language Models for Political Conflict Text Classification

Source: DataONE. Updated 2025-10-23. Indexed 2025-11-01.
Download link:
https://search.dataone.org/view/sha256:89d2a090d985871718cf79c1711dadeac3838d670417929eceba6c95701b47ef
Description:
This replication package contains all data, code, and pre-computed model outputs necessary to reproduce the 9 tables and 4 figures in the manuscript "Extractive versus Generative Language Models for Political Conflict Text Classification," forthcoming in Political Analysis. The study provides a comprehensive comparison between specialized, fine-tuned "extractive" models (e.g., ConfliBERT) and general-purpose "generative" large language models (LLMs) like Llama and Gemma. The models are evaluated on three core political text analysis tasks: binary conflict classification, multi-label event classification, and named entity recognition (NER).

The package is fully automated via a master run.sh script and includes analysis code written in both R and Python. It is structured to support two modes of operation:

Verification (Default): A fast run (under 2 minutes) that uses the provided pre-computed model outputs to generate all manuscript results.

Full Recreation (Optional): A computationally expensive run (several hours) that reproduces all model predictions from the raw source data, requiring a local Ollama setup and compatible hardware.

For complete instructions on setting up the computational environment and executing the scripts, please consult the README.md file included in this package.
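To give a rough sense of what the binary conflict classification task looks like when run against a local Ollama server, here is a minimal Python sketch. It is not taken from the replication package. The prompt wording, the model name `llama3`, and the helper names `build_payload`, `parse_label`, and `classify` are all assumptions made for illustration. The only external detail relied on is Ollama's default local `/api/generate` endpoint, which returns a JSON object with a `response` field when `stream` is false.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(text, model="llama3"):
    """Build a non-streaming request asking a yes/no conflict question.

    The prompt and the default model name are hypothetical, not the
    ones used in the manuscript.
    """
    prompt = (
        "Does the following text describe political conflict or violence? "
        "Answer with exactly one word, yes or no.\n\n"
        f"Text: {text}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def parse_label(reply):
    """Map a free-text reply to 1 (conflict), 0 (no conflict), or None."""
    words = reply.strip().lower().split()
    first = words[0].strip(".,!:;\"'") if words else ""
    if first == "yes":
        return 1
    if first == "no":
        return 0
    return None  # the reply did not start with a clear yes or no


def classify(text, model="llama3"):
    """Send one text to a running local Ollama server and return its label."""
    data = json.dumps(build_payload(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return parse_label(body["response"])
```

Only `classify` needs a running server. `build_payload` and `parse_label` are pure functions, so the prompt construction and label parsing can be checked offline. For the actual prompts, models, and scoring used in the study, the README.md in the package is the authority.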
Created:
2025-10-29