Replication Data for: Extractive versus Generative Language Models for Political Conflict Text Classification

Source: DataONE. Updated: 2025-10-23. Indexed: 2025-11-01.

Download link:

https://search.dataone.org/view/sha256:89d2a090d985871718cf79c1711dadeac3838d670417929eceba6c95701b47ef

Description:

This replication package contains all data, code, and pre-computed model outputs necessary to reproduce the 9 tables and 4 figures in the manuscript, "Extractive versus Generative Language Models for Political Conflict Text Classification," forthcoming in Political Analysis. The study provides a comprehensive comparison between specialized, fine-tuned "extractive" models (e.g., ConfliBERT) and general-purpose "generative" large language models (LLMs) like Llama and Gemma. The models are evaluated on three core political text analysis tasks: binary conflict classification, multi-label event classification, and named entity recognition (NER).

The package is fully automated via a master run.sh script and includes analysis code written in both R and Python. It is structured to support two modes of operation:

Verification (Default): A fast run (under 2 minutes) that uses the provided pre-computed model outputs to generate all manuscript results.

Full Recreation (Optional): A computationally expensive run (several hours) that reproduces all model predictions from the raw source data, requiring a local Ollama setup and compatible hardware.

For complete instructions on setting up the computational environment and executing the scripts, please consult the README.md file included in this package.

Created:

2025-10-29
