Code for: AI-Powered (Finance) Scholarship
Source: DataCite Commons | Updated: 2026-02-12 | Indexed: 2026-05-03
Download link:
https://www.openicpsr.org/openicpsr/project/240109/view
Description:
This paper describes a process for generating academic papers using large language models (LLMs) and demonstrates this process's efficacy by producing hundreds of complete papers on stock return predictability, a topic well-suited for our illustration. After mining over 30,000 potential return predictors from accounting data, we generate “template reports” for 95 signals passing rigorous criteria from the Novy-Marx and Velikov (2024) “Assaying Anomalies” protocol. These templates detail signal performance predicting returns using a wide array of tests and benchmark performance against more than 200 documented anomalies. Finally, for each template we use state-of-the-art LLMs to generate multiple complete versions of academic papers with distinct theoretical justifications for the observed return predictability, incorporating citations to literature supporting their respective claims. This experiment illustrates AI’s potential for enhancing financial research efficiency, but also serves as a cautionary tale, illustrating how it can be abused to industrialize HARKing (Hypothesizing After Results are Known).
Provider:
ICPSR - Interuniversity Consortium for Political and Social Research
Created:
2026-02-12



