
Supplement to "Have Large Language Models Enhanced the Way Civil & Environmental Engineers Write? A Quantitative Analysis of Scholarly Communication over 25 Years"

DataCite Commons | Updated 2026-01-20 | Indexed 2026-04-25
Download link:
https://www.designsafe-ci.org/data/browser/public/designsafe.storage.published/PRJ-6251
Description:
This file presents the Supplemental Materials to the manuscript "Have Large Language Models Enhanced the Way Civil & Environmental Engineers Write? A Quantitative Analysis of Scholarly Communication over 25 Years", which is under consideration for publication.
Abstract: Large language models (LLMs) have rapidly emerged in civil and environmental engineering (CEE) research, education, and practice as a tool for project ideation, execution, and communication. However, it is unknown how prevalent LLM adoption is across CEE scholarship and whether it meaningfully alters research prose. Inspired by a recent analysis of biomedical abstracts, this study adapts a vocabulary-based frequency-shift methodology to estimate the incidence of LLM-written abstracts in CEE scholarship, using 149,452 abstracts published by the American Society of Civil Engineers between 2000 and 2025 as the representative corpus. By quantifying departures from recent vocabulary trends, we estimate that 15.3% and 26.2% of abstracts published in 2024 and 2025, respectively, were written by LLMs, with estimates as high as 38.4% in specific domains of CEE specialization. Prior to the introduction of LLMs in 2022, CEE publications exhibit long-term trends toward increasing numbers of authors, longer abstracts and sentences, greater use of segmenting punctuation, higher required reading levels, and a shift toward active, first-person verb constructions. Beginning around 2023, however, the frequencies of many excess style words (e.g., enhance, offer, demonstrate) depart dramatically from their historical trajectories, and corresponding departures are observed in multiple semantic properties. When abstracts classified as likely LLM-written are isolated, these departures are shown to be largely attributable to LLM-generated text. These abstracts exhibit systematic shifts, including increased word choice diversity, more commas, increased complexity, decreased use of passive constructions, and less of the qualifying language commonly used to convey uncertainty, such that prose is generally more segmented, syntactically complex, and assertive. Together, these findings provide the first large-scale, data-driven assessment of LLM use and its effect on CEE scholarly writing.
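The idea of "quantifying departures from recent vocabulary trends" can be illustrated with a minimal sketch. It assumes a straight-line trend fitted to a word's pre-LLM yearly frequencies and extrapolated to a later year; the function names, the choice of baseline years, and the numbers are hypothetical and are not taken from the study or its supplemental files.

```python
def extrapolate(years, freqs, target_year):
    """Fit a least-squares line to (year, frequency) points and
    evaluate it at target_year (assumed linear baseline trend)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(freqs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, freqs))
    slope = sxy / sxx
    # Evaluate relative to the means to limit floating-point cancellation.
    return mean_y + slope * (target_year - mean_x)


def excess_frequency(baseline, target_year, observed):
    """Return (expected, gap, ratio) for one word in one target year.

    baseline: dict mapping year -> fraction of abstracts containing the word
    observed: fraction of abstracts containing the word in target_year
    """
    years = sorted(baseline)
    freqs = [baseline[y] for y in years]
    # A frequency cannot be negative, so clip the extrapolated value at zero.
    expected = max(extrapolate(years, freqs, target_year), 0.0)
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return expected, gap, ratio


# Made-up frequencies for illustration only; not values from the dataset.
baseline = {2019: 0.10, 2020: 0.11, 2021: 0.12, 2022: 0.13}
expected, gap, ratio = excess_frequency(baseline, 2024, observed=0.25)
print(f"expected={expected:.3f} gap={gap:.3f} ratio={ratio:.2f}")
```

In this toy case the word was rising by about one percentage point per year, so a jump well above the extrapolated value shows up as a positive gap. How the study aggregates such gaps across words into its corpus-level estimates is not described in this record.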
Provider:
Designsafe-CI
Created:
2026-01-20