Multi-Reader Multi-Case Studies Using the Area under the Receiver Operator Characteristic Curve as a Measure of Diagnostic Accuracy: Systematic Review with a Focus on Quality of Data Reporting

Source: Figshare. Updated: 2016-01-15. Indexed: 2026-04-29.

Download link:

https://figshare.com/articles/dataset/_Multi_Reader_Multi_Case_Studies_Using_the_Area_under_the_Receiver_Operator_Characteristic_Curve_as_a_Measure_of_Diagnostic_Accuracy_Systematic_Review_with_a_Focus_on_Quality_of_Data_Reporting_/1279565

Description:

Introduction: We examined the design, analysis and reporting in multi-reader multi-case (MRMC) research studies using the area under the receiver operating characteristic curve (ROC AUC) as a measure of diagnostic performance.

Methods: We performed a systematic literature review from 2005 to 2013 inclusive to identify a minimum of 50 studies. Articles of diagnostic test accuracy in humans were identified via their citation of key methodological articles dealing with MRMC ROC AUC. Two researchers in consensus then extracted information from primary articles relating to study characteristics and design, methods for reporting study outcomes, model fitting, model assumptions, presentation of results, and interpretation of findings. Results were summarized and presented with a descriptive analysis.

Results: Sixty-four full papers were retrieved from 475 identified citations and ultimately 49 articles describing 51 studies were reviewed and extracted. Radiological imaging was the index test in all. Most studies focused on lesion detection vs. characterization and used fewer than 10 readers. Only 6 (12%) studies trained readers in advance to use the confidence scale used to build the ROC curve. Overall, description of confidence scores, the ROC curve and its analysis was often incomplete. For example, 21 (41%) studies presented no ROC curve and only 3 (6%) described the distribution of confidence scores. Of 30 studies presenting curves, only 4 (13%) presented the data points underlying the curve, thereby allowing assessment of extrapolation. The mean change in AUC was 0.05 (−0.05 to 0.28). Non-significant change in AUC was attributed to underpowering rather than the diagnostic test failing to improve diagnostic accuracy.

Conclusions: Data reporting in MRMC studies using ROC AUC as an outcome measure is frequently incomplete, hampering understanding of methods and the reliability of results and study conclusions. Authors using this analysis should be encouraged to provide a full description of their methods and results.

Created:

2016-01-15