Randomized controlled clinical trials with tagged information regarding the number of participants
Source: DataONE. Updated 2025-01-04. Indexed 2025-04-26.
Download link:
https://search.dataone.org/view/sha256:e0938ad06d55e9c85b6e2f51ca2e7f41b064cf665fddd23e0d3b1921e500ceb9
Description:
Background:
Extracting the sample size from randomized controlled trials (RCTs) remains a challenge to developing better search functionalities or automating systematic reviews. Most current approaches rely on the sample size being explicitly mentioned in the abstract.
Data collection:
A random sample of 996 randomized controlled trials (RCTs) from seven major journals (British Medical Journal, JAMA, JAMA Oncology, Journal of Clinical Oncology, Lancet, Lancet Oncology, New England Journal of Medicine) published between 2010 and 2022 was labeled. To do so, abstracts were retrieved as a txt file from PubMed and parsed using regular expressions (i.e., expressions that match certain patterns in text). For each trial, the number of people who were randomized was retrieved by looking at the abstract, followed by the full publication if the number could not be determined with certainty from the abstract. In addition, six different entities were tagged in each abstract, independent of whether the information was presented using words or integers. If the number of people who were randomized ...
[https://doi.org/10.5061/dryad.g1jwstr0b](https://doi.org/10.5061/dryad.g1jwstr0b)
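The parsing step described under Data collection (a PubMed txt export split into abstracts with regular expressions) could look roughly like the sketch below. It is not the dataset authors' code. It assumes the export uses PubMed's tagged text layout, with field lines such as `PMID- ` and `AB  - `, continuation lines indented by six spaces, and records separated by blank lines. The function name `parse_records` and the sample records are hypothetical and made up for illustration.

```python
import re

# Assumption: each field starts with a 2-4 letter tag, optional padding,
# then "- " and the value, e.g. "PMID- 111" or "AB  - Some abstract text".
FIELD_RE = re.compile(r"^([A-Z]{2,4})\s*- (.*)$")


def parse_records(text):
    """Split a tagged PubMed-style export into {'pmid', 'abstract'} dicts."""
    records = []
    # Records are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        tag = None
        for line in block.splitlines():
            match = FIELD_RE.match(line)
            if match:
                # A new field begins on this line.
                tag = match.group(1)
                fields.setdefault(tag, []).append(match.group(2).strip())
            elif tag and line.startswith("      "):
                # Indented line: continuation of the previous field's value.
                fields[tag][-1] += " " + line.strip()
        if "PMID" in fields:
            records.append(
                {
                    "pmid": fields["PMID"][0],
                    "abstract": " ".join(fields.get("AB", [])),
                }
            )
    return records


# Made-up sample input, not taken from the dataset.
SAMPLE = """PMID- 111
TI  - Trial A
AB  - We randomly assigned 250 patients
      to two groups.

PMID- 222
TI  - Trial B
AB  - Sixty participants were randomised.
"""

records = parse_records(SAMPLE)
for record in records:
    print(record["pmid"], "|", record["abstract"])
```

The two sample abstracts report the number of participants once as an integer and once as a word, which is the distinction the tagging in this dataset is described as being independent of. The sketch only separates records and abstracts; it does not attempt to extract the sample size itself.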
Created:
2025-01-05



