
Pile-NIH_ExPorter

ModelScope community. Updated 2025-03-21; indexed 2024-08-31
Download link:
https://modelscope.cn/datasets/OmniData/Pile-NIH_ExPorter
Resource overview:
---
displayName: Pile-NIH_ExPorter
license:
- MIT
taskTypes:
- Natural Language Generation
- Language Modelling
mediaTypes:
- Text
labelTypes:
- English Corpus
tags: []
publisher:
- EleutherAI
publishDate: '2023-07-18'
publishUrl: https://pile.eleuther.ai/
paperUrl: ''
---

# Data Introduction

## Overview

The Pile-NIH ExPorter dataset is a large-scale collection of medical research text built from the NIH ExPorter database. It gathers abstracts and metadata of medical research projects from the NIH (U.S. National Institutes of Health), spanning fields from basic science to clinical research.

The dataset gives researchers and developers a rich source of medical research information. It can be used for medical text analysis, research trend analysis, scientific discovery, and similar applications, supporting research and innovation in medicine.

## Data Content

### Data Description

The Pile-NIH ExPorter dataset contains 1.7 GB of data.

### Data Example

```
{
  "id": "134603122",
  "source_id": "",
  "doc_id": "419370",
  "data_type": "text",
  "data_source": "pile",
  "data_url": "enwiki-c4-pile-ccnews",
  "content": "Receptors for transplantation antigens may be visualized directly on unsensitized cells with the use of an anti-idiotypic antisera. It should be possible to eliminate these cells specifically by treatment of the unsensitized population with sera directed at the specific binding site. Sera have been raised to cells cytotoxic for transplantation alloantigen in strain combinations selected so that only the variable portion of the antigen specific receptor could be recognized. Bona fide anti-idiotypic sera have not been raised. Refinements in the approach under way include 1) use of cytotoxic populations as immunogen with a demonstrably high proportion of receptor-bearing cells, 2) use of parent strains differing only by a point mutation to restrict the heterogeneity of the immune response of one against the other, and 3) use of continuously cultured cytotoxic cell lines to increase the homogeneity of the receptors used as immunogen.\n",
  "remark": {
    "pile_set_name": "NIH ExPorter"
  },
  "sub_path": "nih-exporter/test"
}
```

## Citation

```
@misc{conghui2022opendatalab,
  title={OpenDataLab: Empowering General Artificial Intelligence with Open Datasets},
  author={Conghui He, Wei Li, Zhenjiang Jin, Bin Wang, Chao Xu, Dahua Lin},
  journal={https://opendatalab.com/},
  year={2022}
}
```

## Download dataset

:modelscope-code[]{type="git"}
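
Once the files are downloaded, records shaped like the data example can be read with the Python standard library alone. The following is a minimal sketch, not an official loader. It assumes the data is stored as JSON Lines (one record per line), which this card does not state. The function name `iter_nih_abstracts` and the shortened sample record are hypothetical and used only for illustration; the field names (`id`, `content`, `remark`, `pile_set_name`, `sub_path`) come from the example record.

```python
import json
from typing import Iterable, Iterator


def iter_nih_abstracts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield id, sub_path, and abstract text from JSON Lines records.

    Assumption: one JSON object per line, shaped like the data example.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Keep only records tagged as NIH ExPorter in the "remark" field.
        if record.get("remark", {}).get("pile_set_name") != "NIH ExPorter":
            continue
        yield {
            "id": record["id"],
            "sub_path": record.get("sub_path", ""),
            # Abstracts in the example end with a trailing newline; strip it.
            "text": record["content"].strip(),
        }


# A shortened, hypothetical record used only to exercise the function.
sample_line = json.dumps({
    "id": "134603122",
    "data_type": "text",
    "content": "Receptors for transplantation antigens may be visualized.\n",
    "remark": {"pile_set_name": "NIH ExPorter"},
    "sub_path": "nih-exporter/test",
})

abstracts = list(iter_nih_abstracts([sample_line, ""]))
print(abstracts[0]["id"], abstracts[0]["sub_path"])
```

To read from disk, pass an open file object instead of a list, for example `iter_nih_abstracts(open(path, encoding="utf-8"))`, where `path` points at whichever file the download produces.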

Provider:
maas
Created:
2024-07-09