
macavaney/cord19.pisa

Source: Hugging Face. Updated: 2024-04-17. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/macavaney/cord19.pisa
Resource description:
---
pretty_name: PISA Index for CORD-19
tags:
- pyterrier
task_categories:
- text-retrieval
source_datasets: ['cord19']
viewer: false
---

# PISA Index for CORD-19

## Using this Artifact

```python
import pyterrier as pt
pt.Artifact.from_url('hf:macavaney/cord19.pisa')
# PisaIndex(...)
```

## Reproducing this Artifact

```bash
pip install -r requirements.txt
python build.py path/to/artifact
```

<details>
<summary>Artifact Build Log</summary>
<pre>
cord19 documents:   0%|          | 0/192509 [00:00<?, ?it/s]
/home/sean/miniconda3/lib/python3.9/site-packages/pyterrier_pisa/__init__.py:144: UserWarning: text_field not specified; indexing all str fields: ['abstract', 'date', 'doi', 'title']
  warn(f'text_field not specified; indexing all str fields: {text_field}')
cord19 documents:  11%|#1        | 21525/192509 [00:00<00:00, 215242.51it/s]
cord19 documents:  23%|##2       | 43554/192509 [00:00<00:00, 218208.64it/s]
cord19 documents:  34%|###3      | 65375/192509 [00:00<00:00, 215903.81it/s]
cord19 documents:  46%|####5     | 87678/192509 [00:00<00:00, 218697.91it/s]
cord19 documents:  59%|#####8    | 113053/192509 [00:00<00:00, 231283.52it/s]
cord19 documents:  71%|#######1  | 136963/192509 [00:00<00:00, 233931.86it/s]
cord19 documents:  83%|########3 | 160362/192509 [00:00<00:00, 219996.33it/s]
cord19 documents:  95%|#########4| 182516/192509 [00:00<00:00, 211965.50it/s]
cord19 documents: 100%|##########| 192509/192509 [00:00<00:00, 217227.77it/s]
</pre>
</details>
Provider:
macavaney
Summary of original information

PISA Index for CORD-19

Dataset overview

  • Name: PISA Index for CORD-19
  • Tags: pyterrier
  • Task category: text retrieval
  • Source dataset: cord19

Usage

    import pyterrier as pt
    pt.Artifact.from_url('hf:macavaney/cord19.pisa')
    # PisaIndex(...)
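Once loaded, the index can be used for retrieval. The following is a minimal sketch rather than part of the original card: it assumes that `python-terrier` and `pyterrier_pisa` are installed, that BM25 is the desired ranking function, and that network access is available for the first download. The helper name `search_cord19` and the default of 10 results are hypothetical.

```python
def search_cord19(query, num_results=10):
    """Load the artifact and run one BM25 query against it.

    Hypothetical helper: the name and defaults are illustrative only.
    The first call downloads the index from Hugging Face.
    """
    # Imported lazily so that defining this function does not itself
    # require the packages (assumed: python-terrier and pyterrier_pisa).
    import pyterrier as pt

    index = pt.Artifact.from_url('hf:macavaney/cord19.pisa')  # PisaIndex(...)
    retriever = index.bm25(num_results=num_results)  # BM25 is an assumed choice
    return retriever.search(query)  # DataFrame of ranked results
```

A call such as `search_cord19('coronavirus transmission')` should return a results frame ranked by BM25 score; the query text here is only an example.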

Reproduction

    pip install -r requirements.txt
    python build.py path/to/artifact

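Build log summary

  • The dataset contains 192,509 documents.
  • The progress output reaches 100% (192509/192509), indicating that every document in the dataset was indexed.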
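The build log also warns that `text_field` was not specified, so all string fields (`abstract`, `date`, `doi`, `title`) were indexed. The sketch below illustrates what that selection amounts to. It is not the `pyterrier_pisa` implementation: the helper names, the exclusion of a `docno` identifier field, the space-joined concatenation, and the sample record are all assumptions made for illustration.

```python
def str_field_names(doc, id_field='docno'):
    """Return the sorted names of all str-valued fields except the id field."""
    return sorted(
        name for name, value in doc.items()
        if isinstance(value, str) and name != id_field
    )


def combined_text(doc, fields):
    """Join the selected fields into one string (assumed concatenation)."""
    return ' '.join(doc[name] for name in fields)


# Made-up record with the same field names as those listed in the warning.
sample = {
    'docno': 'doc-1',
    'title': 'Example title',
    'doi': '10.0000/example',
    'date': '2020-01-01',
    'abstract': 'Example abstract text.',
}

fields = str_field_names(sample)
print(fields)
print(combined_text(sample, fields))
```

Under these assumptions, dates and DOIs become searchable tokens alongside titles and abstracts, which is worth keeping in mind when interpreting retrieval results from this index.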