mlsa-iai-msu-lab/ru_sci_bench_mteb
Source: Hugging Face | Updated: 2024-08-29 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/mlsa-iai-msu-lab/ru_sci_bench_mteb
Description:
This dataset contains multiple configurations, each corresponding to a specific task and language (English or Russian). Each configuration has three features: a paper ID (paper_id), the text content (text), and a target that is either a label or a value (label/value). Each configuration is divided into training and test splits, and the byte size and number of examples are listed for every split. The tasks include citation count, core RISC classification, GRNTI classification, OECD classification, publication type classification, and publication year.
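The layout described above (one configuration per task and language, each with paper_id, text, and a label or value column) can be explored with a minimal sketch like the following. It assumes the Hugging Face `datasets` package is installed and that the repository is reachable. The helper names `list_configs`, `load_split`, and `target_column` are hypothetical, and the configuration names are read from the Hub rather than assumed, since the listing does not give them.

```python
from typing import Iterable

# Repository ID taken from the listing; everything else here is illustrative.
REPO_ID = "mlsa-iai-msu-lab/ru_sci_bench_mteb"


def target_column(columns: Iterable[str]) -> str:
    """Return the target column name: 'label' or 'value', whichever is present."""
    cols = set(columns)
    for name in ("label", "value"):
        if name in cols:
            return name
    raise ValueError(f"no 'label' or 'value' column in {sorted(cols)}")


def list_configs() -> list:
    """Ask the Hub which configurations the repository exposes (needs network)."""
    # Imported lazily so target_column() works without `datasets` installed.
    from datasets import get_dataset_config_names

    return get_dataset_config_names(REPO_ID)


def load_split(config_name: str, split: str = "test"):
    """Load one split of one configuration (needs network)."""
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)
```

For example, `load_split(list_configs()[0])` should return the test split of the first configuration, and `target_column(ds.column_names)` then reports whether that configuration stores its target under label or value.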
Provided by:
mlsa-iai-msu-lab



