skatzR/CompositeNLP-RuEn

Source: Hugging Face. Updated 2025-09-26; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/skatzR/CompositeNLP-RuEn
Description:
CompositeNLP-RuEn is a multilingual composite dataset for retrieval and semantic search/NLI tasks. It contains Russian and English examples, each labeled as either a retrieval or a semantic task. The examples have been tokenized and truncated for compatibility with transformer-based models. Key fields include dataset name, language, task type, split, and data type. The dataset is intended primarily for non-commercial research; users must comply with the licenses of all source datasets and provide attribution. It is imbalanced towards Russian and may carry cultural or linguistic biases.
Provider:
skatzR
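
The description lists language and task type among the key fields and notes an imbalance towards Russian. The sketch below shows one way to filter examples by those fields and to measure the language share. It is illustrative only: the field names (`dataset_name`, `language`, `task_type`, `split`, `data_type`), the values, and the sample records are assumptions inferred from the description, not the dataset's confirmed schema.

```python
from collections import Counter

# Hypothetical records. Field names and values are assumptions based on the
# description above, not the confirmed schema of CompositeNLP-RuEn.
records = [
    {"dataset_name": "source_a", "language": "ru", "task_type": "retrieval",
     "split": "train", "data_type": "pair"},
    {"dataset_name": "source_b", "language": "ru", "task_type": "semantic",
     "split": "train", "data_type": "pair"},
    {"dataset_name": "source_a", "language": "ru", "task_type": "retrieval",
     "split": "validation", "data_type": "pair"},
    {"dataset_name": "source_c", "language": "en", "task_type": "semantic",
     "split": "train", "data_type": "pair"},
]


def filter_records(rows, language=None, task_type=None):
    """Return the rows that match the given language and/or task type."""
    return [
        row for row in rows
        if (language is None or row["language"] == language)
        and (task_type is None or row["task_type"] == task_type)
    ]


def language_share(rows):
    """Return the fraction of rows per language, to gauge imbalance."""
    counts = Counter(row["language"] for row in rows)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


ru_retrieval = filter_records(records, language="ru", task_type="retrieval")
shares = language_share(records)
print(len(ru_retrieval), shares)
```

Checking the language share before training helps decide whether to resample or reweight the English examples. The same two helpers would apply to the real data once the actual column names are read from the dataset card.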