
U4RASD/Masrad

Source: Hugging Face. Updated 2026-03-16. Indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/U4RASD/Masrad
Resource description:
---
dataset_info:
  features:
  - name: semantic
    dtype: float64
  - name: semantic_rank
    dtype: float64
  - name: semantic_diff
    dtype: float64
  - name: lexical1
    dtype: float64
  - name: lexical1_rank
    dtype: float64
  - name: lexical1_diff
    dtype: float64
  - name: lexical2
    dtype: float64
  - name: lexical2_rank
    dtype: float64
  - name: lexical2_diff
    dtype: float64
  - name: entity
    dtype: string
  - name: src_entity
    dtype: string
  - name: phonetic
    dtype: bool
  - name: pos
    dtype: string
  - name: label
    dtype: bool
  splits:
  - name: train
    num_examples: 19405
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train.csv
license: mit
task_categories:
- text-classification
language:
- ar
size_categories:
- 10K<n<100K
---

# Masrad Dataset

## Dataset Description

The **Masrad** dataset contains 19,405 examples with features related to semantic similarity, lexical similarity, entity recognition, phonetic matching, and part-of-speech tagging for Arabic text.

## Features

| Feature | Type | Description |
|---------|------|-------------|
| `semantic` | float | Semantic similarity score |
| `semantic_rank` | float | Rank based on semantic similarity |
| `semantic_diff` | float | Difference from top semantic score |
| `lexical1` | float | First lexical similarity score |
| `lexical1_rank` | float | Rank based on first lexical score |
| `lexical1_diff` | float | Difference from top first lexical score |
| `lexical2` | float | Second lexical similarity score |
| `lexical2_rank` | float | Rank based on second lexical score |
| `lexical2_diff` | float | Difference from top second lexical score |
| `entity` | string | Detected entity type (PER, ORG, LOC, none) |
| `src_entity` | string | Source entity type |
| `phonetic` | bool | Whether phonetic match exists |
| `pos` | string | Part of speech tag |
| `label` | bool | Target label |

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("U4RASD/Masrad")
```
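
The feature schema above can also be applied when reading `data/train.csv` directly with the standard library. The following is a minimal sketch, not the dataset authors' loader: the helper names (`parse_bool`, `parse_row`, `read_rows`) are hypothetical, the sample row is invented for illustration, and it assumes booleans are serialized as `True`/`False` text and that no cell is empty.

```python
import csv
import io

# Column groups taken from the dataset card's feature list.
FLOAT_COLUMNS = [
    "semantic", "semantic_rank", "semantic_diff",
    "lexical1", "lexical1_rank", "lexical1_diff",
    "lexical2", "lexical2_rank", "lexical2_diff",
]
STRING_COLUMNS = ["entity", "src_entity", "pos"]
BOOL_COLUMNS = ["phonetic", "label"]


def parse_bool(text):
    """Interpret a CSV cell as a boolean (assumed 'True'/'False' or '1'/'0')."""
    return text.strip().lower() in {"true", "1"}


def parse_row(row):
    """Convert one csv.DictReader row (all strings) into typed values."""
    typed = {}
    for name in FLOAT_COLUMNS:
        typed[name] = float(row[name])
    for name in STRING_COLUMNS:
        typed[name] = row[name]
    for name in BOOL_COLUMNS:
        typed[name] = parse_bool(row[name])
    return typed


def read_rows(handle):
    """Read every row from an open CSV file handle into typed dictionaries."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Invented sample in the documented column order; not real Masrad data.
SAMPLE_CSV = (
    "semantic,semantic_rank,semantic_diff,"
    "lexical1,lexical1_rank,lexical1_diff,"
    "lexical2,lexical2_rank,lexical2_diff,"
    "entity,src_entity,phonetic,pos,label\n"
    "0.91,1.0,0.0,0.75,2.0,0.05,0.6,3.0,0.1,PER,PER,True,NOUN,False\n"
)

rows = read_rows(io.StringIO(SAMPLE_CSV))
```

To read the real file, pass an open handle instead, for example `open("data/train.csv", newline="", encoding="utf-8")`, after downloading it from the repository.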
Provider:
U4RASD