wissamantoun/orca_parquet
Source: Hugging Face. Updated 2026-03-01; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/wissamantoun/orca_parquet
Description:
---
pretty_name: ORCA – Arabic Language Understanding Benchmark
language:
- ar
configs:
- config_name: abusive
  data_files:
  - split: train
    path: data/abusive/train.parquet
  - split: validation
    path: data/abusive/validation.parquet
  - split: test
    path: data/abusive/test.parquet
- config_name: adult
  data_files:
  - split: train
    path: data/adult/train.parquet
  - split: validation
    path: data/adult/validation.parquet
  - split: test
    path: data/adult/test.parquet
- config_name: age
  data_files:
  - split: train
    path: data/age/train.parquet
  - split: validation
    path: data/age/validation.parquet
  - split: test
    path: data/age/test.parquet
- config_name: ans-claim
  data_files:
  - split: train
    path: data/ans-claim/train.parquet
  - split: validation
    path: data/ans-claim/validation.parquet
  - split: test
    path: data/ans-claim/test.parquet
- config_name: ans-stance
  data_files:
  - split: train
    path: data/ans-stance/train.parquet
  - split: validation
    path: data/ans-stance/validation.parquet
  - split: test
    path: data/ans-stance/test.parquet
- config_name: aqmar-ner
  data_files:
  - split: train
    path: data/aqmar-ner/train.parquet
  - split: validation
    path: data/aqmar-ner/validation.parquet
  - split: test
    path: data/aqmar-ner/test.parquet
- config_name: arabic-ner
  data_files:
  - split: train
    path: data/arabic-ner/train.parquet
  - split: validation
    path: data/arabic-ner/validation.parquet
  - split: test
    path: data/arabic-ner/test.parquet
- config_name: baly-stance
  data_files:
  - split: train
    path: data/baly-stance/train.parquet
  - split: validation
    path: data/baly-stance/validation.parquet
  - split: test
    path: data/baly-stance/test.parquet
- config_name: dangerous
  data_files:
  - split: train
    path: data/dangerous/train.parquet
  - split: validation
    path: data/dangerous/validation.parquet
  - split: test
    path: data/dangerous/test.parquet
- config_name: dialect-binary
  data_files:
  - split: train
    path: data/dialect-binary/train.parquet
  - split: validation
    path: data/dialect-binary/validation.parquet
  - split: test
    path: data/dialect-binary/test.parquet
- config_name: dialect-country
  data_files:
  - split: train
    path: data/dialect-country/train.parquet
  - split: validation
    path: data/dialect-country/validation.parquet
  - split: test
    path: data/dialect-country/test.parquet
- config_name: dialect-pos
  data_files:
  - split: train
    path: data/dialect-pos/train.parquet
  - split: validation
    path: data/dialect-pos/validation.parquet
  - split: test
    path: data/dialect-pos/test.parquet
- config_name: dialect-region
  data_files:
  - split: train
    path: data/dialect-region/train.parquet
  - split: validation
    path: data/dialect-region/validation.parquet
  - split: test
    path: data/dialect-region/test.parquet
- config_name: emotion
  data_files:
  - split: train
    path: data/emotion/train.parquet
  - split: validation
    path: data/emotion/validation.parquet
  - split: test
    path: data/emotion/test.parquet
- config_name: emotion-reg
  data_files:
  - split: train
    path: data/emotion-reg/train.parquet
  - split: validation
    path: data/emotion-reg/validation.parquet
  - split: test
    path: data/emotion-reg/test.parquet
- config_name: gender
  data_files:
  - split: train
    path: data/gender/train.parquet
  - split: validation
    path: data/gender/validation.parquet
  - split: test
    path: data/gender/test.parquet
- config_name: hate-speech
  data_files:
  - split: train
    path: data/hate-speech/train.parquet
  - split: validation
    path: data/hate-speech/validation.parquet
  - split: test
    path: data/hate-speech/test.parquet
- config_name: irony
  data_files:
  - split: train
    path: data/irony/train.parquet
  - split: validation
    path: data/irony/validation.parquet
  - split: test
    path: data/irony/test.parquet
- config_name: machine-generation
  data_files:
  - split: train
    path: data/machine-generation/train.parquet
  - split: validation
    path: data/machine-generation/validation.parquet
  - split: test
    path: data/machine-generation/test.parquet
- config_name: mq2q
  data_files:
  - split: train
    path: data/mq2q/train.parquet
  - split: validation
    path: data/mq2q/validation.parquet
  - split: test
    path: data/mq2q/test.parquet
- config_name: msa-pos
  data_files:
  - split: train
    path: data/msa-pos/train.parquet
  - split: validation
    path: data/msa-pos/validation.parquet
  - split: test
    path: data/msa-pos/test.parquet
- config_name: offensive
  data_files:
  - split: train
    path: data/offensive/train.parquet
  - split: validation
    path: data/offensive/validation.parquet
  - split: test
    path: data/offensive/test.parquet
- config_name: qa
  data_files:
  - split: train
    path: data/qa/train.parquet
  - split: validation
    path: data/qa/validation.parquet
  - split: test
    path: data/qa/test.parquet
- config_name: sarcasm
  data_files:
  - split: train
    path: data/sarcasm/train.parquet
  - split: validation
    path: data/sarcasm/validation.parquet
  - split: test
    path: data/sarcasm/test.parquet
- config_name: sentiment
  data_files:
  - split: train
    path: data/sentiment/train.parquet
  - split: validation
    path: data/sentiment/validation.parquet
  - split: test
    path: data/sentiment/test.parquet
- config_name: sts
  data_files:
  - split: train
    path: data/sts/train.parquet
  - split: validation
    path: data/sts/validation.parquet
  - split: test
    path: data/sts/test.parquet
- config_name: topic
  data_files:
  - split: train
    path: data/topic/train.parquet
  - split: validation
    path: data/topic/validation.parquet
  - split: test
    path: data/topic/test.parquet
- config_name: wsd
  data_files:
  - split: train
    path: data/wsd/train.parquet
  - split: validation
    path: data/wsd/validation.parquet
  - split: test
    path: data/wsd/test.parquet
- config_name: xlni
  data_files:
  - split: train
    path: data/xlni/train.parquet
  - split: validation
    path: data/xlni/validation.parquet
  - split: test
    path: data/xlni/test.parquet
---
# ORCA – Arabic Language Understanding Benchmark
Converted from the original [`wissamantoun/orca_hf`](https://huggingface.co/datasets/wissamantoun/orca_hf) dataset script to parquet files.
## Usage
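```python
from datasets import get_dataset_config_names, load_dataset

# Repository ID as listed on this page
REPO_ID = "wissamantoun/orca_parquet"

# Load a specific config (one task); returns train/validation/test splits
ds = load_dataset(REPO_ID, name="sentiment")

# Load all configs: a config name is required when a dataset defines several,
# so list the names and load each one
all_ds = {name: load_dataset(REPO_ID, name=name) for name in get_dataset_config_names(REPO_ID)}
```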
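
Every config in the metadata above follows the same file layout, `data/<config>/<split>.parquet`, with `train`, `validation`, and `test` splits. As a minimal sketch of that layout, the hypothetical helper below (`config_data_files` is not part of the dataset or of any library) builds the per-split `hf://` paths for one config. It assumes the repository ID `wissamantoun/orca_parquet` shown on this page.

```python
from typing import Dict

# Split names used by every config in the dataset card metadata
SPLITS = ("train", "validation", "test")


def config_data_files(
    config_name: str, repo_id: str = "wissamantoun/orca_parquet"
) -> Dict[str, str]:
    """Map each split to its hf:// parquet path for one config.

    Hypothetical helper for illustration; the path pattern mirrors the
    `data_files` entries in the dataset card's YAML metadata.
    """
    return {
        split: f"hf://datasets/{repo_id}/data/{config_name}/{split}.parquet"
        for split in SPLITS
    }


files = config_data_files("sentiment")
print(files["train"])
# hf://datasets/wissamantoun/orca_parquet/data/sentiment/train.parquet
```

With `pandas` and `huggingface_hub` installed, a path of this form can typically be passed to `pandas.read_parquet` to read a single split without going through `load_dataset`.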
Provider:
wissamantoun



