lilolyhh/OLA

Name: lilolyhh/OLA
Creator: lilolyhh
Published: 2026-04-19 13:15:57
License: No description provided

Hugging Face · Updated 2026-04-19 · Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/lilolyhh/OLA

Resource overview:

---
language:
- en
- ko
license: cc-by-nc-nd-4.0
task_categories:
- text-generation
pretty_name: OLA (Output Language Alignment Benchmark)
configs:
- config_name: simple_en_matrix
  data_files: simple/EN_Matrix_KO_Content.csv
- config_name: simple_ko_matrix
  data_files: simple/KO_Matrix_EN_Content.csv
- config_name: complex_en_instr
  data_files: complex/EN_Instruction_KO_Content.csv
- config_name: complex_ko_instr
  data_files: complex/KO_Instruction_EN_Content.csv
---

# OLA: Output Language Alignment Benchmark

OLA is a benchmark designed to evaluate LLMs' Output Language Alignment in code-switched interactions.

## Dataset Structure

OLA consists of two settings: **Simple** and **Complex**.

### Simple Setting

The Simple setting focuses on intra-sentential code-switching, where the expected response language is the *matrix language*: the language providing the core grammatical structure into which elements from another language are embedded.

Located in `simple/`:

- **EN_Matrix_KO_Content.csv**: English matrix language with Korean embedded content
- **KO_Matrix_EN_Content.csv**: Korean matrix language with English embedded content

Each file contains three columns:

- `query`: The code-switched prompt
- `source`: Source identifier
- `expected_lang`: Expected response language

### Complex Setting

The Complex setting involves inter-sentential code-switching, where instruction and content languages differ, and the correct response language must be inferred from task semantics.

Located in `complex/`:

- **EN_Instruction_KO_Content.csv**: English instruction with Korean content
- **KO_Instruction_EN_Content.csv**: Korean instruction with English content

Each file contains four columns:

- `InstrFirst`: Instruction-first query
- `ContentFirst`: Content-first query
- `source`: Source identifier
- `expected_lang`: Expected response language
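
As a rough illustration of how the column layouts described above might be consumed, the following sketch reads a Simple-setting CSV, checks its header against the documented columns, and computes the share of responses whose language matches `expected_lang`. Several things here are assumptions and not part of the dataset card: the `expected_lang` values are assumed to be the codes `en` and `ko`, the sample rows and the `made_up` source label are invented for illustration, and `guess_language` is a hypothetical character-counting heuristic standing in for a real language identifier. It is not the benchmark's own evaluation method.

```python
import csv
import io

# Column layouts as listed in the dataset card.
SIMPLE_COLUMNS = ["query", "source", "expected_lang"]
COMPLEX_COLUMNS = ["InstrFirst", "ContentFirst", "source", "expected_lang"]


def read_rows(handle, expected_columns):
    """Read CSV rows and verify the header contains the documented columns."""
    reader = csv.DictReader(handle)
    missing = set(expected_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


def guess_language(text):
    """Hypothetical heuristic: 'ko' if Hangul syllables outnumber ASCII letters."""
    hangul = sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return "ko" if hangul > latin else "en"


def alignment_rate(rows, responses):
    """Fraction of responses whose guessed language equals expected_lang."""
    if not rows or len(rows) != len(responses):
        raise ValueError("rows and responses must be non-empty and equal in length")
    hits = sum(
        guess_language(resp) == row["expected_lang"]
        for row, resp in zip(rows, responses)
    )
    return hits / len(rows)


# Invented sample rows (not taken from the dataset); 'en'/'ko' codes are assumed.
SAMPLE_CSV = (
    "query,source,expected_lang\n"
    "Can you explain what 김치 is?,made_up,en\n"
    "파이썬에서 list comprehension이 뭐야?,made_up,ko\n"
)

rows = read_rows(io.StringIO(SAMPLE_CSV), SIMPLE_COLUMNS)
responses = [
    "Kimchi is a fermented vegetable dish.",  # English reply, matches 'en'
    "It is a compact way to build lists.",    # English reply, expected 'ko'
]
print(alignment_rate(rows, responses))  # 0.5
```

For the Complex setting, the same `read_rows` call could be made with `COMPLEX_COLUMNS`, scoring the `InstrFirst` and `ContentFirst` variants separately. In practice the files would be opened from `simple/` or `complex/` (with `encoding="utf-8"`) and the heuristic replaced by a proper language-identification model.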

Provider:

lilolyhh
