
lilolyhh/OLA

Hugging Face · Updated 2026-04-19 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/lilolyhh/OLA
Resource description:
---
language:
- en
- ko
license: cc-by-nc-nd-4.0
task_categories:
- text-generation
pretty_name: OLA (Output Language Alignment Benchmark)
configs:
- config_name: simple_en_matrix
  data_files: simple/EN_Matrix_KO_Content.csv
- config_name: simple_ko_matrix
  data_files: simple/KO_Matrix_EN_Content.csv
- config_name: complex_en_instr
  data_files: complex/EN_Instruction_KO_Content.csv
- config_name: complex_ko_instr
  data_files: complex/KO_Instruction_EN_Content.csv
---

# OLA: Output Language Alignment Benchmark

OLA is a benchmark designed to evaluate LLMs' Output Language Alignment in code-switched interactions.

## Dataset Structure

OLA consists of two settings: **Simple** and **Complex**.

### Simple Setting

The Simple setting focuses on intra-sentential code-switching, where the expected response language is the *matrix language*, the language providing the core grammatical structure into which elements from another language are embedded.

Located in `simple/`:

- **EN_Matrix_KO_Content.csv**: English matrix language with Korean embedded content
- **KO_Matrix_EN_Content.csv**: Korean matrix language with English embedded content

Each file contains three columns:

- `query`: The code-switched prompt
- `source`: Source identifier
- `expected_lang`: Expected response language

### Complex Setting

The Complex setting involves inter-sentential code-switching, where instruction and content languages differ, and the correct response language must be inferred from task semantics.

Located in `complex/`:

- **EN_Instruction_KO_Content.csv**: English instruction with Korean content
- **KO_Instruction_EN_Content.csv**: Korean instruction with English content

Each file contains four columns:

- `InstrFirst`: Instruction-first query
- `ContentFirst`: Content-first query
- `source`: Source identifier
- `expected_lang`: Expected response language
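
The file layout described in the card can be sketched in code. The following is a minimal Python sketch, not part of the dataset itself: the config names, file paths, and column names are copied from the card, while the helper names `setting_of` and `iter_prompts`, the sample row, and the `en` value for `expected_lang` are assumptions made for illustration (the card does not state how language values are encoded).

```python
import csv
import io

# Config name -> CSV path, copied from the card's YAML header.
CONFIGS = {
    "simple_en_matrix": "simple/EN_Matrix_KO_Content.csv",
    "simple_ko_matrix": "simple/KO_Matrix_EN_Content.csv",
    "complex_en_instr": "complex/EN_Instruction_KO_Content.csv",
    "complex_ko_instr": "complex/KO_Instruction_EN_Content.csv",
}

# Columns the card lists for each setting.
COLUMNS = {
    "simple": ["query", "source", "expected_lang"],
    "complex": ["InstrFirst", "ContentFirst", "source", "expected_lang"],
}


def setting_of(config_name):
    """Hypothetical helper: return 'simple' or 'complex' from the file's directory."""
    return CONFIGS[config_name].split("/", 1)[0]


def iter_prompts(config_name, csv_text):
    """Hypothetical helper: yield (prompt, expected_lang) pairs from CSV text.

    Simple files contribute one prompt per row (`query`); Complex files
    contribute two (`InstrFirst`, then `ContentFirst`).
    """
    setting = setting_of(config_name)
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in COLUMNS[setting] if col not in header]
    if missing:
        raise ValueError(f"{config_name}: missing columns {missing}")
    prompt_cols = ["query"] if setting == "simple" else ["InstrFirst", "ContentFirst"]
    for row in reader:
        for col in prompt_cols:
            yield row[col], row["expected_lang"]


if __name__ == "__main__":
    # Made-up row for illustration only; not taken from the dataset.
    sample = 'query,source,expected_lang\n"Please explain 김치 briefly",made_up,en\n'
    print(list(iter_prompts("simple_en_matrix", sample)))
```

With the Hugging Face `datasets` library installed, a call such as `load_dataset("lilolyhh/OLA", "simple_en_matrix")` would be the usual way to fetch one of these configs; whether the repository loads without authentication has not been checked here.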
Provided by:
lilolyhh