
nlpai-lab/ko_commongen_v2_code_switching

Hugging Face, updated 2024-08-11, indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/nlpai-lab/ko_commongen_v2_code_switching
Resource description:
---
dataset_info:
  features:
  - name: concept_set
    dtype: string
  - name: '1'
    dtype: string
  - name: '2'
    dtype: string
  - name: '3'
    dtype: string
  - name: '4'
    dtype: string
  - name: gold
    dtype: int64
  splits:
  - name: china
    num_bytes: 25567
    num_examples: 99
  - name: english
    num_bytes: 33124
    num_examples: 99
  - name: espanol
    num_bytes: 35535
    num_examples: 99
  - name: japan
    num_bytes: 33329
    num_examples: 99
  - name: korean
    num_bytes: 33419
    num_examples: 99
  download_size: 131071
  dataset_size: 160974
configs:
- config_name: default
  data_files:
  - split: china
    path: data/china-*
  - split: english
    path: data/english-*
  - split: espanol
    path: data/espanol-*
  - split: japan
    path: data/japan-*
  - split: korean
    path: data/korean-*
---

### 🇰🇷🇺🇸🇯🇵🇨🇳🇪🇸 KoCommonGEN v2 Code-switching

This KoCommonGEN v2 Code-switching dataset consists of 99 samples for numerical commonsense reasoning, which were created relying on machine translation.

The dataset can be found on Hugging Face at: [nlpai-lab/ko_commongen_v2_code_switching](https://huggingface.co/datasets/nlpai-lab/ko_commongen_v2_code_switching)

This dataset contains code-switching data for the following languages:

- Korean (korean)
- English (english)
- Japanese (japan)
- Chinese (china)
- Spanish (espanol)

(The code-switching data relies on machine translation, which may result in some inaccuracies.)

To load the dataset, you can use the following code:

```python
from datasets import load_dataset

dataset = load_dataset("nlpai-lab/ko_commongen_v2_code_switching")

# To access a specific language dataset:
korean_data = dataset['korean']
english_data = dataset['english']
# ... and so on for other languages
```
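The feature list above suggests a multiple-choice layout: a `concept_set` string, four candidate columns named `'1'` to `'4'`, and an integer `gold` label. The card does not spell out how `gold` maps to the candidates, so the sketch below assumes it is a 1-based index into those four columns. The helper names `gold_sentence` and `accuracy`, and the placeholder row, are hypothetical and used only for illustration.

```python
from typing import Mapping, Sequence

# Candidate column names as listed in the dataset card's feature schema.
CANDIDATE_KEYS = ("1", "2", "3", "4")


def gold_sentence(row: Mapping[str, object]) -> str:
    """Return the candidate text that `gold` points at.

    ASSUMPTION: `gold` is a 1-based index into columns '1'..'4'.
    The dataset card does not state this explicitly.
    """
    key = str(row["gold"])
    if key not in CANDIDATE_KEYS:
        raise ValueError(f"gold label {row['gold']!r} is outside 1..4")
    return str(row[key])


def accuracy(rows: Sequence[Mapping[str, object]],
             predictions: Sequence[int]) -> float:
    """Fraction of rows whose predicted index equals the `gold` label."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    if not rows:
        raise ValueError("cannot compute accuracy over zero rows")
    correct = sum(int(row["gold"]) == pred
                  for row, pred in zip(rows, predictions))
    return correct / len(rows)


# Hypothetical placeholder row; real rows hold code-switched sentences.
example_row = {
    "concept_set": "placeholder concepts",
    "1": "candidate A",
    "2": "candidate B",
    "3": "candidate C",
    "4": "candidate D",
    "gold": 3,
}

print(gold_sentence(example_row))
```

If the assumption holds, the same helpers should work on rows from a loaded split, for example `accuracy(list(dataset['korean']), predictions)` with one predicted index per row.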
Provider:
nlpai-lab