
cis-lmu/Taxi1500-RawData

Source: Hugging Face. Updated 2024-06-05. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/cis-lmu/Taxi1500-RawData
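The resource description in this listing is YAML front matter that has been flattened onto long lines: one `config_name` per language-script pair, each with a single `taxi1500` split and an `.arrow` path glob. The sketch below shows one way to recover the entries from such flattened text. It is illustrative only: the `FLATTENED` excerpt is copied from the listing, while the names `ENTRY`, `parse_configs`, and `script_counts` are hypothetical helpers and not part of the dataset or any library.

```python
import re
from collections import Counter

# Short excerpt copied from the flattened front matter in this listing.
FLATTENED = (
    "configs: - config_name: aai_Latn data_files: - split: taxi1500 "
    "path: aai_Latn/taxi1500/*.arrow - config_name: ahr_Deva data_files: "
    "- split: taxi1500 path: ahr_Deva/taxi1500/*.arrow - config_name: "
    "aii_Syrc data_files: - split: taxi1500 path: aii_Syrc/taxi1500/*.arrow"
)

# One entry = config name, split name, and path glob, in that order.
# (Hypothetical pattern, written for the layout seen in this listing.)
ENTRY = re.compile(
    r"config_name:\s*(?P<name>\S+)\s+"
    r"data_files:\s*-\s*split:\s*(?P<split>\S+)\s+"
    r"path:\s*(?P<path>\S+)"
)


def parse_configs(text):
    """Return a list of dicts with 'name', 'split', and 'path' keys.

    Entries that are cut off before the path value are skipped.
    """
    return [match.groupdict() for match in ENTRY.finditer(text)]


def script_counts(configs):
    """Count configs per script code (the part after the last underscore)."""
    return Counter(c["name"].rpartition("_")[2] for c in configs)


configs = parse_configs(FLATTENED)
for c in configs:
    print(c["name"], c["split"], c["path"])
print(dict(script_counts(configs)))
```

With the Hugging Face `datasets` library, a config name is normally passed as the second argument to `load_dataset`, for example `load_dataset("cis-lmu/Taxi1500-RawData", "eng_Latn", split="taxi1500")`. That call needs network access and has not been verified against this repository here.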
Resource description:
--- configs: - config_name: aai_Latn data_files: - split: taxi1500 path: aai_Latn/taxi1500/*.arrow - config_name: aak_Latn data_files: - split: taxi1500 path: aak_Latn/taxi1500/*.arrow - config_name: aau_Latn data_files: - split: taxi1500 path: aau_Latn/taxi1500/*.arrow - config_name: aaz_Latn data_files: - split: taxi1500 path: aaz_Latn/taxi1500/*.arrow - config_name: abt_Latn data_files: - split: taxi1500 path: abt_Latn/taxi1500/*.arrow - config_name: abx_Latn data_files: - split: taxi1500 path: abx_Latn/taxi1500/*.arrow - config_name: aby_Latn data_files: - split: taxi1500 path: aby_Latn/taxi1500/*.arrow - config_name: acf_Latn data_files: - split: taxi1500 path: acf_Latn/taxi1500/*.arrow - config_name: acr_Latn data_files: - split: taxi1500 path: acr_Latn/taxi1500/*.arrow - config_name: acu_Latn data_files: - split: taxi1500 path: acu_Latn/taxi1500/*.arrow - config_name: adt_Latn data_files: - split: taxi1500 path: adt_Latn/taxi1500/*.arrow - config_name: adz_Latn data_files: - split: taxi1500 path: adz_Latn/taxi1500/*.arrow - config_name: aer_Latn data_files: - split: taxi1500 path: aer_Latn/taxi1500/*.arrow - config_name: aey_Latn data_files: - split: taxi1500 path: aey_Latn/taxi1500/*.arrow - config_name: agd_Latn data_files: - split: taxi1500 path: agd_Latn/taxi1500/*.arrow - config_name: agg_Latn data_files: - split: taxi1500 path: agg_Latn/taxi1500/*.arrow - config_name: agm_Latn data_files: - split: taxi1500 path: agm_Latn/taxi1500/*.arrow - config_name: agn_Latn data_files: - split: taxi1500 path: agn_Latn/taxi1500/*.arrow - config_name: agr_Latn data_files: - split: taxi1500 path: agr_Latn/taxi1500/*.arrow - config_name: agt_Latn data_files: - split: taxi1500 path: agt_Latn/taxi1500/*.arrow - config_name: agu_Latn data_files: - split: taxi1500 path: agu_Latn/taxi1500/*.arrow - config_name: ahr_Deva data_files: - split: taxi1500 path: ahr_Deva/taxi1500/*.arrow - config_name: aia_Latn data_files: - split: taxi1500 path: aia_Latn/taxi1500/*.arrow - 
config_name: aii_Syrc data_files: - split: taxi1500 path: aii_Syrc/taxi1500/*.arrow - config_name: aka_Latn data_files: - split: taxi1500 path: aka_Latn/taxi1500/*.arrow - config_name: ake_Latn data_files: - split: taxi1500 path: ake_Latn/taxi1500/*.arrow - config_name: akh_Latn data_files: - split: taxi1500 path: akh_Latn/taxi1500/*.arrow - config_name: aln_Latn data_files: - split: taxi1500 path: aln_Latn/taxi1500/*.arrow - config_name: alp_Latn data_files: - split: taxi1500 path: alp_Latn/taxi1500/*.arrow - config_name: alq_Latn data_files: - split: taxi1500 path: alq_Latn/taxi1500/*.arrow - config_name: als_Latn data_files: - split: taxi1500 path: als_Latn/taxi1500/*.arrow - config_name: aly_Latn data_files: - split: taxi1500 path: aly_Latn/taxi1500/*.arrow - config_name: ame_Latn data_files: - split: taxi1500 path: ame_Latn/taxi1500/*.arrow - config_name: amf_Latn data_files: - split: taxi1500 path: amf_Latn/taxi1500/*.arrow - config_name: amk_Latn data_files: - split: taxi1500 path: amk_Latn/taxi1500/*.arrow - config_name: amm_Latn data_files: - split: taxi1500 path: amm_Latn/taxi1500/*.arrow - config_name: amn_Latn data_files: - split: taxi1500 path: amn_Latn/taxi1500/*.arrow - config_name: amo_Latn data_files: - split: taxi1500 path: amo_Latn/taxi1500/*.arrow - config_name: amp_Latn data_files: - split: taxi1500 path: amp_Latn/taxi1500/*.arrow - config_name: amr_Latn data_files: - split: taxi1500 path: amr_Latn/taxi1500/*.arrow - config_name: amu_Latn data_files: - split: taxi1500 path: amu_Latn/taxi1500/*.arrow - config_name: amx_Latn data_files: - split: taxi1500 path: amx_Latn/taxi1500/*.arrow - config_name: anh_Latn data_files: - split: taxi1500 path: anh_Latn/taxi1500/*.arrow - config_name: anv_Latn data_files: - split: taxi1500 path: anv_Latn/taxi1500/*.arrow - config_name: aoi_Latn data_files: - split: taxi1500 path: aoi_Latn/taxi1500/*.arrow - config_name: aoj_Latn data_files: - split: taxi1500 path: aoj_Latn/taxi1500/*.arrow - config_name: aom_Latn 
data_files: - split: taxi1500 path: aom_Latn/taxi1500/*.arrow - config_name: aon_Latn data_files: - split: taxi1500 path: aon_Latn/taxi1500/*.arrow - config_name: apb_Latn data_files: - split: taxi1500 path: apb_Latn/taxi1500/*.arrow - config_name: ape_Latn data_files: - split: taxi1500 path: ape_Latn/taxi1500/*.arrow - config_name: apn_Latn data_files: - split: taxi1500 path: apn_Latn/taxi1500/*.arrow - config_name: apr_Latn data_files: - split: taxi1500 path: apr_Latn/taxi1500/*.arrow - config_name: apu_Latn data_files: - split: taxi1500 path: apu_Latn/taxi1500/*.arrow - config_name: apw_Latn data_files: - split: taxi1500 path: apw_Latn/taxi1500/*.arrow - config_name: apy_Latn data_files: - split: taxi1500 path: apy_Latn/taxi1500/*.arrow - config_name: apz_Latn data_files: - split: taxi1500 path: apz_Latn/taxi1500/*.arrow - config_name: arb_Arab data_files: - split: taxi1500 path: arb_Arab/taxi1500/*.arrow - config_name: are_Latn data_files: - split: taxi1500 path: are_Latn/taxi1500/*.arrow - config_name: arl_Latn data_files: - split: taxi1500 path: arl_Latn/taxi1500/*.arrow - config_name: arn_Latn data_files: - split: taxi1500 path: arn_Latn/taxi1500/*.arrow - config_name: arp_Latn data_files: - split: taxi1500 path: arp_Latn/taxi1500/*.arrow - config_name: arz_Arab data_files: - split: taxi1500 path: arz_Arab/taxi1500/*.arrow - config_name: asm_Beng data_files: - split: taxi1500 path: asm_Beng/taxi1500/*.arrow - config_name: aso_Latn data_files: - split: taxi1500 path: aso_Latn/taxi1500/*.arrow - config_name: ata_Latn data_files: - split: taxi1500 path: ata_Latn/taxi1500/*.arrow - config_name: atb_Latn data_files: - split: taxi1500 path: atb_Latn/taxi1500/*.arrow - config_name: atd_Latn data_files: - split: taxi1500 path: atd_Latn/taxi1500/*.arrow - config_name: atg_Latn data_files: - split: taxi1500 path: atg_Latn/taxi1500/*.arrow - config_name: att_Latn data_files: - split: taxi1500 path: att_Latn/taxi1500/*.arrow - config_name: auc_Latn data_files: - split: 
taxi1500 path: auc_Latn/taxi1500/*.arrow - config_name: aui_Latn data_files: - split: taxi1500 path: aui_Latn/taxi1500/*.arrow - config_name: auy_Latn data_files: - split: taxi1500 path: auy_Latn/taxi1500/*.arrow - config_name: avt_Latn data_files: - split: taxi1500 path: avt_Latn/taxi1500/*.arrow - config_name: awb_Latn data_files: - split: taxi1500 path: awb_Latn/taxi1500/*.arrow - config_name: awk_Latn data_files: - split: taxi1500 path: awk_Latn/taxi1500/*.arrow - config_name: awx_Latn data_files: - split: taxi1500 path: awx_Latn/taxi1500/*.arrow - config_name: azb_Latn data_files: - split: taxi1500 path: azb_Latn/taxi1500/*.arrow - config_name: aze_Latn data_files: - split: taxi1500 path: aze_Latn/taxi1500/*.arrow - config_name: azg_Latn data_files: - split: taxi1500 path: azg_Latn/taxi1500/*.arrow - config_name: azz_Latn data_files: - split: taxi1500 path: azz_Latn/taxi1500/*.arrow - config_name: bao_Latn data_files: - split: taxi1500 path: bao_Latn/taxi1500/*.arrow - config_name: bba_Latn data_files: - split: taxi1500 path: bba_Latn/taxi1500/*.arrow - config_name: bbb_Latn data_files: - split: taxi1500 path: bbb_Latn/taxi1500/*.arrow - config_name: bbr_Latn data_files: - split: taxi1500 path: bbr_Latn/taxi1500/*.arrow - config_name: bch_Latn data_files: - split: taxi1500 path: bch_Latn/taxi1500/*.arrow - config_name: bco_Latn data_files: - split: taxi1500 path: bco_Latn/taxi1500/*.arrow - config_name: bdd_Latn data_files: - split: taxi1500 path: bdd_Latn/taxi1500/*.arrow - config_name: bdv_Orya data_files: - split: taxi1500 path: bdv_Orya/taxi1500/*.arrow - config_name: bea_Latn data_files: - split: taxi1500 path: bea_Latn/taxi1500/*.arrow - config_name: bef_Latn data_files: - split: taxi1500 path: bef_Latn/taxi1500/*.arrow - config_name: ben_Beng data_files: - split: taxi1500 path: ben_Beng/taxi1500/*.arrow - config_name: beo_Latn data_files: - split: taxi1500 path: beo_Latn/taxi1500/*.arrow - config_name: beu_Latn data_files: - split: taxi1500 path: 
beu_Latn/taxi1500/*.arrow - config_name: bfz_Deva data_files: - split: taxi1500 path: bfz_Deva/taxi1500/*.arrow - config_name: bgc_Deva data_files: - split: taxi1500 path: bgc_Deva/taxi1500/*.arrow - config_name: bgg_Latn data_files: - split: taxi1500 path: bgg_Latn/taxi1500/*.arrow - config_name: bgs_Latn data_files: - split: taxi1500 path: bgs_Latn/taxi1500/*.arrow - config_name: bgt_Latn data_files: - split: taxi1500 path: bgt_Latn/taxi1500/*.arrow - config_name: bhd_Deva data_files: - split: taxi1500 path: bhd_Deva/taxi1500/*.arrow - config_name: bhg_Latn data_files: - split: taxi1500 path: bhg_Latn/taxi1500/*.arrow - config_name: bhl_Latn data_files: - split: taxi1500 path: bhl_Latn/taxi1500/*.arrow - config_name: bht_Deva data_files: - split: taxi1500 path: bht_Deva/taxi1500/*.arrow - config_name: big_Latn data_files: - split: taxi1500 path: big_Latn/taxi1500/*.arrow - config_name: bjk_Latn data_files: - split: taxi1500 path: bjk_Latn/taxi1500/*.arrow - config_name: bjp_Latn data_files: - split: taxi1500 path: bjp_Latn/taxi1500/*.arrow - config_name: bjr_Latn data_files: - split: taxi1500 path: bjr_Latn/taxi1500/*.arrow - config_name: bjv_Latn data_files: - split: taxi1500 path: bjv_Latn/taxi1500/*.arrow - config_name: bjz_Latn data_files: - split: taxi1500 path: bjz_Latn/taxi1500/*.arrow - config_name: bkd_Latn data_files: - split: taxi1500 path: bkd_Latn/taxi1500/*.arrow - config_name: bki_Latn data_files: - split: taxi1500 path: bki_Latn/taxi1500/*.arrow - config_name: bkq_Latn data_files: - split: taxi1500 path: bkq_Latn/taxi1500/*.arrow - config_name: bkx_Latn data_files: - split: taxi1500 path: bkx_Latn/taxi1500/*.arrow - config_name: bla_Latn data_files: - split: taxi1500 path: bla_Latn/taxi1500/*.arrow - config_name: blw_Latn data_files: - split: taxi1500 path: blw_Latn/taxi1500/*.arrow - config_name: blz_Latn data_files: - split: taxi1500 path: blz_Latn/taxi1500/*.arrow - config_name: bmh_Latn data_files: - split: taxi1500 path: 
bmh_Latn/taxi1500/*.arrow - config_name: bmk_Latn data_files: - split: taxi1500 path: bmk_Latn/taxi1500/*.arrow - config_name: bmr_Latn data_files: - split: taxi1500 path: bmr_Latn/taxi1500/*.arrow - config_name: bmu_Latn data_files: - split: taxi1500 path: bmu_Latn/taxi1500/*.arrow - config_name: bnp_Latn data_files: - split: taxi1500 path: bnp_Latn/taxi1500/*.arrow - config_name: boa_Latn data_files: - split: taxi1500 path: boa_Latn/taxi1500/*.arrow - config_name: bod_Tibt data_files: - split: taxi1500 path: bod_Tibt/taxi1500/*.arrow - config_name: boj_Latn data_files: - split: taxi1500 path: boj_Latn/taxi1500/*.arrow - config_name: bon_Latn data_files: - split: taxi1500 path: bon_Latn/taxi1500/*.arrow - config_name: box_Latn data_files: - split: taxi1500 path: box_Latn/taxi1500/*.arrow - config_name: bpr_Latn data_files: - split: taxi1500 path: bpr_Latn/taxi1500/*.arrow - config_name: bps_Latn data_files: - split: taxi1500 path: bps_Latn/taxi1500/*.arrow - config_name: bpx_Deva data_files: - split: taxi1500 path: bpx_Deva/taxi1500/*.arrow - config_name: bqc_Latn data_files: - split: taxi1500 path: bqc_Latn/taxi1500/*.arrow - config_name: bqp_Latn data_files: - split: taxi1500 path: bqp_Latn/taxi1500/*.arrow - config_name: bre_Latn data_files: - split: taxi1500 path: bre_Latn/taxi1500/*.arrow - config_name: bsj_Latn data_files: - split: taxi1500 path: bsj_Latn/taxi1500/*.arrow - config_name: bsn_Latn data_files: - split: taxi1500 path: bsn_Latn/taxi1500/*.arrow - config_name: bsp_Latn data_files: - split: taxi1500 path: bsp_Latn/taxi1500/*.arrow - config_name: bss_Latn data_files: - split: taxi1500 path: bss_Latn/taxi1500/*.arrow - config_name: btt_Latn data_files: - split: taxi1500 path: btt_Latn/taxi1500/*.arrow - config_name: buk_Latn data_files: - split: taxi1500 path: buk_Latn/taxi1500/*.arrow - config_name: bus_Latn data_files: - split: taxi1500 path: bus_Latn/taxi1500/*.arrow - config_name: bvd_Latn data_files: - split: taxi1500 path: 
bvd_Latn/taxi1500/*.arrow - config_name: bvr_Latn data_files: - split: taxi1500 path: bvr_Latn/taxi1500/*.arrow - config_name: bwo_Latn data_files: - split: taxi1500 path: bwo_Latn/taxi1500/*.arrow - config_name: bxh_Latn data_files: - split: taxi1500 path: bxh_Latn/taxi1500/*.arrow - config_name: byr_Latn data_files: - split: taxi1500 path: byr_Latn/taxi1500/*.arrow - config_name: byx_Latn data_files: - split: taxi1500 path: byx_Latn/taxi1500/*.arrow - config_name: bzd_Latn data_files: - split: taxi1500 path: bzd_Latn/taxi1500/*.arrow - config_name: bzh_Latn data_files: - split: taxi1500 path: bzh_Latn/taxi1500/*.arrow - config_name: bzj_Latn data_files: - split: taxi1500 path: bzj_Latn/taxi1500/*.arrow - config_name: caa_Latn data_files: - split: taxi1500 path: caa_Latn/taxi1500/*.arrow - config_name: cab_Latn data_files: - split: taxi1500 path: cab_Latn/taxi1500/*.arrow - config_name: cac_Latn data_files: - split: taxi1500 path: cac_Latn/taxi1500/*.arrow - config_name: caf_Latn data_files: - split: taxi1500 path: caf_Latn/taxi1500/*.arrow - config_name: cak_Latn data_files: - split: taxi1500 path: cak_Latn/taxi1500/*.arrow - config_name: cao_Latn data_files: - split: taxi1500 path: cao_Latn/taxi1500/*.arrow - config_name: cap_Latn data_files: - split: taxi1500 path: cap_Latn/taxi1500/*.arrow - config_name: car_Latn data_files: - split: taxi1500 path: car_Latn/taxi1500/*.arrow - config_name: cav_Latn data_files: - split: taxi1500 path: cav_Latn/taxi1500/*.arrow - config_name: cax_Latn data_files: - split: taxi1500 path: cax_Latn/taxi1500/*.arrow - config_name: cbc_Latn data_files: - split: taxi1500 path: cbc_Latn/taxi1500/*.arrow - config_name: cbi_Latn data_files: - split: taxi1500 path: cbi_Latn/taxi1500/*.arrow - config_name: cbk_Latn data_files: - split: taxi1500 path: cbk_Latn/taxi1500/*.arrow - config_name: cbr_Latn data_files: - split: taxi1500 path: cbr_Latn/taxi1500/*.arrow - config_name: cbs_Latn data_files: - split: taxi1500 path: 
cbs_Latn/taxi1500/*.arrow - config_name: cbt_Latn data_files: - split: taxi1500 path: cbt_Latn/taxi1500/*.arrow - config_name: cbu_Latn data_files: - split: taxi1500 path: cbu_Latn/taxi1500/*.arrow - config_name: cbv_Latn data_files: - split: taxi1500 path: cbv_Latn/taxi1500/*.arrow - config_name: cco_Latn data_files: - split: taxi1500 path: cco_Latn/taxi1500/*.arrow - config_name: ceb_Latn data_files: - split: taxi1500 path: ceb_Latn/taxi1500/*.arrow - config_name: ceg_Latn data_files: - split: taxi1500 path: ceg_Latn/taxi1500/*.arrow - config_name: cek_Latn data_files: - split: taxi1500 path: cek_Latn/taxi1500/*.arrow - config_name: ces_Latn data_files: - split: taxi1500 path: ces_Latn/taxi1500/*.arrow - config_name: cgc_Latn data_files: - split: taxi1500 path: cgc_Latn/taxi1500/*.arrow - config_name: cha_Latn data_files: - split: taxi1500 path: cha_Latn/taxi1500/*.arrow - config_name: chd_Latn data_files: - split: taxi1500 path: chd_Latn/taxi1500/*.arrow - config_name: chf_Latn data_files: - split: taxi1500 path: chf_Latn/taxi1500/*.arrow - config_name: chk_Latn data_files: - split: taxi1500 path: chk_Latn/taxi1500/*.arrow - config_name: chq_Latn data_files: - split: taxi1500 path: chq_Latn/taxi1500/*.arrow - config_name: chz_Latn data_files: - split: taxi1500 path: chz_Latn/taxi1500/*.arrow - config_name: cjo_Latn data_files: - split: taxi1500 path: cjo_Latn/taxi1500/*.arrow - config_name: cjv_Latn data_files: - split: taxi1500 path: cjv_Latn/taxi1500/*.arrow - config_name: ckb_Arab data_files: - split: taxi1500 path: ckb_Arab/taxi1500/*.arrow - config_name: cle_Latn data_files: - split: taxi1500 path: cle_Latn/taxi1500/*.arrow - config_name: clu_Latn data_files: - split: taxi1500 path: clu_Latn/taxi1500/*.arrow - config_name: cme_Latn data_files: - split: taxi1500 path: cme_Latn/taxi1500/*.arrow - config_name: cmn_Hani data_files: - split: taxi1500 path: cmn_Hani/taxi1500/*.arrow - config_name: cni_Latn data_files: - split: taxi1500 path: 
cni_Latn/taxi1500/*.arrow - config_name: cnl_Latn data_files: - split: taxi1500 path: cnl_Latn/taxi1500/*.arrow - config_name: cnt_Latn data_files: - split: taxi1500 path: cnt_Latn/taxi1500/*.arrow - config_name: coe_Latn data_files: - split: taxi1500 path: coe_Latn/taxi1500/*.arrow - config_name: cof_Latn data_files: - split: taxi1500 path: cof_Latn/taxi1500/*.arrow - config_name: con_Latn data_files: - split: taxi1500 path: con_Latn/taxi1500/*.arrow - config_name: cop_Copt data_files: - split: taxi1500 path: cop_Copt/taxi1500/*.arrow - config_name: cot_Latn data_files: - split: taxi1500 path: cot_Latn/taxi1500/*.arrow - config_name: cpa_Latn data_files: - split: taxi1500 path: cpa_Latn/taxi1500/*.arrow - config_name: cpb_Latn data_files: - split: taxi1500 path: cpb_Latn/taxi1500/*.arrow - config_name: cpc_Latn data_files: - split: taxi1500 path: cpc_Latn/taxi1500/*.arrow - config_name: cpu_Latn data_files: - split: taxi1500 path: cpu_Latn/taxi1500/*.arrow - config_name: cpy_Latn data_files: - split: taxi1500 path: cpy_Latn/taxi1500/*.arrow - config_name: crn_Latn data_files: - split: taxi1500 path: crn_Latn/taxi1500/*.arrow - config_name: crx_Latn data_files: - split: taxi1500 path: crx_Latn/taxi1500/*.arrow - config_name: cso_Latn data_files: - split: taxi1500 path: cso_Latn/taxi1500/*.arrow - config_name: csy_Latn data_files: - split: taxi1500 path: csy_Latn/taxi1500/*.arrow - config_name: cta_Latn data_files: - split: taxi1500 path: cta_Latn/taxi1500/*.arrow - config_name: cth_Latn data_files: - split: taxi1500 path: cth_Latn/taxi1500/*.arrow - config_name: ctp_Latn data_files: - split: taxi1500 path: ctp_Latn/taxi1500/*.arrow - config_name: ctu_Latn data_files: - split: taxi1500 path: ctu_Latn/taxi1500/*.arrow - config_name: cub_Latn data_files: - split: taxi1500 path: cub_Latn/taxi1500/*.arrow - config_name: cuc_Latn data_files: - split: taxi1500 path: cuc_Latn/taxi1500/*.arrow - config_name: cui_Latn data_files: - split: taxi1500 path: 
cui_Latn/taxi1500/*.arrow - config_name: cuk_Latn data_files: - split: taxi1500 path: cuk_Latn/taxi1500/*.arrow - config_name: cut_Latn data_files: - split: taxi1500 path: cut_Latn/taxi1500/*.arrow - config_name: cux_Latn data_files: - split: taxi1500 path: cux_Latn/taxi1500/*.arrow - config_name: cwe_Latn data_files: - split: taxi1500 path: cwe_Latn/taxi1500/*.arrow - config_name: cya_Latn data_files: - split: taxi1500 path: cya_Latn/taxi1500/*.arrow - config_name: cym_Latn data_files: - split: taxi1500 path: cym_Latn/taxi1500/*.arrow - config_name: daa_Latn data_files: - split: taxi1500 path: daa_Latn/taxi1500/*.arrow - config_name: dad_Latn data_files: - split: taxi1500 path: dad_Latn/taxi1500/*.arrow - config_name: dah_Latn data_files: - split: taxi1500 path: dah_Latn/taxi1500/*.arrow - config_name: dak_Latn data_files: - split: taxi1500 path: dak_Latn/taxi1500/*.arrow - config_name: dan_Latn data_files: - split: taxi1500 path: dan_Latn/taxi1500/*.arrow - config_name: dao_Latn data_files: - split: taxi1500 path: dao_Latn/taxi1500/*.arrow - config_name: ded_Latn data_files: - split: taxi1500 path: ded_Latn/taxi1500/*.arrow - config_name: deu_Latn data_files: - split: taxi1500 path: deu_Latn/taxi1500/*.arrow - config_name: dgc_Latn data_files: - split: taxi1500 path: dgc_Latn/taxi1500/*.arrow - config_name: dgr_Latn data_files: - split: taxi1500 path: dgr_Latn/taxi1500/*.arrow - config_name: dgz_Latn data_files: - split: taxi1500 path: dgz_Latn/taxi1500/*.arrow - config_name: dhg_Latn data_files: - split: taxi1500 path: dhg_Latn/taxi1500/*.arrow - config_name: dif_Latn data_files: - split: taxi1500 path: dif_Latn/taxi1500/*.arrow - config_name: dik_Latn data_files: - split: taxi1500 path: dik_Latn/taxi1500/*.arrow - config_name: dji_Latn data_files: - split: taxi1500 path: dji_Latn/taxi1500/*.arrow - config_name: djj_Latn data_files: - split: taxi1500 path: djj_Latn/taxi1500/*.arrow - config_name: djk_Latn data_files: - split: taxi1500 path: 
djk_Latn/taxi1500/*.arrow - config_name: djr_Latn data_files: - split: taxi1500 path: djr_Latn/taxi1500/*.arrow - config_name: dob_Latn data_files: - split: taxi1500 path: dob_Latn/taxi1500/*.arrow - config_name: dop_Latn data_files: - split: taxi1500 path: dop_Latn/taxi1500/*.arrow - config_name: dov_Latn data_files: - split: taxi1500 path: dov_Latn/taxi1500/*.arrow - config_name: dso_Orya data_files: - split: taxi1500 path: dso_Orya/taxi1500/*.arrow - config_name: dwr_Ethi data_files: - split: taxi1500 path: dwr_Ethi/taxi1500/*.arrow - config_name: dwr_Latn data_files: - split: taxi1500 path: dwr_Latn/taxi1500/*.arrow - config_name: dwu_Latn data_files: - split: taxi1500 path: dwu_Latn/taxi1500/*.arrow - config_name: dww_Latn data_files: - split: taxi1500 path: dww_Latn/taxi1500/*.arrow - config_name: dwy_Latn data_files: - split: taxi1500 path: dwy_Latn/taxi1500/*.arrow - config_name: ebk_Latn data_files: - split: taxi1500 path: ebk_Latn/taxi1500/*.arrow - config_name: ekk_Latn data_files: - split: taxi1500 path: ekk_Latn/taxi1500/*.arrow - config_name: eko_Latn data_files: - split: taxi1500 path: eko_Latn/taxi1500/*.arrow - config_name: emi_Latn data_files: - split: taxi1500 path: emi_Latn/taxi1500/*.arrow - config_name: emp_Latn data_files: - split: taxi1500 path: emp_Latn/taxi1500/*.arrow - config_name: ena_Latn data_files: - split: taxi1500 path: ena_Latn/taxi1500/*.arrow - config_name: eng_Latn data_files: - split: taxi1500 path: eng_Latn/taxi1500/*.arrow - config_name: enm_Latn data_files: - split: taxi1500 path: enm_Latn/taxi1500/*.arrow - config_name: enq_Latn data_files: - split: taxi1500 path: enq_Latn/taxi1500/*.arrow - config_name: epo_Latn data_files: - split: taxi1500 path: epo_Latn/taxi1500/*.arrow - config_name: eri_Latn data_files: - split: taxi1500 path: eri_Latn/taxi1500/*.arrow - config_name: ese_Latn data_files: - split: taxi1500 path: ese_Latn/taxi1500/*.arrow - config_name: esk_Latn data_files: - split: taxi1500 path: 
esk_Latn/taxi1500/*.arrow - config_name: etr_Latn data_files: - split: taxi1500 path: etr_Latn/taxi1500/*.arrow - config_name: eus_Latn data_files: - split: taxi1500 path: eus_Latn/taxi1500/*.arrow - config_name: ewe_Latn data_files: - split: taxi1500 path: ewe_Latn/taxi1500/*.arrow - config_name: faa_Latn data_files: - split: taxi1500 path: faa_Latn/taxi1500/*.arrow - config_name: fai_Latn data_files: - split: taxi1500 path: fai_Latn/taxi1500/*.arrow - config_name: far_Latn data_files: - split: taxi1500 path: far_Latn/taxi1500/*.arrow - config_name: ffm_Latn data_files: - split: taxi1500 path: ffm_Latn/taxi1500/*.arrow - config_name: fil_Latn data_files: - split: taxi1500 path: fil_Latn/taxi1500/*.arrow - config_name: fin_Latn data_files: - split: taxi1500 path: fin_Latn/taxi1500/*.arrow - config_name: for_Latn data_files: - split: taxi1500 path: for_Latn/taxi1500/*.arrow - config_name: fra_Latn data_files: - split: taxi1500 path: fra_Latn/taxi1500/*.arrow - config_name: fue_Latn data_files: - split: taxi1500 path: fue_Latn/taxi1500/*.arrow - config_name: fuf_Latn data_files: - split: taxi1500 path: fuf_Latn/taxi1500/*.arrow - config_name: fuh_Latn data_files: - split: taxi1500 path: fuh_Latn/taxi1500/*.arrow - config_name: gah_Latn data_files: - split: taxi1500 path: gah_Latn/taxi1500/*.arrow - config_name: gai_Latn data_files: - split: taxi1500 path: gai_Latn/taxi1500/*.arrow - config_name: gam_Latn data_files: - split: taxi1500 path: gam_Latn/taxi1500/*.arrow - config_name: gaq_Orya data_files: - split: taxi1500 path: gaq_Orya/taxi1500/*.arrow - config_name: gaw_Latn data_files: - split: taxi1500 path: gaw_Latn/taxi1500/*.arrow - config_name: gaz_Latn data_files: - split: taxi1500 path: gaz_Latn/taxi1500/*.arrow - config_name: gdn_Latn data_files: - split: taxi1500 path: gdn_Latn/taxi1500/*.arrow - config_name: gdr_Latn data_files: - split: taxi1500 path: gdr_Latn/taxi1500/*.arrow - config_name: geb_Latn data_files: - split: taxi1500 path: 
geb_Latn/taxi1500/*.arrow - config_name: gfk_Latn data_files: - split: taxi1500 path: gfk_Latn/taxi1500/*.arrow - config_name: ghs_Latn data_files: - split: taxi1500 path: ghs_Latn/taxi1500/*.arrow - config_name: gia_Latn data_files: - split: taxi1500 path: gia_Latn/taxi1500/*.arrow - config_name: gla_Latn data_files: - split: taxi1500 path: gla_Latn/taxi1500/*.arrow - config_name: glk_Arab data_files: - split: taxi1500 path: glk_Arab/taxi1500/*.arrow - config_name: glv_Latn data_files: - split: taxi1500 path: glv_Latn/taxi1500/*.arrow - config_name: gmv_Ethi data_files: - split: taxi1500 path: gmv_Ethi/taxi1500/*.arrow - config_name: gmv_Latn data_files: - split: taxi1500 path: gmv_Latn/taxi1500/*.arrow - config_name: gng_Latn data_files: - split: taxi1500 path: gng_Latn/taxi1500/*.arrow - config_name: gnn_Latn data_files: - split: taxi1500 path: gnn_Latn/taxi1500/*.arrow - config_name: gnw_Latn data_files: - split: taxi1500 path: gnw_Latn/taxi1500/*.arrow - config_name: gof_Ethi data_files: - split: taxi1500 path: gof_Ethi/taxi1500/*.arrow - config_name: gof_Latn data_files: - split: taxi1500 path: gof_Latn/taxi1500/*.arrow - config_name: got_Latn data_files: - split: taxi1500 path: got_Latn/taxi1500/*.arrow - config_name: gqr_Latn data_files: - split: taxi1500 path: gqr_Latn/taxi1500/*.arrow - config_name: grc_Grek data_files: - split: taxi1500 path: grc_Grek/taxi1500/*.arrow - config_name: gub_Latn data_files: - split: taxi1500 path: gub_Latn/taxi1500/*.arrow - config_name: guc_Latn data_files: - split: taxi1500 path: guc_Latn/taxi1500/*.arrow - config_name: gue_Latn data_files: - split: taxi1500 path: gue_Latn/taxi1500/*.arrow - config_name: guh_Latn data_files: - split: taxi1500 path: guh_Latn/taxi1500/*.arrow - config_name: gui_Latn data_files: - split: taxi1500 path: gui_Latn/taxi1500/*.arrow - config_name: guj_Gujr data_files: - split: taxi1500 path: guj_Gujr/taxi1500/*.arrow - config_name: gul_Latn data_files: - split: taxi1500 path: 
gul_Latn/taxi1500/*.arrow - config_name: gum_Latn data_files: - split: taxi1500 path: gum_Latn/taxi1500/*.arrow - config_name: gun_Latn data_files: - split: taxi1500 path: gun_Latn/taxi1500/*.arrow - config_name: guo_Latn data_files: - split: taxi1500 path: guo_Latn/taxi1500/*.arrow - config_name: gup_Latn data_files: - split: taxi1500 path: gup_Latn/taxi1500/*.arrow - config_name: gux_Latn data_files: - split: taxi1500 path: gux_Latn/taxi1500/*.arrow - config_name: gvc_Latn data_files: - split: taxi1500 path: gvc_Latn/taxi1500/*.arrow - config_name: gvf_Latn data_files: - split: taxi1500 path: gvf_Latn/taxi1500/*.arrow - config_name: gvn_Latn data_files: - split: taxi1500 path: gvn_Latn/taxi1500/*.arrow - config_name: gvs_Latn data_files: - split: taxi1500 path: gvs_Latn/taxi1500/*.arrow - config_name: gwi_Latn data_files: - split: taxi1500 path: gwi_Latn/taxi1500/*.arrow - config_name: gym_Latn data_files: - split: taxi1500 path: gym_Latn/taxi1500/*.arrow - config_name: gyr_Latn data_files: - split: taxi1500 path: gyr_Latn/taxi1500/*.arrow - config_name: hat_Latn data_files: - split: taxi1500 path: hat_Latn/taxi1500/*.arrow - config_name: hau_Latn data_files: - split: taxi1500 path: hau_Latn/taxi1500/*.arrow - config_name: haw_Latn data_files: - split: taxi1500 path: haw_Latn/taxi1500/*.arrow - config_name: hbo_Hebr data_files: - split: taxi1500 path: hbo_Hebr/taxi1500/*.arrow - config_name: hch_Latn data_files: - split: taxi1500 path: hch_Latn/taxi1500/*.arrow - config_name: heb_Hebr data_files: - split: taxi1500 path: heb_Hebr/taxi1500/*.arrow - config_name: heg_Latn data_files: - split: taxi1500 path: heg_Latn/taxi1500/*.arrow - config_name: hin_Deva data_files: - split: taxi1500 path: hin_Deva/taxi1500/*.arrow - config_name: hix_Latn data_files: - split: taxi1500 path: hix_Latn/taxi1500/*.arrow - config_name: hla_Latn data_files: - split: taxi1500 path: hla_Latn/taxi1500/*.arrow - config_name: hlt_Latn data_files: - split: taxi1500 path: 
hlt_Latn/taxi1500/*.arrow - config_name: hmo_Latn data_files: - split: taxi1500 path: hmo_Latn/taxi1500/*.arrow - config_name: hns_Latn data_files: - split: taxi1500 path: hns_Latn/taxi1500/*.arrow - config_name: hop_Latn data_files: - split: taxi1500 path: hop_Latn/taxi1500/*.arrow - config_name: hot_Latn data_files: - split: taxi1500 path: hot_Latn/taxi1500/*.arrow - config_name: hoy_Deva data_files: - split: taxi1500 path: hoy_Deva/taxi1500/*.arrow - config_name: hrv_Latn data_files: - split: taxi1500 path: hrv_Latn/taxi1500/*.arrow - config_name: hto_Latn data_files: - split: taxi1500 path: hto_Latn/taxi1500/*.arrow - config_name: hub_Latn data_files: - split: taxi1500 path: hub_Latn/taxi1500/*.arrow - config_name: hui_Latn data_files: - split: taxi1500 path: hui_Latn/taxi1500/*.arrow - config_name: hun_Latn data_files: - split: taxi1500 path: hun_Latn/taxi1500/*.arrow - config_name: hus_Latn data_files: - split: taxi1500 path: hus_Latn/taxi1500/*.arrow - config_name: huu_Latn data_files: - split: taxi1500 path: huu_Latn/taxi1500/*.arrow - config_name: huv_Latn data_files: - split: taxi1500 path: huv_Latn/taxi1500/*.arrow - config_name: hvn_Latn data_files: - split: taxi1500 path: hvn_Latn/taxi1500/*.arrow - config_name: hwc_Latn data_files: - split: taxi1500 path: hwc_Latn/taxi1500/*.arrow - config_name: ian_Latn data_files: - split: taxi1500 path: ian_Latn/taxi1500/*.arrow - config_name: ibo_Latn data_files: - split: taxi1500 path: ibo_Latn/taxi1500/*.arrow - config_name: ign_Latn data_files: - split: taxi1500 path: ign_Latn/taxi1500/*.arrow - config_name: ikk_Latn data_files: - split: taxi1500 path: ikk_Latn/taxi1500/*.arrow - config_name: ikw_Latn data_files: - split: taxi1500 path: ikw_Latn/taxi1500/*.arrow - config_name: ilo_Latn data_files: - split: taxi1500 path: ilo_Latn/taxi1500/*.arrow - config_name: imo_Latn data_files: - split: taxi1500 path: imo_Latn/taxi1500/*.arrow - config_name: inb_Latn data_files: - split: taxi1500 path: 
inb_Latn/taxi1500/*.arrow - config_name: ind_Latn data_files: - split: taxi1500 path: ind_Latn/taxi1500/*.arrow - config_name: ino_Latn data_files: - split: taxi1500 path: ino_Latn/taxi1500/*.arrow - config_name: iou_Latn data_files: - split: taxi1500 path: iou_Latn/taxi1500/*.arrow - config_name: ipi_Latn data_files: - split: taxi1500 path: ipi_Latn/taxi1500/*.arrow - config_name: isl_Latn data_files: - split: taxi1500 path: isl_Latn/taxi1500/*.arrow - config_name: isn_Latn data_files: - split: taxi1500 path: isn_Latn/taxi1500/*.arrow - config_name: ita_Latn data_files: - split: taxi1500 path: ita_Latn/taxi1500/*.arrow - config_name: iws_Latn data_files: - split: taxi1500 path: iws_Latn/taxi1500/*.arrow - config_name: ixl_Latn data_files: - split: taxi1500 path: ixl_Latn/taxi1500/*.arrow - config_name: jac_Latn data_files: - split: taxi1500 path: jac_Latn/taxi1500/*.arrow - config_name: jae_Latn data_files: - split: taxi1500 path: jae_Latn/taxi1500/*.arrow - config_name: jao_Latn data_files: - split: taxi1500 path: jao_Latn/taxi1500/*.arrow - config_name: jic_Latn data_files: - split: taxi1500 path: jic_Latn/taxi1500/*.arrow - config_name: jid_Latn data_files: - split: taxi1500 path: jid_Latn/taxi1500/*.arrow - config_name: jiv_Latn data_files: - split: taxi1500 path: jiv_Latn/taxi1500/*.arrow - config_name: jni_Latn data_files: - split: taxi1500 path: jni_Latn/taxi1500/*.arrow - config_name: jpn_Jpan data_files: - split: taxi1500 path: jpn_Jpan/taxi1500/*.arrow - config_name: juy_Orya data_files: - split: taxi1500 path: juy_Orya/taxi1500/*.arrow - config_name: jvn_Latn data_files: - split: taxi1500 path: jvn_Latn/taxi1500/*.arrow - config_name: kan_Knda data_files: - split: taxi1500 path: kan_Knda/taxi1500/*.arrow - config_name: kan_Latn data_files: - split: taxi1500 path: kan_Latn/taxi1500/*.arrow - config_name: kaq_Latn data_files: - split: taxi1500 path: kaq_Latn/taxi1500/*.arrow - config_name: kbc_Latn data_files: - split: taxi1500 path: 
kbc_Latn/taxi1500/*.arrow - config_name: kbh_Latn data_files: - split: taxi1500 path: kbh_Latn/taxi1500/*.arrow - config_name: kbm_Latn data_files: - split: taxi1500 path: kbm_Latn/taxi1500/*.arrow - config_name: kbq_Latn data_files: - split: taxi1500 path: kbq_Latn/taxi1500/*.arrow - config_name: kca_Cyrl data_files: - split: taxi1500 path: kca_Cyrl/taxi1500/*.arrow - config_name: kdc_Latn data_files: - split: taxi1500 path: kdc_Latn/taxi1500/*.arrow - config_name: kde_Latn data_files: - split: taxi1500 path: kde_Latn/taxi1500/*.arrow - config_name: kdl_Latn data_files: - split: taxi1500 path: kdl_Latn/taxi1500/*.arrow - config_name: kek_Latn data_files: - split: taxi1500 path: kek_Latn/taxi1500/*.arrow - config_name: ken_Latn data_files: - split: taxi1500 path: ken_Latn/taxi1500/*.arrow - config_name: kew_Latn data_files: - split: taxi1500 path: kew_Latn/taxi1500/*.arrow - config_name: kfw_Latn data_files: - split: taxi1500 path: kfw_Latn/taxi1500/*.arrow - config_name: kgf_Latn data_files: - split: taxi1500 path: kgf_Latn/taxi1500/*.arrow - config_name: kgk_Latn data_files: - split: taxi1500 path: kgk_Latn/taxi1500/*.arrow - config_name: kgp_Latn data_files: - split: taxi1500 path: kgp_Latn/taxi1500/*.arrow - config_name: khs_Latn data_files: - split: taxi1500 path: khs_Latn/taxi1500/*.arrow - config_name: khz_Latn data_files: - split: taxi1500 path: khz_Latn/taxi1500/*.arrow - config_name: kij_Latn data_files: - split: taxi1500 path: kij_Latn/taxi1500/*.arrow - config_name: kik_Latn data_files: - split: taxi1500 path: kik_Latn/taxi1500/*.arrow - config_name: kiw_Latn data_files: - split: taxi1500 path: kiw_Latn/taxi1500/*.arrow - config_name: kiz_Latn data_files: - split: taxi1500 path: kiz_Latn/taxi1500/*.arrow - config_name: kje_Latn data_files: - split: taxi1500 path: kje_Latn/taxi1500/*.arrow - config_name: kjn_Latn data_files: - split: taxi1500 path: kjn_Latn/taxi1500/*.arrow - config_name: kjs_Latn data_files: - split: taxi1500 path: 
kjs_Latn/taxi1500/*.arrow - config_name: kkc_Latn data_files: - split: taxi1500 path: kkc_Latn/taxi1500/*.arrow - config_name: kkl_Latn data_files: - split: taxi1500 path: kkl_Latn/taxi1500/*.arrow - config_name: kky_Latn data_files: - split: taxi1500 path: kky_Latn/taxi1500/*.arrow - config_name: klt_Latn data_files: - split: taxi1500 path: klt_Latn/taxi1500/*.arrow - config_name: klv_Latn data_files: - split: taxi1500 path: klv_Latn/taxi1500/*.arrow - config_name: kmg_Latn data_files: - split: taxi1500 path: kmg_Latn/taxi1500/*.arrow - config_name: kmh_Latn data_files: - split: taxi1500 path: kmh_Latn/taxi1500/*.arrow - config_name: kmk_Latn data_files: - split: taxi1500 path: kmk_Latn/taxi1500/*.arrow - config_name: kmo_Latn data_files: - split: taxi1500 path: kmo_Latn/taxi1500/*.arrow - config_name: kms_Latn data_files: - split: taxi1500 path: kms_Latn/taxi1500/*.arrow - config_name: kmu_Latn data_files: - split: taxi1500 path: kmu_Latn/taxi1500/*.arrow - config_name: kne_Latn data_files: - split: taxi1500 path: kne_Latn/taxi1500/*.arrow - config_name: knf_Latn data_files: - split: taxi1500 path: knf_Latn/taxi1500/*.arrow - config_name: knj_Latn data_files: - split: taxi1500 path: knj_Latn/taxi1500/*.arrow - config_name: knv_Latn data_files: - split: taxi1500 path: knv_Latn/taxi1500/*.arrow - config_name: kos_Latn data_files: - split: taxi1500 path: kos_Latn/taxi1500/*.arrow - config_name: kpf_Latn data_files: - split: taxi1500 path: kpf_Latn/taxi1500/*.arrow - config_name: kpg_Latn data_files: - split: taxi1500 path: kpg_Latn/taxi1500/*.arrow - config_name: kpj_Latn data_files: - split: taxi1500 path: kpj_Latn/taxi1500/*.arrow - config_name: kpr_Latn data_files: - split: taxi1500 path: kpr_Latn/taxi1500/*.arrow - config_name: kpw_Latn data_files: - split: taxi1500 path: kpw_Latn/taxi1500/*.arrow - config_name: kpx_Latn data_files: - split: taxi1500 path: kpx_Latn/taxi1500/*.arrow - config_name: kqa_Latn data_files: - split: taxi1500 path: 
kqa_Latn/taxi1500/*.arrow - config_name: kqc_Latn data_files: - split: taxi1500 path: kqc_Latn/taxi1500/*.arrow - config_name: kqf_Latn data_files: - split: taxi1500 path: kqf_Latn/taxi1500/*.arrow - config_name: kql_Latn data_files: - split: taxi1500 path: kql_Latn/taxi1500/*.arrow - config_name: kqw_Latn data_files: - split: taxi1500 path: kqw_Latn/taxi1500/*.arrow - config_name: ksd_Latn data_files: - split: taxi1500 path: ksd_Latn/taxi1500/*.arrow - config_name: ksj_Latn data_files: - split: taxi1500 path: ksj_Latn/taxi1500/*.arrow - config_name: ksr_Latn data_files: - split: taxi1500 path: ksr_Latn/taxi1500/*.arrow - config_name: ksw_Mymr data_files: - split: taxi1500 path: ksw_Mymr/taxi1500/*.arrow - config_name: ktm_Latn data_files: - split: taxi1500 path: ktm_Latn/taxi1500/*.arrow - config_name: kto_Latn data_files: - split: taxi1500 path: kto_Latn/taxi1500/*.arrow - config_name: kud_Latn data_files: - split: taxi1500 path: kud_Latn/taxi1500/*.arrow - config_name: kue_Latn data_files: - split: taxi1500 path: kue_Latn/taxi1500/*.arrow - config_name: kup_Latn data_files: - split: taxi1500 path: kup_Latn/taxi1500/*.arrow - config_name: kux_Latn data_files: - split: taxi1500 path: kux_Latn/taxi1500/*.arrow - config_name: kvg_Latn data_files: - split: taxi1500 path: kvg_Latn/taxi1500/*.arrow - config_name: kvn_Latn data_files: - split: taxi1500 path: kvn_Latn/taxi1500/*.arrow - config_name: kwd_Latn data_files: - split: taxi1500 path: kwd_Latn/taxi1500/*.arrow - config_name: kwf_Latn data_files: - split: taxi1500 path: kwf_Latn/taxi1500/*.arrow - config_name: kwi_Latn data_files: - split: taxi1500 path: kwi_Latn/taxi1500/*.arrow - config_name: kwj_Latn data_files: - split: taxi1500 path: kwj_Latn/taxi1500/*.arrow - config_name: kxv_Orya data_files: - split: taxi1500 path: kxv_Orya/taxi1500/*.arrow - config_name: kyc_Latn data_files: - split: taxi1500 path: kyc_Latn/taxi1500/*.arrow - config_name: kyf_Latn data_files: - split: taxi1500 path: 
kyf_Latn/taxi1500/*.arrow - config_name: kyg_Latn data_files: - split: taxi1500 path: kyg_Latn/taxi1500/*.arrow - config_name: kyq_Latn data_files: - split: taxi1500 path: kyq_Latn/taxi1500/*.arrow - config_name: kyz_Latn data_files: - split: taxi1500 path: kyz_Latn/taxi1500/*.arrow - config_name: kze_Latn data_files: - split: taxi1500 path: kze_Latn/taxi1500/*.arrow - config_name: lac_Latn data_files: - split: taxi1500 path: lac_Latn/taxi1500/*.arrow - config_name: lat_Latn data_files: - split: taxi1500 path: lat_Latn/taxi1500/*.arrow - config_name: lbb_Latn data_files: - split: taxi1500 path: lbb_Latn/taxi1500/*.arrow - config_name: lbk_Latn data_files: - split: taxi1500 path: lbk_Latn/taxi1500/*.arrow - config_name: lbm_Deva data_files: - split: taxi1500 path: lbm_Deva/taxi1500/*.arrow - config_name: lcm_Latn data_files: - split: taxi1500 path: lcm_Latn/taxi1500/*.arrow - config_name: leu_Latn data_files: - split: taxi1500 path: leu_Latn/taxi1500/*.arrow - config_name: lex_Latn data_files: - split: taxi1500 path: lex_Latn/taxi1500/*.arrow - config_name: lgl_Latn data_files: - split: taxi1500 path: lgl_Latn/taxi1500/*.arrow - config_name: lid_Latn data_files: - split: taxi1500 path: lid_Latn/taxi1500/*.arrow - config_name: lif_Deva data_files: - split: taxi1500 path: lif_Deva/taxi1500/*.arrow - config_name: lif_Limb data_files: - split: taxi1500 path: lif_Limb/taxi1500/*.arrow - config_name: lin_Latn data_files: - split: taxi1500 path: lin_Latn/taxi1500/*.arrow - config_name: lit_Latn data_files: - split: taxi1500 path: lit_Latn/taxi1500/*.arrow - config_name: llg_Latn data_files: - split: taxi1500 path: llg_Latn/taxi1500/*.arrow - config_name: lrg_Latn data_files: - split: taxi1500 path: lrg_Latn/taxi1500/*.arrow - config_name: lug_Latn data_files: - split: taxi1500 path: lug_Latn/taxi1500/*.arrow - config_name: luo_Latn data_files: - split: taxi1500 path: luo_Latn/taxi1500/*.arrow - config_name: lww_Latn data_files: - split: taxi1500 path: 
lww_Latn/taxi1500/*.arrow - config_name: lzh_Hani data_files: - split: taxi1500 path: lzh_Hani/taxi1500/*.arrow - config_name: maa_Latn data_files: - split: taxi1500 path: maa_Latn/taxi1500/*.arrow - config_name: maj_Latn data_files: - split: taxi1500 path: maj_Latn/taxi1500/*.arrow - config_name: mal_Mlym data_files: - split: taxi1500 path: mal_Mlym/taxi1500/*.arrow - config_name: mam_Latn data_files: - split: taxi1500 path: mam_Latn/taxi1500/*.arrow - config_name: maq_Latn data_files: - split: taxi1500 path: maq_Latn/taxi1500/*.arrow - config_name: mar_Deva data_files: - split: taxi1500 path: mar_Deva/taxi1500/*.arrow - config_name: mau_Latn data_files: - split: taxi1500 path: mau_Latn/taxi1500/*.arrow - config_name: mav_Latn data_files: - split: taxi1500 path: mav_Latn/taxi1500/*.arrow - config_name: maz_Latn data_files: - split: taxi1500 path: maz_Latn/taxi1500/*.arrow - config_name: mbb_Latn data_files: - split: taxi1500 path: mbb_Latn/taxi1500/*.arrow - config_name: mbc_Latn data_files: - split: taxi1500 path: mbc_Latn/taxi1500/*.arrow - config_name: mbh_Latn data_files: - split: taxi1500 path: mbh_Latn/taxi1500/*.arrow - config_name: mbj_Latn data_files: - split: taxi1500 path: mbj_Latn/taxi1500/*.arrow - config_name: mbl_Latn data_files: - split: taxi1500 path: mbl_Latn/taxi1500/*.arrow - config_name: mbs_Latn data_files: - split: taxi1500 path: mbs_Latn/taxi1500/*.arrow - config_name: mbt_Latn data_files: - split: taxi1500 path: mbt_Latn/taxi1500/*.arrow - config_name: mca_Latn data_files: - split: taxi1500 path: mca_Latn/taxi1500/*.arrow - config_name: mcb_Latn data_files: - split: taxi1500 path: mcb_Latn/taxi1500/*.arrow - config_name: mcd_Latn data_files: - split: taxi1500 path: mcd_Latn/taxi1500/*.arrow - config_name: mcf_Latn data_files: - split: taxi1500 path: mcf_Latn/taxi1500/*.arrow - config_name: mco_Latn data_files: - split: taxi1500 path: mco_Latn/taxi1500/*.arrow - config_name: mcp_Latn data_files: - split: taxi1500 path: 
mcp_Latn/taxi1500/*.arrow - config_name: mcq_Latn data_files: - split: taxi1500 path: mcq_Latn/taxi1500/*.arrow - config_name: mcr_Latn data_files: - split: taxi1500 path: mcr_Latn/taxi1500/*.arrow - config_name: mdy_Ethi data_files: - split: taxi1500 path: mdy_Ethi/taxi1500/*.arrow - config_name: med_Latn data_files: - split: taxi1500 path: med_Latn/taxi1500/*.arrow - config_name: mee_Latn data_files: - split: taxi1500 path: mee_Latn/taxi1500/*.arrow - config_name: mek_Latn data_files: - split: taxi1500 path: mek_Latn/taxi1500/*.arrow - config_name: meq_Latn data_files: - split: taxi1500 path: meq_Latn/taxi1500/*.arrow - config_name: met_Latn data_files: - split: taxi1500 path: met_Latn/taxi1500/*.arrow - config_name: meu_Latn data_files: - split: taxi1500 path: meu_Latn/taxi1500/*.arrow - config_name: mfy_Latn data_files: - split: taxi1500 path: mfy_Latn/taxi1500/*.arrow - config_name: mgc_Latn data_files: - split: taxi1500 path: mgc_Latn/taxi1500/*.arrow - config_name: mgh_Latn data_files: - split: taxi1500 path: mgh_Latn/taxi1500/*.arrow - config_name: mgw_Latn data_files: - split: taxi1500 path: mgw_Latn/taxi1500/*.arrow - config_name: mib_Latn data_files: - split: taxi1500 path: mib_Latn/taxi1500/*.arrow - config_name: mic_Latn data_files: - split: taxi1500 path: mic_Latn/taxi1500/*.arrow - config_name: mie_Latn data_files: - split: taxi1500 path: mie_Latn/taxi1500/*.arrow - config_name: mig_Latn data_files: - split: taxi1500 path: mig_Latn/taxi1500/*.arrow - config_name: mih_Latn data_files: - split: taxi1500 path: mih_Latn/taxi1500/*.arrow - config_name: mil_Latn data_files: - split: taxi1500 path: mil_Latn/taxi1500/*.arrow - config_name: mio_Latn data_files: - split: taxi1500 path: mio_Latn/taxi1500/*.arrow - config_name: mir_Latn data_files: - split: taxi1500 path: mir_Latn/taxi1500/*.arrow - config_name: mit_Latn data_files: - split: taxi1500 path: mit_Latn/taxi1500/*.arrow - config_name: miz_Latn data_files: - split: taxi1500 path: 
miz_Latn/taxi1500/*.arrow - config_name: mjc_Latn data_files: - split: taxi1500 path: mjc_Latn/taxi1500/*.arrow - config_name: mkj_Latn data_files: - split: taxi1500 path: mkj_Latn/taxi1500/*.arrow - config_name: mkl_Latn data_files: - split: taxi1500 path: mkl_Latn/taxi1500/*.arrow - config_name: mkn_Latn data_files: - split: taxi1500 path: mkn_Latn/taxi1500/*.arrow - config_name: mks_Latn data_files: - split: taxi1500 path: mks_Latn/taxi1500/*.arrow - config_name: mle_Latn data_files: - split: taxi1500 path: mle_Latn/taxi1500/*.arrow - config_name: mlh_Latn data_files: - split: taxi1500 path: mlh_Latn/taxi1500/*.arrow - config_name: mlp_Latn data_files: - split: taxi1500 path: mlp_Latn/taxi1500/*.arrow - config_name: mmo_Latn data_files: - split: taxi1500 path: mmo_Latn/taxi1500/*.arrow - config_name: mmx_Latn data_files: - split: taxi1500 path: mmx_Latn/taxi1500/*.arrow - config_name: mna_Latn data_files: - split: taxi1500 path: mna_Latn/taxi1500/*.arrow - config_name: mni_Latn data_files: - split: taxi1500 path: mni_Latn/taxi1500/*.arrow - config_name: moh_Latn data_files: - split: taxi1500 path: moh_Latn/taxi1500/*.arrow - config_name: mop_Latn data_files: - split: taxi1500 path: mop_Latn/taxi1500/*.arrow - config_name: mox_Latn data_files: - split: taxi1500 path: mox_Latn/taxi1500/*.arrow - config_name: mph_Latn data_files: - split: taxi1500 path: mph_Latn/taxi1500/*.arrow - config_name: mpj_Latn data_files: - split: taxi1500 path: mpj_Latn/taxi1500/*.arrow - config_name: mpm_Latn data_files: - split: taxi1500 path: mpm_Latn/taxi1500/*.arrow - config_name: mpp_Latn data_files: - split: taxi1500 path: mpp_Latn/taxi1500/*.arrow - config_name: mps_Latn data_files: - split: taxi1500 path: mps_Latn/taxi1500/*.arrow - config_name: mpt_Latn data_files: - split: taxi1500 path: mpt_Latn/taxi1500/*.arrow - config_name: mpx_Latn data_files: - split: taxi1500 path: mpx_Latn/taxi1500/*.arrow - config_name: mqb_Latn data_files: - split: taxi1500 path: 
mqb_Latn/taxi1500/*.arrow - config_name: mqj_Latn data_files: - split: taxi1500 path: mqj_Latn/taxi1500/*.arrow - config_name: msa_Latn data_files: - split: taxi1500 path: msa_Latn/taxi1500/*.arrow - config_name: msb_Latn data_files: - split: taxi1500 path: msb_Latn/taxi1500/*.arrow - config_name: msc_Latn data_files: - split: taxi1500 path: msc_Latn/taxi1500/*.arrow - config_name: msk_Latn data_files: - split: taxi1500 path: msk_Latn/taxi1500/*.arrow - config_name: msm_Latn data_files: - split: taxi1500 path: msm_Latn/taxi1500/*.arrow - config_name: msy_Latn data_files: - split: taxi1500 path: msy_Latn/taxi1500/*.arrow - config_name: mti_Latn data_files: - split: taxi1500 path: mti_Latn/taxi1500/*.arrow - config_name: mto_Latn data_files: - split: taxi1500 path: mto_Latn/taxi1500/*.arrow - config_name: mux_Latn data_files: - split: taxi1500 path: mux_Latn/taxi1500/*.arrow - config_name: muy_Latn data_files: - split: taxi1500 path: muy_Latn/taxi1500/*.arrow - config_name: mva_Latn data_files: - split: taxi1500 path: mva_Latn/taxi1500/*.arrow - config_name: mvn_Latn data_files: - split: taxi1500 path: mvn_Latn/taxi1500/*.arrow - config_name: mwc_Latn data_files: - split: taxi1500 path: mwc_Latn/taxi1500/*.arrow - config_name: mwe_Latn data_files: - split: taxi1500 path: mwe_Latn/taxi1500/*.arrow - config_name: mwf_Latn data_files: - split: taxi1500 path: mwf_Latn/taxi1500/*.arrow - config_name: mwp_Latn data_files: - split: taxi1500 path: mwp_Latn/taxi1500/*.arrow - config_name: mxb_Latn data_files: - split: taxi1500 path: mxb_Latn/taxi1500/*.arrow - config_name: mxp_Latn data_files: - split: taxi1500 path: mxp_Latn/taxi1500/*.arrow - config_name: mxq_Latn data_files: - split: taxi1500 path: mxq_Latn/taxi1500/*.arrow - config_name: mxt_Latn data_files: - split: taxi1500 path: mxt_Latn/taxi1500/*.arrow - config_name: mya_Mymr data_files: - split: taxi1500 path: mya_Mymr/taxi1500/*.arrow - config_name: myk_Latn data_files: - split: taxi1500 path: 
myk_Latn/taxi1500/*.arrow - config_name: myu_Latn data_files: - split: taxi1500 path: myu_Latn/taxi1500/*.arrow - config_name: myw_Latn data_files: - split: taxi1500 path: myw_Latn/taxi1500/*.arrow - config_name: myy_Latn data_files: - split: taxi1500 path: myy_Latn/taxi1500/*.arrow - config_name: mzz_Latn data_files: - split: taxi1500 path: mzz_Latn/taxi1500/*.arrow - config_name: nab_Latn data_files: - split: taxi1500 path: nab_Latn/taxi1500/*.arrow - config_name: naf_Latn data_files: - split: taxi1500 path: naf_Latn/taxi1500/*.arrow - config_name: nag_Latn data_files: - split: taxi1500 path: nag_Latn/taxi1500/*.arrow - config_name: nak_Latn data_files: - split: taxi1500 path: nak_Latn/taxi1500/*.arrow - config_name: nas_Latn data_files: - split: taxi1500 path: nas_Latn/taxi1500/*.arrow - config_name: nay_Latn data_files: - split: taxi1500 path: nay_Latn/taxi1500/*.arrow - config_name: nbq_Latn data_files: - split: taxi1500 path: nbq_Latn/taxi1500/*.arrow - config_name: nca_Latn data_files: - split: taxi1500 path: nca_Latn/taxi1500/*.arrow - config_name: nce_Latn data_files: - split: taxi1500 path: nce_Latn/taxi1500/*.arrow - config_name: nch_Latn data_files: - split: taxi1500 path: nch_Latn/taxi1500/*.arrow - config_name: ncj_Latn data_files: - split: taxi1500 path: ncj_Latn/taxi1500/*.arrow - config_name: ncl_Latn data_files: - split: taxi1500 path: ncl_Latn/taxi1500/*.arrow - config_name: ncu_Latn data_files: - split: taxi1500 path: ncu_Latn/taxi1500/*.arrow - config_name: nde_Latn data_files: - split: taxi1500 path: nde_Latn/taxi1500/*.arrow - config_name: ndg_Latn data_files: - split: taxi1500 path: ndg_Latn/taxi1500/*.arrow - config_name: ndj_Latn data_files: - split: taxi1500 path: ndj_Latn/taxi1500/*.arrow - config_name: nfa_Latn data_files: - split: taxi1500 path: nfa_Latn/taxi1500/*.arrow - config_name: ngp_Latn data_files: - split: taxi1500 path: ngp_Latn/taxi1500/*.arrow - config_name: ngu_Latn data_files: - split: taxi1500 path: 
ngu_Latn/taxi1500/*.arrow - config_name: nhe_Latn data_files: - split: taxi1500 path: nhe_Latn/taxi1500/*.arrow - config_name: nhg_Latn data_files: - split: taxi1500 path: nhg_Latn/taxi1500/*.arrow - config_name: nhi_Latn data_files: - split: taxi1500 path: nhi_Latn/taxi1500/*.arrow - config_name: nho_Latn data_files: - split: taxi1500 path: nho_Latn/taxi1500/*.arrow - config_name: nhr_Latn data_files: - split: taxi1500 path: nhr_Latn/taxi1500/*.arrow - config_name: nhu_Latn data_files: - split: taxi1500 path: nhu_Latn/taxi1500/*.arrow - config_name: nhw_Latn data_files: - split: taxi1500 path: nhw_Latn/taxi1500/*.arrow - config_name: nhy_Latn data_files: - split: taxi1500 path: nhy_Latn/taxi1500/*.arrow - config_name: nif_Latn data_files: - split: taxi1500 path: nif_Latn/taxi1500/*.arrow - config_name: nii_Latn data_files: - split: taxi1500 path: nii_Latn/taxi1500/*.arrow - config_name: nin_Latn data_files: - split: taxi1500 path: nin_Latn/taxi1500/*.arrow - config_name: nko_Latn data_files: - split: taxi1500 path: nko_Latn/taxi1500/*.arrow - config_name: nlc_Latn data_files: - split: taxi1500 path: nlc_Latn/taxi1500/*.arrow - config_name: nld_Latn data_files: - split: taxi1500 path: nld_Latn/taxi1500/*.arrow - config_name: nlg_Latn data_files: - split: taxi1500 path: nlg_Latn/taxi1500/*.arrow - config_name: nlx_Deva data_files: - split: taxi1500 path: nlx_Deva/taxi1500/*.arrow - config_name: nmw_Latn data_files: - split: taxi1500 path: nmw_Latn/taxi1500/*.arrow - config_name: nna_Latn data_files: - split: taxi1500 path: nna_Latn/taxi1500/*.arrow - config_name: nno_Latn data_files: - split: taxi1500 path: nno_Latn/taxi1500/*.arrow - config_name: nnq_Latn data_files: - split: taxi1500 path: nnq_Latn/taxi1500/*.arrow - config_name: noa_Latn data_files: - split: taxi1500 path: noa_Latn/taxi1500/*.arrow - config_name: nob_Latn data_files: - split: taxi1500 path: nob_Latn/taxi1500/*.arrow - config_name: nog_Cyrl data_files: - split: taxi1500 path: 
nog_Cyrl/taxi1500/*.arrow - config_name: nop_Latn data_files: - split: taxi1500 path: nop_Latn/taxi1500/*.arrow - config_name: not_Latn data_files: - split: taxi1500 path: not_Latn/taxi1500/*.arrow - config_name: nou_Latn data_files: - split: taxi1500 path: nou_Latn/taxi1500/*.arrow - config_name: npi_Deva data_files: - split: taxi1500 path: npi_Deva/taxi1500/*.arrow - config_name: npl_Latn data_files: - split: taxi1500 path: npl_Latn/taxi1500/*.arrow - config_name: nrf_Latn data_files: - split: taxi1500 path: nrf_Latn/taxi1500/*.arrow - config_name: nsn_Latn data_files: - split: taxi1500 path: nsn_Latn/taxi1500/*.arrow - config_name: nss_Latn data_files: - split: taxi1500 path: nss_Latn/taxi1500/*.arrow - config_name: ntj_Latn data_files: - split: taxi1500 path: ntj_Latn/taxi1500/*.arrow - config_name: ntp_Latn data_files: - split: taxi1500 path: ntp_Latn/taxi1500/*.arrow - config_name: ntu_Latn data_files: - split: taxi1500 path: ntu_Latn/taxi1500/*.arrow - config_name: nuy_Latn data_files: - split: taxi1500 path: nuy_Latn/taxi1500/*.arrow - config_name: nvm_Latn data_files: - split: taxi1500 path: nvm_Latn/taxi1500/*.arrow - config_name: nwi_Latn data_files: - split: taxi1500 path: nwi_Latn/taxi1500/*.arrow - config_name: nya_Latn data_files: - split: taxi1500 path: nya_Latn/taxi1500/*.arrow - config_name: nys_Latn data_files: - split: taxi1500 path: nys_Latn/taxi1500/*.arrow - config_name: nyu_Latn data_files: - split: taxi1500 path: nyu_Latn/taxi1500/*.arrow - config_name: obo_Latn data_files: - split: taxi1500 path: obo_Latn/taxi1500/*.arrow - config_name: oji_Latn data_files: - split: taxi1500 path: oji_Latn/taxi1500/*.arrow - config_name: okv_Latn data_files: - split: taxi1500 path: okv_Latn/taxi1500/*.arrow - config_name: omb_Latn data_files: - split: taxi1500 path: omb_Latn/taxi1500/*.arrow - config_name: omw_Latn data_files: - split: taxi1500 path: omw_Latn/taxi1500/*.arrow - config_name: ong_Latn data_files: - split: taxi1500 path: 
ong_Latn/taxi1500/*.arrow - config_name: ons_Latn data_files: - split: taxi1500 path: ons_Latn/taxi1500/*.arrow - config_name: ood_Latn data_files: - split: taxi1500 path: ood_Latn/taxi1500/*.arrow - config_name: opm_Latn data_files: - split: taxi1500 path: opm_Latn/taxi1500/*.arrow - config_name: ory_Orya data_files: - split: taxi1500 path: ory_Orya/taxi1500/*.arrow - config_name: ote_Latn data_files: - split: taxi1500 path: ote_Latn/taxi1500/*.arrow - config_name: otm_Latn data_files: - split: taxi1500 path: otm_Latn/taxi1500/*.arrow - config_name: otn_Latn data_files: - split: taxi1500 path: otn_Latn/taxi1500/*.arrow - config_name: otq_Latn data_files: - split: taxi1500 path: otq_Latn/taxi1500/*.arrow - config_name: ots_Latn data_files: - split: taxi1500 path: ots_Latn/taxi1500/*.arrow - config_name: pab_Latn data_files: - split: taxi1500 path: pab_Latn/taxi1500/*.arrow - config_name: pad_Latn data_files: - split: taxi1500 path: pad_Latn/taxi1500/*.arrow - config_name: pah_Latn data_files: - split: taxi1500 path: pah_Latn/taxi1500/*.arrow - config_name: pan_Guru data_files: - split: taxi1500 path: pan_Guru/taxi1500/*.arrow - config_name: pao_Latn data_files: - split: taxi1500 path: pao_Latn/taxi1500/*.arrow - config_name: peg_Orya data_files: - split: taxi1500 path: peg_Orya/taxi1500/*.arrow - config_name: pes_Arab data_files: - split: taxi1500 path: pes_Arab/taxi1500/*.arrow - config_name: pib_Latn data_files: - split: taxi1500 path: pib_Latn/taxi1500/*.arrow - config_name: pio_Latn data_files: - split: taxi1500 path: pio_Latn/taxi1500/*.arrow - config_name: pir_Latn data_files: - split: taxi1500 path: pir_Latn/taxi1500/*.arrow - config_name: piu_Latn data_files: - split: taxi1500 path: piu_Latn/taxi1500/*.arrow - config_name: pjt_Latn data_files: - split: taxi1500 path: pjt_Latn/taxi1500/*.arrow - config_name: pls_Latn data_files: - split: taxi1500 path: pls_Latn/taxi1500/*.arrow - config_name: plt_Latn data_files: - split: taxi1500 path: 
plt_Latn/taxi1500/*.arrow - config_name: plu_Latn data_files: - split: taxi1500 path: plu_Latn/taxi1500/*.arrow - config_name: pma_Latn data_files: - split: taxi1500 path: pma_Latn/taxi1500/*.arrow - config_name: poe_Latn data_files: - split: taxi1500 path: poe_Latn/taxi1500/*.arrow - config_name: poh_Latn data_files: - split: taxi1500 path: poh_Latn/taxi1500/*.arrow - config_name: poi_Latn data_files: - split: taxi1500 path: poi_Latn/taxi1500/*.arrow - config_name: pol_Latn data_files: - split: taxi1500 path: pol_Latn/taxi1500/*.arrow - config_name: pon_Latn data_files: - split: taxi1500 path: pon_Latn/taxi1500/*.arrow - config_name: por_Latn data_files: - split: taxi1500 path: por_Latn/taxi1500/*.arrow - config_name: pot_Latn data_files: - split: taxi1500 path: pot_Latn/taxi1500/*.arrow - config_name: poy_Latn data_files: - split: taxi1500 path: poy_Latn/taxi1500/*.arrow - config_name: ppo_Latn data_files: - split: taxi1500 path: ppo_Latn/taxi1500/*.arrow - config_name: prf_Latn data_files: - split: taxi1500 path: prf_Latn/taxi1500/*.arrow - config_name: pri_Latn data_files: - split: taxi1500 path: pri_Latn/taxi1500/*.arrow - config_name: ptp_Latn data_files: - split: taxi1500 path: ptp_Latn/taxi1500/*.arrow - config_name: ptu_Latn data_files: - split: taxi1500 path: ptu_Latn/taxi1500/*.arrow - config_name: pwg_Latn data_files: - split: taxi1500 path: pwg_Latn/taxi1500/*.arrow - config_name: qub_Latn data_files: - split: taxi1500 path: qub_Latn/taxi1500/*.arrow - config_name: quc_Latn data_files: - split: taxi1500 path: quc_Latn/taxi1500/*.arrow - config_name: quf_Latn data_files: - split: taxi1500 path: quf_Latn/taxi1500/*.arrow - config_name: quh_Latn data_files: - split: taxi1500 path: quh_Latn/taxi1500/*.arrow - config_name: qul_Latn data_files: - split: taxi1500 path: qul_Latn/taxi1500/*.arrow - config_name: qup_Latn data_files: - split: taxi1500 path: qup_Latn/taxi1500/*.arrow - config_name: quw_Latn data_files: - split: taxi1500 path: 
quw_Latn/taxi1500/*.arrow - config_name: qvc_Latn data_files: - split: taxi1500 path: qvc_Latn/taxi1500/*.arrow - config_name: qve_Latn data_files: - split: taxi1500 path: qve_Latn/taxi1500/*.arrow - config_name: qvh_Latn data_files: - split: taxi1500 path: qvh_Latn/taxi1500/*.arrow - config_name: qvm_Latn data_files: - split: taxi1500 path: qvm_Latn/taxi1500/*.arrow - config_name: qvn_Latn data_files: - split: taxi1500 path: qvn_Latn/taxi1500/*.arrow - config_name: qvs_Latn data_files: - split: taxi1500 path: qvs_Latn/taxi1500/*.arrow - config_name: qvw_Latn data_files: - split: taxi1500 path: qvw_Latn/taxi1500/*.arrow - config_name: qvz_Latn data_files: - split: taxi1500 path: qvz_Latn/taxi1500/*.arrow - config_name: qwh_Latn data_files: - split: taxi1500 path: qwh_Latn/taxi1500/*.arrow - config_name: qxh_Latn data_files: - split: taxi1500 path: qxh_Latn/taxi1500/*.arrow - config_name: qxn_Latn data_files: - split: taxi1500 path: qxn_Latn/taxi1500/*.arrow - config_name: qxo_Latn data_files: - split: taxi1500 path: qxo_Latn/taxi1500/*.arrow - config_name: rai_Latn data_files: - split: taxi1500 path: rai_Latn/taxi1500/*.arrow - config_name: reg_Latn data_files: - split: taxi1500 path: reg_Latn/taxi1500/*.arrow - config_name: rgu_Latn data_files: - split: taxi1500 path: rgu_Latn/taxi1500/*.arrow - config_name: rkb_Latn data_files: - split: taxi1500 path: rkb_Latn/taxi1500/*.arrow - config_name: rmb_Latn data_files: - split: taxi1500 path: rmb_Latn/taxi1500/*.arrow - config_name: rmc_Cyrl data_files: - split: taxi1500 path: rmc_Cyrl/taxi1500/*.arrow - config_name: rmc_Latn data_files: - split: taxi1500 path: rmc_Latn/taxi1500/*.arrow - config_name: rmn_Cyrl data_files: - split: taxi1500 path: rmn_Cyrl/taxi1500/*.arrow - config_name: rmn_Latn data_files: - split: taxi1500 path: rmn_Latn/taxi1500/*.arrow - config_name: rmq_Latn data_files: - split: taxi1500 path: rmq_Latn/taxi1500/*.arrow - config_name: rmy_Cyrl data_files: - split: taxi1500 path: 
rmy_Cyrl/taxi1500/*.arrow - config_name: rmy_Latn data_files: - split: taxi1500 path: rmy_Latn/taxi1500/*.arrow - config_name: ron_Cyrl data_files: - split: taxi1500 path: ron_Cyrl/taxi1500/*.arrow - config_name: ron_Latn data_files: - split: taxi1500 path: ron_Latn/taxi1500/*.arrow - config_name: roo_Latn data_files: - split: taxi1500 path: roo_Latn/taxi1500/*.arrow - config_name: rop_Latn data_files: - split: taxi1500 path: rop_Latn/taxi1500/*.arrow - config_name: row_Latn data_files: - split: taxi1500 path: row_Latn/taxi1500/*.arrow - config_name: rro_Latn data_files: - split: taxi1500 path: rro_Latn/taxi1500/*.arrow - config_name: ruf_Latn data_files: - split: taxi1500 path: ruf_Latn/taxi1500/*.arrow - config_name: rug_Latn data_files: - split: taxi1500 path: rug_Latn/taxi1500/*.arrow - config_name: rup_Latn data_files: - split: taxi1500 path: rup_Latn/taxi1500/*.arrow - config_name: rus_Cyrl data_files: - split: taxi1500 path: rus_Cyrl/taxi1500/*.arrow - config_name: rwo_Latn data_files: - split: taxi1500 path: rwo_Latn/taxi1500/*.arrow - config_name: sab_Latn data_files: - split: taxi1500 path: sab_Latn/taxi1500/*.arrow - config_name: san_Arab data_files: - split: taxi1500 path: san_Arab/taxi1500/*.arrow - config_name: san_Beng data_files: - split: taxi1500 path: san_Beng/taxi1500/*.arrow - config_name: san_Deva data_files: - split: taxi1500 path: san_Deva/taxi1500/*.arrow - config_name: san_Gujr data_files: - split: taxi1500 path: san_Gujr/taxi1500/*.arrow - config_name: san_Guru data_files: - split: taxi1500 path: san_Guru/taxi1500/*.arrow - config_name: san_Khmr data_files: - split: taxi1500 path: san_Khmr/taxi1500/*.arrow - config_name: san_Knda data_files: - split: taxi1500 path: san_Knda/taxi1500/*.arrow - config_name: san_Latn data_files: - split: taxi1500 path: san_Latn/taxi1500/*.arrow - config_name: san_Mlym data_files: - split: taxi1500 path: san_Mlym/taxi1500/*.arrow - config_name: san_Mymr data_files: - split: taxi1500 path: 
san_Mymr/taxi1500/*.arrow - config_name: san_Orya data_files: - split: taxi1500 path: san_Orya/taxi1500/*.arrow - config_name: san_Sinh data_files: - split: taxi1500 path: san_Sinh/taxi1500/*.arrow - config_name: san_Taml data_files: - split: taxi1500 path: san_Taml/taxi1500/*.arrow - config_name: san_Telu data_files: - split: taxi1500 path: san_Telu/taxi1500/*.arrow - config_name: san_Thai data_files: - split: taxi1500 path: san_Thai/taxi1500/*.arrow - config_name: san_Tibt data_files: - split: taxi1500 path: san_Tibt/taxi1500/*.arrow - config_name: sbd_Latn data_files: - split: taxi1500 path: sbd_Latn/taxi1500/*.arrow - config_name: sbe_Latn data_files: - split: taxi1500 path: sbe_Latn/taxi1500/*.arrow - config_name: sbk_Latn data_files: - split: taxi1500 path: sbk_Latn/taxi1500/*.arrow - config_name: sbs_Latn data_files: - split: taxi1500 path: sbs_Latn/taxi1500/*.arrow - config_name: sby_Latn data_files: - split: taxi1500 path: sby_Latn/taxi1500/*.arrow - config_name: sch_Latn data_files: - split: taxi1500 path: sch_Latn/taxi1500/*.arrow - config_name: seh_Latn data_files: - split: taxi1500 path: seh_Latn/taxi1500/*.arrow - config_name: sey_Latn data_files: - split: taxi1500 path: sey_Latn/taxi1500/*.arrow - config_name: sgb_Latn data_files: - split: taxi1500 path: sgb_Latn/taxi1500/*.arrow - config_name: sgz_Latn data_files: - split: taxi1500 path: sgz_Latn/taxi1500/*.arrow - config_name: shj_Latn data_files: - split: taxi1500 path: shj_Latn/taxi1500/*.arrow - config_name: shp_Latn data_files: - split: taxi1500 path: shp_Latn/taxi1500/*.arrow - config_name: sim_Latn data_files: - split: taxi1500 path: sim_Latn/taxi1500/*.arrow - config_name: sja_Latn data_files: - split: taxi1500 path: sja_Latn/taxi1500/*.arrow - config_name: sll_Latn data_files: - split: taxi1500 path: sll_Latn/taxi1500/*.arrow - config_name: smk_Latn data_files: - split: taxi1500 path: smk_Latn/taxi1500/*.arrow - config_name: sna_Latn data_files: - split: taxi1500 path: 
sna_Latn/taxi1500/*.arrow - config_name: snc_Latn data_files: - split: taxi1500 path: snc_Latn/taxi1500/*.arrow - config_name: snn_Latn data_files: - split: taxi1500 path: snn_Latn/taxi1500/*.arrow - config_name: snp_Latn data_files: - split: taxi1500 path: snp_Latn/taxi1500/*.arrow - config_name: snx_Latn data_files: - split: taxi1500 path: snx_Latn/taxi1500/*.arrow - config_name: sny_Latn data_files: - split: taxi1500 path: sny_Latn/taxi1500/*.arrow - config_name: som_Latn data_files: - split: taxi1500 path: som_Latn/taxi1500/*.arrow - config_name: soq_Latn data_files: - split: taxi1500 path: soq_Latn/taxi1500/*.arrow - config_name: soy_Latn data_files: - split: taxi1500 path: soy_Latn/taxi1500/*.arrow - config_name: spa_Latn data_files: - split: taxi1500 path: spa_Latn/taxi1500/*.arrow - config_name: spl_Latn data_files: - split: taxi1500 path: spl_Latn/taxi1500/*.arrow - config_name: spm_Latn data_files: - split: taxi1500 path: spm_Latn/taxi1500/*.arrow - config_name: spp_Latn data_files: - split: taxi1500 path: spp_Latn/taxi1500/*.arrow - config_name: sps_Latn data_files: - split: taxi1500 path: sps_Latn/taxi1500/*.arrow - config_name: spy_Latn data_files: - split: taxi1500 path: spy_Latn/taxi1500/*.arrow - config_name: sqi_Latn data_files: - split: taxi1500 path: sqi_Latn/taxi1500/*.arrow - config_name: sri_Latn data_files: - split: taxi1500 path: sri_Latn/taxi1500/*.arrow - config_name: srm_Latn data_files: - split: taxi1500 path: srm_Latn/taxi1500/*.arrow - config_name: srn_Latn data_files: - split: taxi1500 path: srn_Latn/taxi1500/*.arrow - config_name: srp_Latn data_files: - split: taxi1500 path: srp_Latn/taxi1500/*.arrow - config_name: srq_Latn data_files: - split: taxi1500 path: srq_Latn/taxi1500/*.arrow - config_name: ssd_Latn data_files: - split: taxi1500 path: ssd_Latn/taxi1500/*.arrow - config_name: ssg_Latn data_files: - split: taxi1500 path: ssg_Latn/taxi1500/*.arrow - config_name: ssx_Latn data_files: - split: taxi1500 path: 
ssx_Latn/taxi1500/*.arrow - config_name: stp_Latn data_files: - split: taxi1500 path: stp_Latn/taxi1500/*.arrow - config_name: sua_Latn data_files: - split: taxi1500 path: sua_Latn/taxi1500/*.arrow - config_name: sue_Latn data_files: - split: taxi1500 path: sue_Latn/taxi1500/*.arrow - config_name: sus_Arab data_files: - split: taxi1500 path: sus_Arab/taxi1500/*.arrow - config_name: sus_Latn data_files: - split: taxi1500 path: sus_Latn/taxi1500/*.arrow - config_name: suz_Deva data_files: - split: taxi1500 path: suz_Deva/taxi1500/*.arrow - config_name: swe_Latn data_files: - split: taxi1500 path: swe_Latn/taxi1500/*.arrow - config_name: swh_Latn data_files: - split: taxi1500 path: swh_Latn/taxi1500/*.arrow - config_name: swp_Latn data_files: - split: taxi1500 path: swp_Latn/taxi1500/*.arrow - config_name: sxb_Latn data_files: - split: taxi1500 path: sxb_Latn/taxi1500/*.arrow - config_name: tac_Latn data_files: - split: taxi1500 path: tac_Latn/taxi1500/*.arrow - config_name: taj_Deva data_files: - split: taxi1500 path: taj_Deva/taxi1500/*.arrow - config_name: tam_Taml data_files: - split: taxi1500 path: tam_Taml/taxi1500/*.arrow - config_name: tar_Latn data_files: - split: taxi1500 path: tar_Latn/taxi1500/*.arrow - config_name: tav_Latn data_files: - split: taxi1500 path: tav_Latn/taxi1500/*.arrow - config_name: taw_Latn data_files: - split: taxi1500 path: taw_Latn/taxi1500/*.arrow - config_name: tbc_Latn data_files: - split: taxi1500 path: tbc_Latn/taxi1500/*.arrow - config_name: tbf_Latn data_files: - split: taxi1500 path: tbf_Latn/taxi1500/*.arrow - config_name: tbg_Latn data_files: - split: taxi1500 path: tbg_Latn/taxi1500/*.arrow - config_name: tbk_Latn data_files: - split: taxi1500 path: tbk_Latn/taxi1500/*.arrow - config_name: tbl_Latn data_files: - split: taxi1500 path: tbl_Latn/taxi1500/*.arrow - config_name: tbo_Latn data_files: - split: taxi1500 path: tbo_Latn/taxi1500/*.arrow - config_name: tbz_Latn data_files: - split: taxi1500 path: 
tbz_Latn/taxi1500/*.arrow - config_name: tca_Latn data_files: - split: taxi1500 path: tca_Latn/taxi1500/*.arrow - config_name: tcs_Latn data_files: - split: taxi1500 path: tcs_Latn/taxi1500/*.arrow - config_name: tcz_Latn data_files: - split: taxi1500 path: tcz_Latn/taxi1500/*.arrow - config_name: tdt_Latn data_files: - split: taxi1500 path: tdt_Latn/taxi1500/*.arrow - config_name: tdx_Latn data_files: - split: taxi1500 path: tdx_Latn/taxi1500/*.arrow - config_name: tee_Latn data_files: - split: taxi1500 path: tee_Latn/taxi1500/*.arrow - config_name: tel_Telu data_files: - split: taxi1500 path: tel_Telu/taxi1500/*.arrow - config_name: ter_Latn data_files: - split: taxi1500 path: ter_Latn/taxi1500/*.arrow - config_name: tet_Latn data_files: - split: taxi1500 path: tet_Latn/taxi1500/*.arrow - config_name: tew_Latn data_files: - split: taxi1500 path: tew_Latn/taxi1500/*.arrow - config_name: tfr_Latn data_files: - split: taxi1500 path: tfr_Latn/taxi1500/*.arrow - config_name: tgj_Latn data_files: - split: taxi1500 path: tgj_Latn/taxi1500/*.arrow - config_name: tgk_Cyrl data_files: - split: taxi1500 path: tgk_Cyrl/taxi1500/*.arrow - config_name: tgl_Latn data_files: - split: taxi1500 path: tgl_Latn/taxi1500/*.arrow - config_name: tgo_Latn data_files: - split: taxi1500 path: tgo_Latn/taxi1500/*.arrow - config_name: tgp_Latn data_files: - split: taxi1500 path: tgp_Latn/taxi1500/*.arrow - config_name: tha_Thai data_files: - split: taxi1500 path: tha_Thai/taxi1500/*.arrow - config_name: thd_Latn data_files: - split: taxi1500 path: thd_Latn/taxi1500/*.arrow - config_name: tif_Latn data_files: - split: taxi1500 path: tif_Latn/taxi1500/*.arrow - config_name: tim_Latn data_files: - split: taxi1500 path: tim_Latn/taxi1500/*.arrow - config_name: tiw_Latn data_files: - split: taxi1500 path: tiw_Latn/taxi1500/*.arrow - config_name: tiy_Latn data_files: - split: taxi1500 path: tiy_Latn/taxi1500/*.arrow - config_name: tke_Latn data_files: - split: taxi1500 path: 
tke_Latn/taxi1500/*.arrow - config_name: tkr_Latn data_files: - split: taxi1500 path: tkr_Latn/taxi1500/*.arrow - config_name: tku_Latn data_files: - split: taxi1500 path: tku_Latn/taxi1500/*.arrow - config_name: tlf_Latn data_files: - split: taxi1500 path: tlf_Latn/taxi1500/*.arrow - config_name: tmd_Latn data_files: - split: taxi1500 path: tmd_Latn/taxi1500/*.arrow - config_name: tna_Latn data_files: - split: taxi1500 path: tna_Latn/taxi1500/*.arrow - config_name: tnc_Latn data_files: - split: taxi1500 path: tnc_Latn/taxi1500/*.arrow - config_name: tnk_Latn data_files: - split: taxi1500 path: tnk_Latn/taxi1500/*.arrow - config_name: tnn_Latn data_files: - split: taxi1500 path: tnn_Latn/taxi1500/*.arrow - config_name: tnp_Latn data_files: - split: taxi1500 path: tnp_Latn/taxi1500/*.arrow - config_name: toc_Latn data_files: - split: taxi1500 path: toc_Latn/taxi1500/*.arrow - config_name: tod_Latn data_files: - split: taxi1500 path: tod_Latn/taxi1500/*.arrow - config_name: tof_Latn data_files: - split: taxi1500 path: tof_Latn/taxi1500/*.arrow - config_name: toj_Latn data_files: - split: taxi1500 path: toj_Latn/taxi1500/*.arrow - config_name: ton_Latn data_files: - split: taxi1500 path: ton_Latn/taxi1500/*.arrow - config_name: too_Latn data_files: - split: taxi1500 path: too_Latn/taxi1500/*.arrow - config_name: top_Latn data_files: - split: taxi1500 path: top_Latn/taxi1500/*.arrow - config_name: tos_Latn data_files: - split: taxi1500 path: tos_Latn/taxi1500/*.arrow - config_name: tpa_Latn data_files: - split: taxi1500 path: tpa_Latn/taxi1500/*.arrow - config_name: tpi_Latn data_files: - split: taxi1500 path: tpi_Latn/taxi1500/*.arrow - config_name: tpt_Latn data_files: - split: taxi1500 path: tpt_Latn/taxi1500/*.arrow - config_name: tpz_Latn data_files: - split: taxi1500 path: tpz_Latn/taxi1500/*.arrow - config_name: trc_Latn data_files: - split: taxi1500 path: trc_Latn/taxi1500/*.arrow - config_name: tsn_Latn data_files: - split: taxi1500 path: 
tsn_Latn/taxi1500/*.arrow - config_name: tsw_Latn data_files: - split: taxi1500 path: tsw_Latn/taxi1500/*.arrow - config_name: ttc_Latn data_files: - split: taxi1500 path: ttc_Latn/taxi1500/*.arrow - config_name: tte_Latn data_files: - split: taxi1500 path: tte_Latn/taxi1500/*.arrow - config_name: tuc_Latn data_files: - split: taxi1500 path: tuc_Latn/taxi1500/*.arrow - config_name: tue_Latn data_files: - split: taxi1500 path: tue_Latn/taxi1500/*.arrow - config_name: tuf_Latn data_files: - split: taxi1500 path: tuf_Latn/taxi1500/*.arrow - config_name: tuo_Latn data_files: - split: taxi1500 path: tuo_Latn/taxi1500/*.arrow - config_name: tvk_Latn data_files: - split: taxi1500 path: tvk_Latn/taxi1500/*.arrow - config_name: tvt_Latn data_files: - split: taxi1500 path: tvt_Latn/taxi1500/*.arrow - config_name: twi_Latn data_files: - split: taxi1500 path: twi_Latn/taxi1500/*.arrow - config_name: txq_Latn data_files: - split: taxi1500 path: txq_Latn/taxi1500/*.arrow - config_name: txu_Latn data_files: - split: taxi1500 path: txu_Latn/taxi1500/*.arrow - config_name: tzj_Latn data_files: - split: taxi1500 path: tzj_Latn/taxi1500/*.arrow - config_name: tzo_Latn data_files: - split: taxi1500 path: tzo_Latn/taxi1500/*.arrow - config_name: ubr_Latn data_files: - split: taxi1500 path: ubr_Latn/taxi1500/*.arrow - config_name: ubu_Latn data_files: - split: taxi1500 path: ubu_Latn/taxi1500/*.arrow - config_name: udu_Latn data_files: - split: taxi1500 path: udu_Latn/taxi1500/*.arrow - config_name: uig_Arab data_files: - split: taxi1500 path: uig_Arab/taxi1500/*.arrow - config_name: uig_Cyrl data_files: - split: taxi1500 path: uig_Cyrl/taxi1500/*.arrow - config_name: uig_Latn data_files: - split: taxi1500 path: uig_Latn/taxi1500/*.arrow - config_name: ukr_Cyrl data_files: - split: taxi1500 path: ukr_Cyrl/taxi1500/*.arrow - config_name: uli_Latn data_files: - split: taxi1500 path: uli_Latn/taxi1500/*.arrow - config_name: ulk_Latn data_files: - split: taxi1500 path: 
ulk_Latn/taxi1500/*.arrow - config_name: unx_Orya data_files: - split: taxi1500 path: unx_Orya/taxi1500/*.arrow - config_name: upv_Latn data_files: - split: taxi1500 path: upv_Latn/taxi1500/*.arrow - config_name: ura_Latn data_files: - split: taxi1500 path: ura_Latn/taxi1500/*.arrow - config_name: urb_Latn data_files: - split: taxi1500 path: urb_Latn/taxi1500/*.arrow - config_name: urd_Arab data_files: - split: taxi1500 path: urd_Arab/taxi1500/*.arrow - config_name: urd_Deva data_files: - split: taxi1500 path: urd_Deva/taxi1500/*.arrow - config_name: urd_Latn data_files: - split: taxi1500 path: urd_Latn/taxi1500/*.arrow - config_name: uri_Latn data_files: - split: taxi1500 path: uri_Latn/taxi1500/*.arrow - config_name: urt_Latn data_files: - split: taxi1500 path: urt_Latn/taxi1500/*.arrow - config_name: urw_Latn data_files: - split: taxi1500 path: urw_Latn/taxi1500/*.arrow - config_name: usa_Latn data_files: - split: taxi1500 path: usa_Latn/taxi1500/*.arrow - config_name: usp_Latn data_files: - split: taxi1500 path: usp_Latn/taxi1500/*.arrow - config_name: uvh_Latn data_files: - split: taxi1500 path: uvh_Latn/taxi1500/*.arrow - config_name: uvl_Latn data_files: - split: taxi1500 path: uvl_Latn/taxi1500/*.arrow - config_name: vid_Latn data_files: - split: taxi1500 path: vid_Latn/taxi1500/*.arrow - config_name: vie_Latn data_files: - split: taxi1500 path: vie_Latn/taxi1500/*.arrow - config_name: viv_Latn data_files: - split: taxi1500 path: viv_Latn/taxi1500/*.arrow - config_name: vmy_Latn data_files: - split: taxi1500 path: vmy_Latn/taxi1500/*.arrow - config_name: waj_Latn data_files: - split: taxi1500 path: waj_Latn/taxi1500/*.arrow - config_name: wal_Latn data_files: - split: taxi1500 path: wal_Latn/taxi1500/*.arrow - config_name: wap_Latn data_files: - split: taxi1500 path: wap_Latn/taxi1500/*.arrow - config_name: wat_Latn data_files: - split: taxi1500 path: wat_Latn/taxi1500/*.arrow - config_name: wbi_Latn data_files: - split: taxi1500 path: 
wbi_Latn/taxi1500/*.arrow - config_name: wbp_Latn data_files: - split: taxi1500 path: wbp_Latn/taxi1500/*.arrow - config_name: wed_Latn data_files: - split: taxi1500 path: wed_Latn/taxi1500/*.arrow - config_name: wer_Latn data_files: - split: taxi1500 path: wer_Latn/taxi1500/*.arrow - config_name: wim_Latn data_files: - split: taxi1500 path: wim_Latn/taxi1500/*.arrow - config_name: wiu_Latn data_files: - split: taxi1500 path: wiu_Latn/taxi1500/*.arrow - config_name: wiv_Latn data_files: - split: taxi1500 path: wiv_Latn/taxi1500/*.arrow - config_name: wlg_Latn data_files: - split: taxi1500 path: wlg_Latn/taxi1500/*.arrow - config_name: wmt_Latn data_files: - split: taxi1500 path: wmt_Latn/taxi1500/*.arrow - config_name: wmw_Latn data_files: - split: taxi1500 path: wmw_Latn/taxi1500/*.arrow - config_name: wnc_Latn data_files: - split: taxi1500 path: wnc_Latn/taxi1500/*.arrow - config_name: wnu_Latn data_files: - split: taxi1500 path: wnu_Latn/taxi1500/*.arrow - config_name: wol_Latn data_files: - split: taxi1500 path: wol_Latn/taxi1500/*.arrow - config_name: wos_Latn data_files: - split: taxi1500 path: wos_Latn/taxi1500/*.arrow - config_name: wrk_Latn data_files: - split: taxi1500 path: wrk_Latn/taxi1500/*.arrow - config_name: wro_Latn data_files: - split: taxi1500 path: wro_Latn/taxi1500/*.arrow - config_name: wrs_Latn data_files: - split: taxi1500 path: wrs_Latn/taxi1500/*.arrow - config_name: wsk_Latn data_files: - split: taxi1500 path: wsk_Latn/taxi1500/*.arrow - config_name: wuv_Latn data_files: - split: taxi1500 path: wuv_Latn/taxi1500/*.arrow - config_name: xav_Latn data_files: - split: taxi1500 path: xav_Latn/taxi1500/*.arrow - config_name: xbi_Latn data_files: - split: taxi1500 path: xbi_Latn/taxi1500/*.arrow - config_name: xed_Latn data_files: - split: taxi1500 path: xed_Latn/taxi1500/*.arrow - config_name: xla_Latn data_files: - split: taxi1500 path: xla_Latn/taxi1500/*.arrow - config_name: xnj_Latn data_files: - split: taxi1500 path: 
xnj_Latn/taxi1500/*.arrow - config_name: xnn_Latn data_files: - split: taxi1500 path: xnn_Latn/taxi1500/*.arrow - config_name: xon_Latn data_files: - split: taxi1500 path: xon_Latn/taxi1500/*.arrow - config_name: xsi_Latn data_files: - split: taxi1500 path: xsi_Latn/taxi1500/*.arrow - config_name: xtd_Latn data_files: - split: taxi1500 path: xtd_Latn/taxi1500/*.arrow - config_name: xtm_Latn data_files: - split: taxi1500 path: xtm_Latn/taxi1500/*.arrow - config_name: yaa_Latn data_files: - split: taxi1500 path: yaa_Latn/taxi1500/*.arrow - config_name: yad_Latn data_files: - split: taxi1500 path: yad_Latn/taxi1500/*.arrow - config_name: yal_Latn data_files: - split: taxi1500 path: yal_Latn/taxi1500/*.arrow - config_name: yao_Latn data_files: - split: taxi1500 path: yao_Latn/taxi1500/*.arrow - config_name: yap_Latn data_files: - split: taxi1500 path: yap_Latn/taxi1500/*.arrow - config_name: yaq_Latn data_files: - split: taxi1500 path: yaq_Latn/taxi1500/*.arrow - config_name: yby_Latn data_files: - split: taxi1500 path: yby_Latn/taxi1500/*.arrow - config_name: ycn_Latn data_files: - split: taxi1500 path: ycn_Latn/taxi1500/*.arrow - config_name: yij_Latn data_files: - split: taxi1500 path: yij_Latn/taxi1500/*.arrow - config_name: yka_Latn data_files: - split: taxi1500 path: yka_Latn/taxi1500/*.arrow - config_name: yle_Latn data_files: - split: taxi1500 path: yle_Latn/taxi1500/*.arrow - config_name: yml_Latn data_files: - split: taxi1500 path: yml_Latn/taxi1500/*.arrow - config_name: yom_Latn data_files: - split: taxi1500 path: yom_Latn/taxi1500/*.arrow - config_name: yon_Latn data_files: - split: taxi1500 path: yon_Latn/taxi1500/*.arrow - config_name: yor_Latn data_files: - split: taxi1500 path: yor_Latn/taxi1500/*.arrow - config_name: yrb_Latn data_files: - split: taxi1500 path: yrb_Latn/taxi1500/*.arrow - config_name: yre_Latn data_files: - split: taxi1500 path: yre_Latn/taxi1500/*.arrow - config_name: yss_Latn data_files: - split: taxi1500 path: 
yss_Latn/taxi1500/*.arrow - config_name: yuj_Latn data_files: - split: taxi1500 path: yuj_Latn/taxi1500/*.arrow - config_name: yut_Latn data_files: - split: taxi1500 path: yut_Latn/taxi1500/*.arrow - config_name: yuw_Latn data_files: - split: taxi1500 path: yuw_Latn/taxi1500/*.arrow - config_name: yva_Latn data_files: - split: taxi1500 path: yva_Latn/taxi1500/*.arrow - config_name: zaa_Latn data_files: - split: taxi1500 path: zaa_Latn/taxi1500/*.arrow - config_name: zab_Latn data_files: - split: taxi1500 path: zab_Latn/taxi1500/*.arrow - config_name: zac_Latn data_files: - split: taxi1500 path: zac_Latn/taxi1500/*.arrow - config_name: zad_Latn data_files: - split: taxi1500 path: zad_Latn/taxi1500/*.arrow - config_name: zai_Latn data_files: - split: taxi1500 path: zai_Latn/taxi1500/*.arrow - config_name: zaj_Latn data_files: - split: taxi1500 path: zaj_Latn/taxi1500/*.arrow - config_name: zam_Latn data_files: - split: taxi1500 path: zam_Latn/taxi1500/*.arrow - config_name: zao_Latn data_files: - split: taxi1500 path: zao_Latn/taxi1500/*.arrow - config_name: zap_Latn data_files: - split: taxi1500 path: zap_Latn/taxi1500/*.arrow - config_name: zar_Latn data_files: - split: taxi1500 path: zar_Latn/taxi1500/*.arrow - config_name: zas_Latn data_files: - split: taxi1500 path: zas_Latn/taxi1500/*.arrow - config_name: zat_Latn data_files: - split: taxi1500 path: zat_Latn/taxi1500/*.arrow - config_name: zav_Latn data_files: - split: taxi1500 path: zav_Latn/taxi1500/*.arrow - config_name: zaw_Latn data_files: - split: taxi1500 path: zaw_Latn/taxi1500/*.arrow - config_name: zca_Latn data_files: - split: taxi1500 path: zca_Latn/taxi1500/*.arrow - config_name: zga_Latn data_files: - split: taxi1500 path: zga_Latn/taxi1500/*.arrow - config_name: zho_Hani data_files: - split: taxi1500 path: zho_Hani/taxi1500/*.arrow - config_name: zia_Latn data_files: - split: taxi1500 path: zia_Latn/taxi1500/*.arrow - config_name: ziw_Latn data_files: - split: taxi1500 path: 
ziw_Latn/taxi1500/*.arrow - config_name: zlm_Latn data_files: - split: taxi1500 path: zlm_Latn/taxi1500/*.arrow - config_name: zos_Latn data_files: - split: taxi1500 path: zos_Latn/taxi1500/*.arrow - config_name: zpc_Latn data_files: - split: taxi1500 path: zpc_Latn/taxi1500/*.arrow - config_name: zpi_Latn data_files: - split: taxi1500 path: zpi_Latn/taxi1500/*.arrow - config_name: zpl_Latn data_files: - split: taxi1500 path: zpl_Latn/taxi1500/*.arrow - config_name: zpm_Latn data_files: - split: taxi1500 path: zpm_Latn/taxi1500/*.arrow - config_name: zpo_Latn data_files: - split: taxi1500 path: zpo_Latn/taxi1500/*.arrow - config_name: zpq_Latn data_files: - split: taxi1500 path: zpq_Latn/taxi1500/*.arrow - config_name: zpu_Latn data_files: - split: taxi1500 path: zpu_Latn/taxi1500/*.arrow - config_name: zpv_Latn data_files: - split: taxi1500 path: zpv_Latn/taxi1500/*.arrow - config_name: zpz_Latn data_files: - split: taxi1500 path: zpz_Latn/taxi1500/*.arrow - config_name: zsm_Latn data_files: - split: taxi1500 path: zsm_Latn/taxi1500/*.arrow - config_name: zsr_Latn data_files: - split: taxi1500 path: zsr_Latn/taxi1500/*.arrow - config_name: ztq_Latn data_files: - split: taxi1500 path: ztq_Latn/taxi1500/*.arrow - config_name: zty_Latn data_files: - split: taxi1500 path: zty_Latn/taxi1500/*.arrow - config_name: zyp_Latn data_files: - split: taxi1500 path: zyp_Latn/taxi1500/*.arrow language: - asm - sqi - txq - mpm - qxn - lac - qxo - kaq - mbj - gym - sps - lbm - noa - kgf - aii - wer - zaj - mna - cbu - mcb - xnn - cnl - eko - pol - pjt - mkl - djj - chq - bjz - juy - car - kje - msb - sby - cpc - bhl - nde - mwc - mjc - awk - nhu - por - geb - omb - tbf - mps - ons - klt - spa - zsm - ron - kue - mic - dad - mbh - nld - zpl - nii - cek - kup - bzj - hop - att - tna - jvn - xla - cof - mih - bjr - dwr - zav - khz - tke - kdc - aui - tuc - mar - tew - bch - gmv - yre - aer - apn - pib - yao - cpa - nog - ksj - msc - bkx - yle - ubu - qvn - far - myu - ptu - poe - apw 
- beo - kwd - amu - huu - bon - mux - yka - wnu - wuv - cbc - bfz - imo - ghs - beu - hau - kud - kvg - mig - pls - cbv - pri - kjs - rmn - for - tim - tgl - apu - knj - lit - mxt - hwc - tca - qvc - hrv - maa - mcp - hus - toj - hbo - sja - kwf - bnp - leu - jiv - pir - mmo - glk - bgc - uvh - cbr - ton - gam - kqc - wiu - zca - top - atb - fin - nlg - kpf - lug - kyf - usa - kwj - sbd - jao - rug - yon - kpj - ood - kqw - msy - tkr - dgr - yaa - hix - acu - boa - peg - piu - kqa - kkl - mop - big - cjo - cpb - lgl - djr - shp - trc - myk - yml - mox - obo - ame - amp - cak - mbb - vid - ahr - aon - sua - azg - jid - qvh - mti - ura - hoy - ubr - zaa - qvw - tte - emp - ata - nag - rwo - ikk - nin - ngu - inb - mcd - ena - apy - fue - arn - mir - tel - tee - gum - tam - mxp - dak - gue - kan - xtm - cco - pon - bmr - azz - kkc - aly - gvn - lat - mpt - alp - dji - ebk - tha - amk - glv - sna - vie - yad - chz - mbt - cso - moh - spp - dwu - bqp - wed - adt - bsj - mto - lif - ian - enq - maz - aoi - ssx - nmw - bea - zam - kwi - gdn - cav - kbm - bjk - gof - tmd - bmu - cap - zar - dik - gnw - bmk - waj - pot - cth - txu - tet - poy - bre - cub - nab - jpn - cuc - aka - soy - yrb - wlg - kew - mwe - bjp - bhd - rai - tnp - dgc - tnc - bvr - hun - srq - mle - aai - ssd - cjv - wiv - cha - mbl - xtd - gla - ino - zad - tnk - nch - aoj - pan - twi - mks - tue - zga - yor - poh - stp - cym - cac - tif - lbb - mgw - xed - quf - meq - zyp - plt - kms - cni - tku - mcq - esk - snx - nhg - ceg - gah - guo - hlt - qve - sab - kik - cop - tuo - kze - nvm - ign - nif - cbk - kbq - nyu - agg - crx - qxh - uvl - mdy - sue - ksw - mgc - kfw - tsn - cme - nhi - klv - hvn - agr - qwh - cux - ikw - oji - akh - grc - got - kij - hui - reg - ksr - sbe - auc - heg - cya - haw - sbk - seh - maj - quw - als - yuj - fuh - mya - swe - mie - aaz - gyr - ncj - soq - ken - ptp - kyg - khs - zos - yby - lrg - kqf - kxv - kyq - tvt - amm - ckb - zlm - kql - gul - nob - ory - nys - bmh - wmw - 
gnn - miz - swh - zap - zpm - atd - nop - bla - isl - atg - cuk - too - ixl - box - mzz - gng - gux - hat - kos - rgu - tcs - tdx - lzh - yss - emi - sey - quc - qub - etr - agd - pma - otm - hns - kbh - lex - chd - hto - bki - pwg - ote - roo - alq - mqb - arb - cbt - mco - smk - ndg - msa - ong - aak - tsw - tgj - tzj - ape - rus - ziw - taw - ilo - cui - bef - zab - llg - rmc - wrs - mil - toc - cao - sgz - zas - kmh - nhe - kde - tod - urt - tar - bkq - are - gup - mva - xnj - tpa - tpi - wro - ztq - kyz - ceb - fil - hla - gaz - iws - nho - ben - urb - nuy - arp - dan - wnc - dob - mcf - gvc - kux - iou - ntj - ots - thd - wbp - ind - abx - awb - aey - bjv - otn - bbb - yal - tgk - bsp - bco - tbo - gui - sll - dww - gia - bdv - tnn - myy - snn - quh - cbi - tbc - jac - azb - kne - maq - mee - suz - wbi - nna - mkn - cnt - srn - opm - eri - aby - byr - dif - avt - faa - qvm - srp - gfk - bus - ibo - gvs - mpp - nlx - agn - kgk - agu - bgg - nnq - kpr - unx - wal - rmy - buk - cmn - knf - naf - cbs - luo - zpz - coe - ctu - mbc - met - mpj - mqj - amr - mav - omw - cta - dwy - nak - ter - xon - bpx - kpx - mph - aze - wat - ipi - bht - ekk - bkd - tiw - jae - anh - bhg - hin - muy - yuw - bss - cut - nas - sch - bdd - rmq - urd - uli - gai - guh - jic - kiz - kmu - sgb - bps - fuf - kjn - agm - mni - tvk - lcm - lin - pab - tos - zai - ngp - vmy - npl - gqr - bpr - cgc - heb - qul - okv - eus - otq - yij - mlh - caa - dah - ukr - nay - fra - pad - zaw - yut - hch - tlf - ded - rup - aau - zat - zia - sbs - sxb - kmk - viv - nou - wos - mau - zpc - mfy - wim - gwi - kto - amf - ces - ssg - mal - amo - ntu - ntp - hmo - acf - fai - cpy - auy - bgt - myw - san - tac - nbq - lww - msm - dhg - npi - tof - udu - qup - dso - kyc - djk - mkj - adz - mam - sny - rop - ttc - aso - mca - ruf - daa - bod - meu - amx - apb - cab - spm - agt - zpv - aom - nhw - mwf - shj - uri - gun - zsr - tpt - bzh - kbc - tuf - nfa - snc - nca - sri - acr - tcz - arz - kmg - taj - aia - 
mcr - mit - bbr - guj - spy - qvz - ctp - byx - nrf - mio - csy - uig - apr - sus - epo - zty - kky - ycn - nce - bzd - bqc - knv - kpw - ncl - prf - hub - zao - mmx - gaq - bsn - eng - ppo - zpo - lid - deu - abt - con - msk - xbi - enm - dop - row - nss - zpq - ndj - ncu - ake - tfr - wol - gub - blz - mxq - nno - sim - kca - wap - ese - jni - isn - bxh - rmb - bgs - gaw - kvn - nwi - bao - pio - nya - cwe - swp - kgp - awx - wmt - pah - usp - nhr - nko - hot - lbk - plu - mib - kdl - boj - not - cot - xav - kmo - wrk - zpi - btt - chk - ksd - tbg - dao - wsk - cle - tzo - yap - tav - clu - tiy - ktm - yom - kek - zac - mvn - snp - mgh - kpg - spl - ita - bwo - som - blw - dgz - zho - mek - tdt - huv - mpx - upv - tpz - kiw - rro - zpu - nlc - gdr - mlp - gvf - apz - srm - mwp - cax - dov - ewe - cpu - arl - rkb - tbl - amn - tgp - mxb - urw - pao - tbk - guc - yaq - poi - yva - ffm - ulk - xsi - chf - nhy - crn - caf - anv - bba - med - qvs - tgo - pes - bvd - mbs - nsn - tbz - aln tags: - multilingual pretty_name: Taxi1500 Corpus license: other license_name: license license_link: LICENSE --- # Taxi1500 Raw Data ## Introduction This repository contains the raw text data of the Taxi1500-c_v3.0 corpus, without classification labels and Bible verse ids. For the original Taxi1500 dataset for Text Classification, please refer to the [GitHub repository](https://github.com/cisnlp/Taxi1500/tree/main). The data format of the Taxi1500-RawData is identical to that of the [Glot500 Dataset](https://huggingface.co/datasets/cis-lmu/Glot500), facilitating seamless parallel utilization of both datasets. ## Usage Replace `acr_Latn` with your specific language. 
```python from datasets import load_dataset dataset = load_dataset('cis-lmu/Taxi1500-RawData', 'acr_Latn', split='taxi1500') print(dataset[0]) # First row of acr_Latn ``` <details> <summary>Click to show supported language-script pairs:</summary> ``` aai_Latn aak_Latn aau_Latn aaz_Latn abt_Latn abx_Latn aby_Latn acf_Latn acr_Latn acu_Latn adt_Latn adz_Latn aer_Latn aey_Latn agd_Latn agg_Latn agm_Latn agn_Latn agr_Latn agt_Latn agu_Latn ahr_Deva aia_Latn aii_Syrc aka_Latn ake_Latn akh_Latn aln_Latn alp_Latn alq_Latn als_Latn aly_Latn ame_Latn amf_Latn amk_Latn amm_Latn amn_Latn amo_Latn amp_Latn amr_Latn amu_Latn amx_Latn anh_Latn anv_Latn aoi_Latn aoj_Latn aom_Latn aon_Latn apb_Latn ape_Latn apn_Latn apr_Latn apu_Latn apw_Latn apy_Latn apz_Latn arb_Arab are_Latn arl_Latn arn_Latn arp_Latn arz_Arab asm_Beng aso_Latn ata_Latn atb_Latn atd_Latn atg_Latn att_Latn auc_Latn aui_Latn auy_Latn avt_Latn awb_Latn awk_Latn awx_Latn azb_Latn aze_Latn azg_Latn azz_Latn bao_Latn bba_Latn bbb_Latn bbr_Latn bch_Latn bco_Latn bdd_Latn bdv_Orya bea_Latn bef_Latn ben_Beng beo_Latn beu_Latn bfz_Deva bgc_Deva bgg_Latn bgs_Latn bgt_Latn bhd_Deva bhg_Latn bhl_Latn bht_Deva big_Latn bjk_Latn bjp_Latn bjr_Latn bjv_Latn bjz_Latn bkd_Latn bki_Latn bkq_Latn bkx_Latn bla_Latn blw_Latn blz_Latn bmh_Latn bmk_Latn bmr_Latn bmu_Latn bnp_Latn boa_Latn bod_Tibt boj_Latn bon_Latn box_Latn bpr_Latn bps_Latn bpx_Deva bqc_Latn bqp_Latn bre_Latn bsj_Latn bsn_Latn bsp_Latn bss_Latn btt_Latn buk_Latn bus_Latn bvd_Latn bvr_Latn bwo_Latn bxh_Latn byr_Latn byx_Latn bzd_Latn bzh_Latn bzj_Latn caa_Latn cab_Latn cac_Latn caf_Latn cak_Latn cao_Latn cap_Latn car_Latn cav_Latn cax_Latn cbc_Latn cbi_Latn cbk_Latn cbr_Latn cbs_Latn cbt_Latn cbu_Latn cbv_Latn cco_Latn ceb_Latn ceg_Latn cek_Latn ces_Latn cgc_Latn cha_Latn chd_Latn chf_Latn chk_Latn chq_Latn chz_Latn cjo_Latn cjv_Latn ckb_Arab cle_Latn clu_Latn cme_Latn cmn_Hani cni_Latn cnl_Latn cnt_Latn coe_Latn cof_Latn con_Latn cop_Copt cot_Latn cpa_Latn cpb_Latn 
cpc_Latn cpu_Latn cpy_Latn crn_Latn crx_Latn cso_Latn csy_Latn cta_Latn cth_Latn ctp_Latn ctu_Latn cub_Latn cuc_Latn cui_Latn cuk_Latn cut_Latn cux_Latn cwe_Latn cya_Latn cym_Latn daa_Latn dad_Latn dah_Latn dak_Latn dan_Latn dao_Latn ded_Latn deu_Latn dgc_Latn dgr_Latn dgz_Latn dhg_Latn dif_Latn dik_Latn dji_Latn djj_Latn djk_Latn djr_Latn dob_Latn dop_Latn dov_Latn dso_Orya dwr_Ethi dwr_Latn dwu_Latn dww_Latn dwy_Latn ebk_Latn ekk_Latn eko_Latn emi_Latn emp_Latn ena_Latn eng_Latn enm_Latn enq_Latn epo_Latn eri_Latn ese_Latn esk_Latn etr_Latn eus_Latn ewe_Latn faa_Latn fai_Latn far_Latn ffm_Latn fil_Latn fin_Latn for_Latn fra_Latn fue_Latn fuf_Latn fuh_Latn gah_Latn gai_Latn gam_Latn gaq_Orya gaw_Latn gaz_Latn gdn_Latn gdr_Latn geb_Latn gfk_Latn ghs_Latn gia_Latn gla_Latn glk_Arab glv_Latn gmv_Ethi gmv_Latn gng_Latn gnn_Latn gnw_Latn gof_Ethi gof_Latn got_Latn gqr_Latn grc_Grek gub_Latn guc_Latn gue_Latn guh_Latn gui_Latn guj_Gujr gul_Latn gum_Latn gun_Latn guo_Latn gup_Latn gux_Latn gvc_Latn gvf_Latn gvn_Latn gvs_Latn gwi_Latn gym_Latn gyr_Latn hat_Latn hau_Latn haw_Latn hbo_Hebr hch_Latn heb_Hebr heg_Latn hin_Deva hix_Latn hla_Latn hlt_Latn hmo_Latn hns_Latn hop_Latn hot_Latn hoy_Deva hrv_Latn hto_Latn hub_Latn hui_Latn hun_Latn hus_Latn huu_Latn huv_Latn hvn_Latn hwc_Latn ian_Latn ibo_Latn ign_Latn ikk_Latn ikw_Latn ilo_Latn imo_Latn inb_Latn ind_Latn ino_Latn iou_Latn ipi_Latn isl_Latn isn_Latn ita_Latn iws_Latn ixl_Latn jac_Latn jae_Latn jao_Latn jic_Latn jid_Latn jiv_Latn jni_Latn jpn_Jpan juy_Orya jvn_Latn kan_Knda kan_Latn kaq_Latn kbc_Latn kbh_Latn kbm_Latn kbq_Latn kca_Cyrl kdc_Latn kde_Latn kdl_Latn kek_Latn ken_Latn kew_Latn kfw_Latn kgf_Latn kgk_Latn kgp_Latn khs_Latn khz_Latn kij_Latn kik_Latn kiw_Latn kiz_Latn kje_Latn kjn_Latn kjs_Latn kkc_Latn kkl_Latn kky_Latn klt_Latn klv_Latn kmg_Latn kmh_Latn kmk_Latn kmo_Latn kms_Latn kmu_Latn kne_Latn knf_Latn knj_Latn knv_Latn kos_Latn kpf_Latn kpg_Latn kpj_Latn kpr_Latn kpw_Latn kpx_Latn kqa_Latn kqc_Latn 
kqf_Latn kql_Latn kqw_Latn ksd_Latn ksj_Latn ksr_Latn ksw_Mymr ktm_Latn kto_Latn kud_Latn kue_Latn kup_Latn kux_Latn kvg_Latn kvn_Latn kwd_Latn kwf_Latn kwi_Latn kwj_Latn kxv_Orya kyc_Latn kyf_Latn kyg_Latn kyq_Latn kyz_Latn kze_Latn lac_Latn lat_Latn lbb_Latn lbk_Latn lbm_Deva lcm_Latn leu_Latn lex_Latn lgl_Latn lid_Latn lif_Deva lif_Limb lin_Latn lit_Latn llg_Latn lrg_Latn lug_Latn luo_Latn lww_Latn lzh_Hani maa_Latn maj_Latn mal_Mlym mam_Latn maq_Latn mar_Deva mau_Latn mav_Latn maz_Latn mbb_Latn mbc_Latn mbh_Latn mbj_Latn mbl_Latn mbs_Latn mbt_Latn mca_Latn mcb_Latn mcd_Latn mcf_Latn mco_Latn mcp_Latn mcq_Latn mcr_Latn mdy_Ethi med_Latn mee_Latn mek_Latn meq_Latn met_Latn meu_Latn mfy_Latn mgc_Latn mgh_Latn mgw_Latn mib_Latn mic_Latn mie_Latn mig_Latn mih_Latn mil_Latn mio_Latn mir_Latn mit_Latn miz_Latn mjc_Latn mkj_Latn mkl_Latn mkn_Latn mks_Latn mle_Latn mlh_Latn mlp_Latn mmo_Latn mmx_Latn mna_Latn mni_Latn moh_Latn mop_Latn mox_Latn mph_Latn mpj_Latn mpm_Latn mpp_Latn mps_Latn mpt_Latn mpx_Latn mqb_Latn mqj_Latn msa_Latn msb_Latn msc_Latn msk_Latn msm_Latn msy_Latn mti_Latn mto_Latn mux_Latn muy_Latn mva_Latn mvn_Latn mwc_Latn mwe_Latn mwf_Latn mwp_Latn mxb_Latn mxp_Latn mxq_Latn mxt_Latn mya_Mymr myk_Latn myu_Latn myw_Latn myy_Latn mzz_Latn nab_Latn naf_Latn nag_Latn nak_Latn nas_Latn nay_Latn nbq_Latn nca_Latn nce_Latn nch_Latn ncj_Latn ncl_Latn ncu_Latn nde_Latn ndg_Latn ndj_Latn nfa_Latn ngp_Latn ngu_Latn nhe_Latn nhg_Latn nhi_Latn nho_Latn nhr_Latn nhu_Latn nhw_Latn nhy_Latn nif_Latn nii_Latn nin_Latn nko_Latn nlc_Latn nld_Latn nlg_Latn nlx_Deva nmw_Latn nna_Latn nno_Latn nnq_Latn noa_Latn nob_Latn nog_Cyrl nop_Latn not_Latn nou_Latn npi_Deva npl_Latn nrf_Latn nsn_Latn nss_Latn ntj_Latn ntp_Latn ntu_Latn nuy_Latn nvm_Latn nwi_Latn nya_Latn nys_Latn nyu_Latn obo_Latn oji_Latn okv_Latn omb_Latn omw_Latn ong_Latn ons_Latn ood_Latn opm_Latn ory_Orya ote_Latn otm_Latn otn_Latn otq_Latn ots_Latn pab_Latn pad_Latn pah_Latn pan_Guru pao_Latn peg_Orya pes_Arab 
pib_Latn pio_Latn pir_Latn piu_Latn pjt_Latn pls_Latn plt_Latn plu_Latn pma_Latn poe_Latn poh_Latn poi_Latn pol_Latn pon_Latn por_Latn pot_Latn poy_Latn ppo_Latn prf_Latn pri_Latn ptp_Latn ptu_Latn pwg_Latn qub_Latn quc_Latn quf_Latn quh_Latn qul_Latn qup_Latn quw_Latn qvc_Latn qve_Latn qvh_Latn qvm_Latn qvn_Latn qvs_Latn qvw_Latn qvz_Latn qwh_Latn qxh_Latn qxn_Latn qxo_Latn rai_Latn reg_Latn rgu_Latn rkb_Latn rmb_Latn rmc_Cyrl rmc_Latn rmn_Cyrl rmn_Latn rmq_Latn rmy_Cyrl rmy_Latn ron_Cyrl ron_Latn roo_Latn rop_Latn row_Latn rro_Latn ruf_Latn rug_Latn rup_Latn rus_Cyrl rwo_Latn sab_Latn san_Arab san_Beng san_Deva san_Gujr san_Guru san_Khmr san_Knda san_Latn san_Mlym san_Mymr san_Orya san_Sinh san_Taml san_Telu san_Thai san_Tibt sbd_Latn sbe_Latn sbk_Latn sbs_Latn sby_Latn sch_Latn seh_Latn sey_Latn sgb_Latn sgz_Latn shj_Latn shp_Latn sim_Latn sja_Latn sll_Latn smk_Latn sna_Latn snc_Latn snn_Latn snp_Latn snx_Latn sny_Latn som_Latn soq_Latn soy_Latn spa_Latn spl_Latn spm_Latn spp_Latn sps_Latn spy_Latn sqi_Latn sri_Latn srm_Latn srn_Latn srp_Latn srq_Latn ssd_Latn ssg_Latn ssx_Latn stp_Latn sua_Latn sue_Latn sus_Arab sus_Latn suz_Deva swe_Latn swh_Latn swp_Latn sxb_Latn tac_Latn taj_Deva tam_Taml tar_Latn tav_Latn taw_Latn tbc_Latn tbf_Latn tbg_Latn tbk_Latn tbl_Latn tbo_Latn tbz_Latn tca_Latn tcs_Latn tcz_Latn tdt_Latn tdx_Latn tee_Latn tel_Telu ter_Latn tet_Latn tew_Latn tfr_Latn tgj_Latn tgk_Cyrl tgl_Latn tgo_Latn tgp_Latn tha_Thai thd_Latn tif_Latn tim_Latn tiw_Latn tiy_Latn tke_Latn tkr_Latn tku_Latn tlf_Latn tmd_Latn tna_Latn tnc_Latn tnk_Latn tnn_Latn tnp_Latn toc_Latn tod_Latn tof_Latn toj_Latn ton_Latn too_Latn top_Latn tos_Latn tpa_Latn tpi_Latn tpt_Latn tpz_Latn trc_Latn tsn_Latn tsw_Latn ttc_Latn tte_Latn tuc_Latn tue_Latn tuf_Latn tuo_Latn tvk_Latn tvt_Latn twi_Latn txq_Latn txu_Latn tzj_Latn tzo_Latn ubr_Latn ubu_Latn udu_Latn uig_Arab uig_Cyrl uig_Latn ukr_Cyrl uli_Latn ulk_Latn unx_Orya upv_Latn ura_Latn urb_Latn urd_Arab urd_Deva urd_Latn uri_Latn 
urt_Latn urw_Latn usa_Latn usp_Latn uvh_Latn uvl_Latn vid_Latn vie_Latn viv_Latn vmy_Latn waj_Latn wal_Latn wap_Latn wat_Latn wbi_Latn wbp_Latn wed_Latn wer_Latn wim_Latn wiu_Latn wiv_Latn wlg_Latn wmt_Latn wmw_Latn wnc_Latn wnu_Latn wol_Latn wos_Latn wrk_Latn wro_Latn wrs_Latn wsk_Latn wuv_Latn xav_Latn xbi_Latn xed_Latn xla_Latn xnj_Latn xnn_Latn xon_Latn xsi_Latn xtd_Latn xtm_Latn yaa_Latn yad_Latn yal_Latn yao_Latn yap_Latn yaq_Latn yby_Latn ycn_Latn yij_Latn yka_Latn yle_Latn yml_Latn yom_Latn yon_Latn yor_Latn yrb_Latn yre_Latn yss_Latn yuj_Latn yut_Latn yuw_Latn yva_Latn zaa_Latn zab_Latn zac_Latn zad_Latn zai_Latn zaj_Latn zam_Latn zao_Latn zap_Latn zar_Latn zas_Latn zat_Latn zav_Latn zaw_Latn zca_Latn zga_Latn zho_Hani zia_Latn ziw_Latn zlm_Latn zos_Latn zpc_Latn zpi_Latn zpl_Latn zpm_Latn zpo_Latn zpq_Latn zpu_Latn zpv_Latn zpz_Latn zsm_Latn zsr_Latn ztq_Latn zty_Latn zyp_Latn ``` </details> ## Citation If you use our work, please cite: ``` @misc{ma2023taxi1500, title={Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages}, author={Chunlan Ma and Ayyoob ImaniGooghari and Haotian Ye and Ehsaneddin Asgari and Hinrich Schütze}, year={2023}, eprint={2305.08487}, archivePrefix={arXiv}, primaryClass={cs.CL} } ```

This dataset provides one configuration per language-script pair, spanning scripts such as Latin, Arabic, Devanagari, Tibetan, Han, and Coptic. Each configuration is named by a three-letter language code and a four-letter script code (for example, acr_Latn) and exposes a single split named taxi1500, whose data is stored in .arrow files.
Provider:
cis-lmu
Summary of original information

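Dataset overview

Dataset configurations

The dataset contains many configurations, each corresponding to a set of data files. Every configuration has a unique name, such as aai_Latn or aak_Latn, and contains a single data split named taxi1500.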

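Data files

The data files of each configuration follow this path format:

  • config_name/taxi1500/*.arrow

For example:

  • aai_Latn/taxi1500/*.arrow
  • aak_Latn/taxi1500/*.arrow
  • ...

Each configuration's data is stored in Arrow (.arrow) format. The * in the path is a glob wildcard, so a configuration may consist of one or more files.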
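The path layout above can be sketched in Python. The helper names `split_config` and `data_glob` are hypothetical and introduced only for illustration; the sketch assumes every configuration name has the form language_script, as in the examples listed.

```python
from typing import Tuple


def split_config(config_name: str) -> Tuple[str, str]:
    """Split a name such as 'aai_Latn' into (language code, script code)."""
    language, sep, script = config_name.partition("_")
    if not sep or not language or not script:
        raise ValueError(f"expected '<language>_<script>', got {config_name!r}")
    return language, script


def data_glob(config_name: str, split: str = "taxi1500") -> str:
    """Build the glob pattern that matches a configuration's Arrow files."""
    split_config(config_name)  # validate the name before building the path
    return f"{config_name}/{split}/*.arrow"


print(split_config("aai_Latn"))  # ('aai', 'Latn')
print(data_glob("aak_Latn"))     # aak_Latn/taxi1500/*.arrow
```

The pattern is relative to the repository root, so it can be joined with a local checkout directory before globbing.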

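Dataset characteristics

  • Multilingual, multi-script coverage: configuration names pair a language code with a script code such as Latn, Deva, Tibt, or Arab, so the dataset spans many languages and writing systems.
  • Uniform data format: all data files use the .arrow format, which allows them to be processed and analyzed in the same way.
  • Structured storage: data files are organized by configuration name and data split, which makes them easy to manage and retrieve.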
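To illustrate the language-script naming, the sketch below groups a handful of configuration names copied from the listing by language code, which shows that one language can appear in several scripts. The function name `scripts_by_language` and the sample list are assumptions made for this example, not part of the dataset's tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# A small sample of configuration names taken from the dataset card.
SAMPLE_CONFIGS = [
    "aai_Latn", "ahr_Deva", "sus_Arab", "sus_Latn",
    "uig_Arab", "uig_Cyrl", "uig_Latn", "zho_Hani",
]


def scripts_by_language(configs: Iterable[str]) -> Dict[str, List[str]]:
    """Map each language code to the sorted list of scripts it appears in."""
    grouped = defaultdict(set)
    for name in configs:
        language, _, script = name.partition("_")
        grouped[language].add(script)
    return {lang: sorted(scripts) for lang, scripts in grouped.items()}


grouped = scripts_by_language(SAMPLE_CONFIGS)
print(grouped["uig"])  # ['Arab', 'Cyrl', 'Latn']
print(grouped["sus"])  # ['Arab', 'Latn']
```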

Conclusion

This is a structured, multilingual dataset suited to research and applications that need to process text in many languages.
