hotchpotch/nllb-english-bitext-hq
Source: Hugging Face | Updated: 2026-02-09 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/hotchpotch/nllb-english-bitext-hq
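As a minimal sketch of how one language pair from this repository could be read, the snippet below assumes the Hugging Face `datasets` library is installed and that each config name listed in the metadata (for example `afr_Latn`) can be passed as the configuration name to `load_dataset`. The helper names `split_config_name` and `stream_pairs` are hypothetical and exist only for this illustration. Reading the config names as a language code plus a script code is also an assumption based on their visible `xxx_Yyyy` pattern.

```python
from typing import Iterator

# Repository ID as it appears in the download link above.
REPO_ID = "hotchpotch/nllb-english-bitext-hq"


def split_config_name(config_name: str) -> tuple[str, str]:
    """Split a config name such as 'afr_Latn' into (language, script).

    Assumption: every config name has the form '<language>_<script>'.
    """
    language, script = config_name.split("_", 1)
    return language, script


def stream_pairs(config_name: str, limit: int = 3) -> Iterator[dict]:
    """Yield the first `limit` rows of one config's train split.

    Streaming avoids downloading the whole split, which is several
    gigabytes for the larger configs.
    """
    # Imported lazily so split_config_name works without the library.
    from datasets import load_dataset

    rows = load_dataset(REPO_ID, config_name, split="train", streaming=True)
    for index, row in enumerate(rows):
        if index >= limit:
            break
        yield row
```

Calling `stream_pairs("afr_Latn")` should yield dictionaries with the keys declared under `features`: `english`, `translated`, `reranker_score`, `original_index_id`, `score`, and `laser_score`. To download through the mirror in the link above rather than the default hub, the `HF_ENDPOINT` environment variable is commonly set to the mirror's base URL before the library is imported. Treat that as a setup assumption and verify it against your environment.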
Resource description:
---
dataset_info:
- config_name: afr_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 477233974
num_examples: 2506253
download_size: 290735420
dataset_size: 477233974
- config_name: als_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1210602393
num_examples: 4308029
download_size: 765764482
dataset_size: 1210602393
- config_name: arb_Arab
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2768457648
num_examples: 7823346
download_size: 1584214889
dataset_size: 2768457648
- config_name: ast_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 55333464
num_examples: 262313
download_size: 34361201
dataset_size: 55333464
- config_name: azj_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 12590640
num_examples: 77813
download_size: 7082764
dataset_size: 12590640
- config_name: bel_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 42326374
num_examples: 241685
download_size: 23341136
dataset_size: 42326374
- config_name: ben_Beng
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 670551040
num_examples: 2107123
download_size: 325068908
dataset_size: 670551040
- config_name: bre_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 569188
num_examples: 4528
download_size: 265117
dataset_size: 569188
- config_name: bul_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1267898451
num_examples: 3972297
download_size: 699972419
dataset_size: 1267898451
- config_name: cat_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 791190425
num_examples: 3338666
download_size: 496493891
dataset_size: 791190425
- config_name: ceb_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 937574
num_examples: 6552
download_size: 438761
dataset_size: 937574
- config_name: ces_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2135352536
num_examples: 7464670
download_size: 1399675417
dataset_size: 2135352536
- config_name: dan_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2192292058
num_examples: 7721310
download_size: 1394265758
dataset_size: 2192292058
- config_name: deu_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2242544225
num_examples: 8820954
download_size: 1402305956
dataset_size: 2242544225
- config_name: ell_Grek
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2986465462
num_examples: 7171541
download_size: 1672115722
dataset_size: 2986465462
- config_name: epo_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1187167763
num_examples: 4119058
download_size: 790952966
dataset_size: 1187167763
- config_name: est_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 697640815
num_examples: 3185349
download_size: 437145330
dataset_size: 697640815
- config_name: eus_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 189097266
num_examples: 1091280
download_size: 113212880
dataset_size: 189097266
- config_name: fin_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1837298712
num_examples: 6805803
download_size: 1169313710
dataset_size: 1837298712
- config_name: fra_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2249351614
num_examples: 8779308
download_size: 1384434067
dataset_size: 2249351614
- config_name: fry_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 23262633
num_examples: 169720
download_size: 13458061
dataset_size: 23262633
- config_name: gla_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 135913
num_examples: 1377
download_size: 68049
dataset_size: 135913
- config_name: gle_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 231408603
num_examples: 1143612
download_size: 145652518
dataset_size: 231408603
- config_name: glg_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 491628308
num_examples: 2593749
download_size: 305335704
dataset_size: 491628308
- config_name: guj_Gujr
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 979318339
num_examples: 2853600
download_size: 486507148
dataset_size: 979318339
- config_name: hat_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 47577631
num_examples: 251480
download_size: 31231090
dataset_size: 47577631
- config_name: hau_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 13087999
num_examples: 47709
download_size: 3999205
dataset_size: 13087999
- config_name: heb_Hebr
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 778274594
num_examples: 3413852
download_size: 441801820
dataset_size: 778274594
- config_name: hin_Deva
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 791684606
num_examples: 2270876
download_size: 389714841
dataset_size: 791684606
- config_name: hrv_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 724348331
num_examples: 3146308
download_size: 463838503
dataset_size: 724348331
- config_name: hun_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1981517035
num_examples: 6850414
download_size: 1272401117
dataset_size: 1981517035
- config_name: hye_Armn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 327319828
num_examples: 1182510
download_size: 185169670
dataset_size: 327319828
- config_name: ibo_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 899001
num_examples: 6776
download_size: 398267
dataset_size: 899001
- config_name: ilo_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 856298
num_examples: 5850
download_size: 391329
dataset_size: 856298
- config_name: ind_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2170223836
num_examples: 8001918
download_size: 1318792596
dataset_size: 2170223836
- config_name: isl_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 325658994
num_examples: 1985550
download_size: 192271073
dataset_size: 325658994
- config_name: ita_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2308578213
num_examples: 8600910
download_size: 1447684117
dataset_size: 2308578213
- config_name: jav_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 222345070
num_examples: 1105178
download_size: 141293556
dataset_size: 222345070
- config_name: jpn_Jpan
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1143890941
num_examples: 4643143
download_size: 714286139
dataset_size: 1143890941
- config_name: kan_Knda
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1063420152
num_examples: 2949512
download_size: 511989897
dataset_size: 1063420152
- config_name: kat_Geor
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 282694914
num_examples: 955993
download_size: 136339893
dataset_size: 282694914
- config_name: kaz_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 371866101
num_examples: 1619591
download_size: 206717080
dataset_size: 371866101
- config_name: khk_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 216148331
num_examples: 1016298
download_size: 119945269
dataset_size: 216148331
- config_name: kin_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 24674479
num_examples: 144233
download_size: 16113209
dataset_size: 24674479
- config_name: kir_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 240056146
num_examples: 1104321
download_size: 135094711
dataset_size: 240056146
- config_name: kor_Hang
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 947828791
num_examples: 3500753
download_size: 601969723
dataset_size: 947828791
- config_name: lat_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 18604797
num_examples: 139856
download_size: 10722549
dataset_size: 18604797
- config_name: lit_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 767419886
num_examples: 3287855
download_size: 484049410
dataset_size: 767419886
- config_name: ltz_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 92352850
num_examples: 580215
download_size: 60413824
dataset_size: 92352850
- config_name: lvs_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 60662924
num_examples: 274868
download_size: 37055650
dataset_size: 60662924
- config_name: mal_Mlym
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 760287615
num_examples: 2583061
download_size: 356248841
dataset_size: 760287615
- config_name: mar_Deva
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 447639396
num_examples: 1935657
download_size: 219334638
dataset_size: 447639396
- config_name: mkd_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 934358063
num_examples: 3120744
download_size: 509341533
dataset_size: 934358063
- config_name: mlt_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 126380928
num_examples: 395657
download_size: 79372258
dataset_size: 126380928
- config_name: mya_Mymr
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 251978942
num_examples: 964409
download_size: 119394198
dataset_size: 251978942
- config_name: nld_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2318041902
num_examples: 8573018
download_size: 1456744386
dataset_size: 2318041902
- config_name: nob_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 767022603
num_examples: 3721296
download_size: 477746429
dataset_size: 767022603
- config_name: npi_Deva
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 393342106
num_examples: 1582342
download_size: 193645976
dataset_size: 393342106
- config_name: oci_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 31531530
num_examples: 251336
download_size: 17762288
dataset_size: 31531530
- config_name: ory_Orya
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 359123607
num_examples: 1460880
download_size: 174305202
dataset_size: 359123607
- config_name: pan_Guru
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 408233095
num_examples: 1412843
download_size: 204715210
dataset_size: 408233095
- config_name: pap_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 65776192
num_examples: 491934
download_size: 41868980
dataset_size: 65776192
- config_name: pbt_Arab
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 281695138
num_examples: 1144240
download_size: 160387676
dataset_size: 281695138
- config_name: pes_Arab
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2066973748
num_examples: 6209954
download_size: 1162742137
dataset_size: 2066973748
- config_name: plt_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 105068057
num_examples: 451126
download_size: 59577038
dataset_size: 105068057
- config_name: pol_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2310544716
num_examples: 8044358
download_size: 1495465655
dataset_size: 2310544716
- config_name: por_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2243948320
num_examples: 8643185
download_size: 1398264719
dataset_size: 2243948320
- config_name: ron_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1307862849
num_examples: 4378083
download_size: 853400687
dataset_size: 1307862849
- config_name: rus_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 3120481869
num_examples: 8486540
download_size: 1752229133
dataset_size: 3120481869
- config_name: sin_Sinh
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 445159841
num_examples: 1885142
download_size: 226866794
dataset_size: 445159841
- config_name: slk_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 791002686
num_examples: 3559252
download_size: 508050829
dataset_size: 791002686
- config_name: slv_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 701999384
num_examples: 3209953
download_size: 445484696
dataset_size: 701999384
- config_name: snd_Arab
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 50170771
num_examples: 177527
download_size: 22953989
dataset_size: 50170771
- config_name: som_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2288568
num_examples: 8729
download_size: 758503
dataset_size: 2288568
- config_name: spa_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2134491826
num_examples: 8795040
download_size: 1315072103
dataset_size: 2134491826
- config_name: srp_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 729103397
num_examples: 3193478
download_size: 444593119
dataset_size: 729103397
- config_name: sun_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 154176557
num_examples: 812042
download_size: 98511663
dataset_size: 154176557
- config_name: swe_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2247324775
num_examples: 8250514
download_size: 1419538631
dataset_size: 2247324775
- config_name: swh_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 317196552
num_examples: 1328160
download_size: 197940514
dataset_size: 317196552
- config_name: tam_Taml
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 759689430
num_examples: 2536971
download_size: 349800070
dataset_size: 759689430
- config_name: tat_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 87298087
num_examples: 493370
download_size: 49049276
dataset_size: 87298087
- config_name: tel_Telu
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 995445822
num_examples: 2540610
download_size: 479123861
dataset_size: 995445822
- config_name: tgk_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 76532053
num_examples: 347528
download_size: 42880033
dataset_size: 76532053
- config_name: tgl_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 579306357
num_examples: 2624002
download_size: 359367672
dataset_size: 579306357
- config_name: tur_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1729229613
num_examples: 6886531
download_size: 1097257410
dataset_size: 1729229613
- config_name: ukr_Cyrl
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1156172002
num_examples: 3740610
download_size: 656048070
dataset_size: 1156172002
- config_name: urd_Arab
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 601667955
num_examples: 2313544
download_size: 340835734
dataset_size: 601667955
- config_name: uzn_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 307952970
num_examples: 1428827
download_size: 190953047
dataset_size: 307952970
- config_name: vie_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 2225352917
num_examples: 7301273
download_size: 1313321524
dataset_size: 2225352917
- config_name: xho_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 34974
num_examples: 288
download_size: 19033
dataset_size: 34974
- config_name: ydd_Hebr
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 605968
num_examples: 5135
download_size: 257121
dataset_size: 605968
- config_name: yor_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 13321604
num_examples: 60034
download_size: 8509688
dataset_size: 13321604
- config_name: zho_Hans
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 1455622905
num_examples: 6386961
download_size: 1000638030
dataset_size: 1455622905
- config_name: zho_Hant
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 111538029
num_examples: 783217
download_size: 72847696
dataset_size: 111538029
- config_name: zsm_Latn
features:
- name: english
dtype: string
- name: translated
dtype: string
- name: reranker_score
dtype: float64
- name: original_index_id
dtype: int64
- name: score
dtype: float64
- name: laser_score
dtype: float64
splits:
- name: train
num_bytes: 925367494
num_examples: 3710177
download_size: 568278718
dataset_size: 925367494
configs:
- config_name: afr_Latn
data_files:
- split: train
path: afr_Latn/train-*
- config_name: als_Latn
data_files:
- split: train
path: als_Latn/train-*
- config_name: arb_Arab
data_files:
- split: train
path: arb_Arab/train-*
- config_name: ast_Latn
data_files:
- split: train
path: ast_Latn/train-*
- config_name: azj_Latn
data_files:
- split: train
path: azj_Latn/train-*
- config_name: bel_Cyrl
data_files:
- split: train
path: bel_Cyrl/train-*
- config_name: ben_Beng
data_files:
- split: train
path: ben_Beng/train-*
- config_name: bre_Latn
data_files:
- split: train
path: bre_Latn/train-*
- config_name: bul_Cyrl
data_files:
- split: train
path: bul_Cyrl/train-*
- config_name: cat_Latn
data_files:
- split: train
path: cat_Latn/train-*
- config_name: ceb_Latn
data_files:
- split: train
path: ceb_Latn/train-*
- config_name: ces_Latn
data_files:
- split: train
path: ces_Latn/train-*
- config_name: dan_Latn
data_files:
- split: train
path: dan_Latn/train-*
- config_name: deu_Latn
data_files:
- split: train
path: deu_Latn/train-*
- config_name: ell_Grek
data_files:
- split: train
path: ell_Grek/train-*
- config_name: epo_Latn
data_files:
- split: train
path: epo_Latn/train-*
- config_name: est_Latn
data_files:
- split: train
path: est_Latn/train-*
- config_name: eus_Latn
data_files:
- split: train
path: eus_Latn/train-*
- config_name: fin_Latn
data_files:
- split: train
path: fin_Latn/train-*
- config_name: fra_Latn
data_files:
- split: train
path: fra_Latn/train-*
- config_name: fry_Latn
data_files:
- split: train
path: fry_Latn/train-*
- config_name: gla_Latn
data_files:
- split: train
path: gla_Latn/train-*
- config_name: gle_Latn
data_files:
- split: train
path: gle_Latn/train-*
- config_name: glg_Latn
data_files:
- split: train
path: glg_Latn/train-*
- config_name: guj_Gujr
data_files:
- split: train
path: guj_Gujr/train-*
- config_name: hat_Latn
data_files:
- split: train
path: hat_Latn/train-*
- config_name: hau_Latn
data_files:
- split: train
path: hau_Latn/train-*
- config_name: heb_Hebr
data_files:
- split: train
path: heb_Hebr/train-*
- config_name: hin_Deva
data_files:
- split: train
path: hin_Deva/train-*
- config_name: hrv_Latn
data_files:
- split: train
path: hrv_Latn/train-*
- config_name: hun_Latn
data_files:
- split: train
path: hun_Latn/train-*
- config_name: hye_Armn
data_files:
- split: train
path: hye_Armn/train-*
- config_name: ibo_Latn
data_files:
- split: train
path: ibo_Latn/train-*
- config_name: ilo_Latn
data_files:
- split: train
path: ilo_Latn/train-*
- config_name: ind_Latn
data_files:
- split: train
path: ind_Latn/train-*
- config_name: isl_Latn
data_files:
- split: train
path: isl_Latn/train-*
- config_name: ita_Latn
data_files:
- split: train
path: ita_Latn/train-*
- config_name: jav_Latn
data_files:
- split: train
path: jav_Latn/train-*
- config_name: jpn_Jpan
data_files:
- split: train
path: jpn_Jpan/train-*
- config_name: kan_Knda
data_files:
- split: train
path: kan_Knda/train-*
- config_name: kat_Geor
data_files:
- split: train
path: kat_Geor/train-*
- config_name: kaz_Cyrl
data_files:
- split: train
path: kaz_Cyrl/train-*
- config_name: khk_Cyrl
data_files:
- split: train
path: khk_Cyrl/train-*
- config_name: kin_Latn
data_files:
- split: train
path: kin_Latn/train-*
- config_name: kir_Cyrl
data_files:
- split: train
path: kir_Cyrl/train-*
- config_name: kor_Hang
data_files:
- split: train
path: kor_Hang/train-*
- config_name: lat_Latn
data_files:
- split: train
path: lat_Latn/train-*
- config_name: lit_Latn
data_files:
- split: train
path: lit_Latn/train-*
- config_name: ltz_Latn
data_files:
- split: train
path: ltz_Latn/train-*
- config_name: lvs_Latn
data_files:
- split: train
path: lvs_Latn/train-*
- config_name: mal_Mlym
data_files:
- split: train
path: mal_Mlym/train-*
- config_name: mar_Deva
data_files:
- split: train
path: mar_Deva/train-*
- config_name: mkd_Cyrl
data_files:
- split: train
path: mkd_Cyrl/train-*
- config_name: mlt_Latn
data_files:
- split: train
path: mlt_Latn/train-*
- config_name: mya_Mymr
data_files:
- split: train
path: mya_Mymr/train-*
- config_name: nld_Latn
data_files:
- split: train
path: nld_Latn/train-*
- config_name: nob_Latn
data_files:
- split: train
path: nob_Latn/train-*
- config_name: npi_Deva
data_files:
- split: train
path: npi_Deva/train-*
- config_name: oci_Latn
data_files:
- split: train
path: oci_Latn/train-*
- config_name: ory_Orya
data_files:
- split: train
path: ory_Orya/train-*
- config_name: pan_Guru
data_files:
- split: train
path: pan_Guru/train-*
- config_name: pap_Latn
data_files:
- split: train
path: pap_Latn/train-*
- config_name: pbt_Arab
data_files:
- split: train
path: pbt_Arab/train-*
- config_name: pes_Arab
data_files:
- split: train
path: pes_Arab/train-*
- config_name: plt_Latn
data_files:
- split: train
path: plt_Latn/train-*
- config_name: pol_Latn
data_files:
- split: train
path: pol_Latn/train-*
- config_name: por_Latn
data_files:
- split: train
path: por_Latn/train-*
- config_name: ron_Latn
data_files:
- split: train
path: ron_Latn/train-*
- config_name: rus_Cyrl
data_files:
- split: train
path: rus_Cyrl/train-*
- config_name: sin_Sinh
data_files:
- split: train
path: sin_Sinh/train-*
- config_name: slk_Latn
data_files:
- split: train
path: slk_Latn/train-*
- config_name: slv_Latn
data_files:
- split: train
path: slv_Latn/train-*
- config_name: snd_Arab
data_files:
- split: train
path: snd_Arab/train-*
- config_name: som_Latn
data_files:
- split: train
path: som_Latn/train-*
- config_name: spa_Latn
data_files:
- split: train
path: spa_Latn/train-*
- config_name: srp_Cyrl
data_files:
- split: train
path: srp_Cyrl/train-*
- config_name: sun_Latn
data_files:
- split: train
path: sun_Latn/train-*
- config_name: swe_Latn
data_files:
- split: train
path: swe_Latn/train-*
- config_name: swh_Latn
data_files:
- split: train
path: swh_Latn/train-*
- config_name: tam_Taml
data_files:
- split: train
path: tam_Taml/train-*
- config_name: tat_Cyrl
data_files:
- split: train
path: tat_Cyrl/train-*
- config_name: tel_Telu
data_files:
- split: train
path: tel_Telu/train-*
- config_name: tgk_Cyrl
data_files:
- split: train
path: tgk_Cyrl/train-*
- config_name: tgl_Latn
data_files:
- split: train
path: tgl_Latn/train-*
- config_name: tur_Latn
data_files:
- split: train
path: tur_Latn/train-*
- config_name: ukr_Cyrl
data_files:
- split: train
path: ukr_Cyrl/train-*
- config_name: urd_Arab
data_files:
- split: train
path: urd_Arab/train-*
- config_name: uzn_Latn
data_files:
- split: train
path: uzn_Latn/train-*
- config_name: vie_Latn
data_files:
- split: train
path: vie_Latn/train-*
- config_name: xho_Latn
data_files:
- split: train
path: xho_Latn/train-*
- config_name: ydd_Hebr
data_files:
- split: train
path: ydd_Hebr/train-*
- config_name: yor_Latn
data_files:
- split: train
path: yor_Latn/train-*
- config_name: zho_Hans
data_files:
- split: train
path: zho_Hans/train-*
- config_name: zho_Hant
data_files:
- split: train
path: zho_Hant/train-*
- config_name: zsm_Latn
data_files:
- split: train
path: zsm_Latn/train-*
---
# nllb-english-bitext-hq
🚧 This dataset is under active development and may change.
This dataset contains filtered bitext, pairing English with one non-English language per config, from NLLB ([allenai/nllb](https://huggingface.co/datasets/allenai/nllb)) and CCMatrix.
It is intended for multilingual embedding training, reranker training, and cross-lingual retrieval experiments.
Each row has these fields:
- `english`: English sentence
- `translated`: Non-English sentence
- `reranker_score`: score from `BAAI/bge-reranker-v2-m3`
- `original_index_id`: row index in the source data
- `score`: embedding similarity score from BGE-m3
- `laser_score`: LASER score from source data
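As a minimal sketch of how these fields might be consumed, the snippet below loads one config with the Hugging Face `datasets` library and keeps only rows at or above a reranker threshold. The helper names (`load_config`, `filter_by_reranker`) and the cutoff value are illustrative assumptions, not part of the dataset. The card does not state the scale of `reranker_score`, so inspect the values before choosing a cutoff.

```python
from typing import Any, Dict, Iterable, Iterator

REPO_ID = "hotchpotch/nllb-english-bitext-hq"

# Column names as listed in the dataset card.
FIELDS = (
    "english",
    "translated",
    "reranker_score",
    "original_index_id",
    "score",
    "laser_score",
)


def load_config(config_name: str, streaming: bool = True):
    """Load the train split of one language config (hypothetical helper).

    Requires `pip install datasets` and network access. Streaming avoids
    downloading a multi-gigabyte config just to look at a few rows.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train", streaming=streaming)


def filter_by_reranker(
    rows: Iterable[Dict[str, Any]], min_score: float
) -> Iterator[Dict[str, Any]]:
    """Yield rows whose reranker_score is at least min_score."""
    for row in rows:
        if row["reranker_score"] >= min_score:
            yield row
```

For example, `filter_by_reranker(load_config("xho_Latn"), min_score=0.9)` would stream the smallest config and yield only the higher-scoring pairs; the `0.9` here is a placeholder, not a recommended value.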
⚠️ Important note:
This dataset is filtered with BGE-m3 and `BAAI/bge-reranker-v2-m3`. That means model bias can affect what is kept. For languages where BGE-m3 or the reranker is less reliable, useful pairs may be dropped and the remaining distribution may shift.
## Dataset creation process (rough)
For very large source subsets, we first apply random sampling to cap the working set size before scoring and ranking.
All configs are produced with one rough pipeline:
- Text preprocessing
- `BAAI/bge-reranker-v2-m3` scoring and filtering
- BGE-m3 candidate ranking and Top-K sampling (`top-1` or `top-2`; smaller subsets use `top-2`)
- Near-duplicate cleanup (applied when needed)
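The Top-K step above can be illustrated with a small sketch. The card does not say what the candidates are grouped by, so this example assumes grouping by the English sentence and ranking by the `score` field; the function name `top_k_per_key` is hypothetical, and this is not the author's actual pipeline.

```python
import heapq
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def top_k_per_key(
    rows: Iterable[Dict[str, Any]],
    key: str = "english",   # assumption: candidates share an English sentence
    score: str = "score",   # assumption: rank by the embedding similarity
    k: int = 1,
) -> List[Dict[str, Any]]:
    """Keep the k highest-scoring rows within each group of identical keys."""
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    kept: List[Dict[str, Any]] = []
    for candidates in groups.values():
        # nlargest returns the best k candidates, highest score first.
        kept.extend(heapq.nlargest(k, candidates, key=lambda r: r[score]))
    return kept
```

Calling it with `k=2` would correspond to the `top-2` setting that the card mentions for smaller subsets.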
## Language subsets
| Config | Rows | Source |
| --- | ---: | --- |
| arb_Arab | 7,823,346 | NLLB |
| ben_Beng | 2,107,123 | NLLB |
| ces_Latn | 7,464,670 | CCMatrix |
| dan_Latn | 7,721,310 | CCMatrix |
| deu_Latn | 8,820,954 | NLLB |
| ell_Grek | 7,171,541 | NLLB |
| afr_Latn | 2,506,253 | CCMatrix |
| als_Latn | 4,308,029 | NLLB |
| ast_Latn | 262,313 | CCMatrix |
| azj_Latn | 77,813 | CCMatrix |
| bel_Cyrl | 241,685 | CCMatrix |
| bre_Latn | 4,528 | CCMatrix |
| bul_Cyrl | 3,972,297 | CCMatrix |
| cat_Latn | 3,338,666 | CCMatrix |
| ceb_Latn | 6,552 | CCMatrix |
| epo_Latn | 4,119,058 | NLLB |
| est_Latn | 3,185,349 | CCMatrix |
| eus_Latn | 1,091,280 | CCMatrix |
| fry_Latn | 169,720 | CCMatrix |
| gla_Latn | 1,377 | CCMatrix |
| gle_Latn | 1,143,612 | NLLB |
| glg_Latn | 2,593,749 | CCMatrix |
| guj_Gujr | 2,853,600 | NLLB |
| hat_Latn | 251,480 | NLLB |
| hau_Latn | 47,709 | CCMatrix |
| heb_Hebr | 3,413,852 | CCMatrix |
| hrv_Latn | 3,146,308 | CCMatrix |
| hye_Armn | 1,182,510 | NLLB |
| ibo_Latn | 6,776 | CCMatrix |
| ilo_Latn | 5,850 | CCMatrix |
| isl_Latn | 1,985,550 | CCMatrix |
| jav_Latn | 1,105,178 | NLLB |
| kan_Knda | 2,949,512 | NLLB |
| kat_Geor | 955,993 | NLLB |
| kaz_Cyrl | 1,619,591 | NLLB |
| khk_Cyrl | 1,016,298 | NLLB |
| kin_Latn | 144,233 | NLLB |
| kir_Cyrl | 1,104,321 | NLLB |
| lat_Latn | 139,856 | CCMatrix |
| lit_Latn | 3,287,855 | CCMatrix |
| ltz_Latn | 580,215 | NLLB |
| lvs_Latn | 274,868 | CCMatrix |
| mal_Mlym | 2,583,061 | NLLB |
| mar_Deva | 1,935,657 | NLLB |
| mkd_Cyrl | 3,120,744 | CCMatrix |
| mlt_Latn | 395,657 | NLLB |
| mya_Mymr | 964,409 | NLLB |
| nob_Latn | 3,721,296 | CCMatrix |
| npi_Deva | 1,582,342 | NLLB |
| oci_Latn | 251,336 | CCMatrix |
| ory_Orya | 1,460,880 | NLLB |
| pan_Guru | 1,412,843 | NLLB |
| pap_Latn | 491,934 | NLLB |
| pbt_Arab | 1,144,240 | NLLB |
| plt_Latn | 451,126 | CCMatrix |
| ron_Latn | 4,378,083 | NLLB |
| sin_Sinh | 1,885,142 | NLLB |
| slk_Latn | 3,559,252 | CCMatrix |
| slv_Latn | 3,209,953 | CCMatrix |
| snd_Arab | 177,527 | CCMatrix |
| som_Latn | 8,729 | CCMatrix |
| srp_Cyrl | 3,193,478 | CCMatrix |
| sun_Latn | 812,042 | NLLB |
| tam_Taml | 2,536,971 | NLLB |
| tat_Cyrl | 493,370 | NLLB |
| tgk_Cyrl | 347,528 | NLLB |
| tgl_Latn | 2,624,002 | NLLB |
| ukr_Cyrl | 3,740,610 | CCMatrix |
| urd_Arab | 2,313,544 | NLLB |
| uzn_Latn | 1,428,827 | NLLB |
| xho_Latn | 288 | CCMatrix |
| ydd_Hebr | 5,135 | CCMatrix |
| zho_Hant | 783,217 | NLLB |
| zsm_Latn | 3,710,177 | NLLB |
| fin_Latn | 6,805,803 | CCMatrix |
| fra_Latn | 8,779,308 | NLLB |
| hin_Deva | 2,270,876 | NLLB |
| hun_Latn | 6,850,414 | CCMatrix |
| ind_Latn | 8,001,918 | NLLB |
| ita_Latn | 8,600,910 | NLLB |
| jpn_Jpan | 4,643,143 | CCMatrix |
| kor_Hang | 3,500,753 | CCMatrix |
| nld_Latn | 8,573,018 | NLLB |
| pes_Arab | 6,209,954 | CCMatrix |
| pol_Latn | 8,044,358 | NLLB |
| por_Latn | 8,643,185 | NLLB |
| rus_Cyrl | 8,486,540 | NLLB |
| spa_Latn | 8,795,040 | NLLB |
| swe_Latn | 8,250,514 | NLLB |
| swh_Latn | 1,328,160 | NLLB |
| tel_Telu | 2,540,610 | NLLB |
| tur_Latn | 6,886,531 | CCMatrix |
| vie_Latn | 7,301,273 | CCMatrix |
| yor_Latn | 60,034 | CCMatrix |
| zho_Hans | 6,386,961 | NLLB |
## License
- NLLB-derived subsets follow the NLLB dataset license (ODC-BY): https://huggingface.co/datasets/allenai/nllb
- CCMatrix-derived subsets follow CCMatrix/Common Crawl terms. Check upstream terms if you need strict license compliance.
## Citation and attribution
- For NLLB-derived subsets, cite the [NLLB paper](https://arxiv.org/abs/2207.04672)
- For CCMatrix-derived subsets, cite the [CCMatrix paper](https://arxiv.org/abs/1911.04944)
Provider: hotchpotch



