rayliuca/WikidataLabels
Hugging Face | Updated 2024-01-11 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/rayliuca/WikidataLabels
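The metadata on this page lists one config per language code (for example `aa`, `de`, `en`), each with a single split named `label` and three columns: `wikidata_id`, `lastrevid`, and `label`. The sketch below shows one way a single config might be loaded with the Hugging Face `datasets` library and turned into an ID-to-label lookup. The helper names `load_labels` and `label_lookup` are hypothetical and not part of the dataset. Streaming is used by default here as an assumption, since some configs such as `en` are several gigabytes.

```python
# Repo ID, split name, and column names are taken from the dataset card metadata.
REPO_ID = "rayliuca/WikidataLabels"
SPLIT = "label"  # every config in the card lists exactly one split, named "label"


def load_labels(lang, streaming=True):
    """Load the label split for one language config (hypothetical helper)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, lang, split=SPLIT, streaming=streaming)


def label_lookup(rows, limit=None):
    """Build a {wikidata_id: label} dict from an iterable of row dicts.

    `limit` caps how many rows are read, which is useful with streaming.
    """
    lookup = {}
    for i, row in enumerate(rows):
        if limit is not None and i >= limit:
            break
        lookup[row["wikidata_id"]] = row["label"]
    return lookup
```

For a small config, a call such as `label_lookup(load_labels("agq"))` would download the data and return the full mapping; for large configs, passing `limit` keeps the lookup bounded.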
Resource description:
---
license: cc0-1.0
dataset_info:
- config_name: aa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13986211
num_examples: 436895
download_size: 9821312
dataset_size: 13986211
- config_name: ab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5012532
num_examples: 159908
download_size: 3013706
dataset_size: 5012532
- config_name: abs
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252728
num_examples: 143986
download_size: 2567450
dataset_size: 4252728
- config_name: ace
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 19105673
num_examples: 574712
download_size: 13573374
dataset_size: 19105673
- config_name: ady
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4444259
num_examples: 148627
download_size: 2705754
dataset_size: 4444259
- config_name: ady-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4412556
num_examples: 147884
download_size: 2682170
dataset_size: 4412556
- config_name: aeb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4305734
num_examples: 145198
download_size: 2606368
dataset_size: 4305734
- config_name: aeb-arab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4467930
num_examples: 148796
download_size: 2722169
dataset_size: 4467930
- config_name: aeb-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12770359
num_examples: 404946
download_size: 8886489
dataset_size: 12770359
- config_name: af
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 58561042
num_examples: 1643153
download_size: 42539052
dataset_size: 58561042
- config_name: agq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 1317
num_examples: 33
download_size: 2906
dataset_size: 1317
- config_name: ak
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14198715
num_examples: 443037
download_size: 9991525
dataset_size: 14198715
- config_name: aln
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13811116
num_examples: 432089
download_size: 9673418
dataset_size: 13811116
- config_name: als
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20691
num_examples: 543
download_size: 17540
dataset_size: 20691
- config_name: alt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 108390
num_examples: 1814
download_size: 59046
dataset_size: 108390
- config_name: am
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5231176
num_examples: 163038
download_size: 3187164
dataset_size: 5231176
- config_name: ami
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 21519
num_examples: 686
download_size: 16640
dataset_size: 21519
- config_name: an
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 240345072
num_examples: 5921087
download_size: 164895205
dataset_size: 240345072
- config_name: ang
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14275715
num_examples: 443461
download_size: 10063758
dataset_size: 14275715
- config_name: anp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8558258
num_examples: 241612
download_size: 4381360
dataset_size: 8558258
- config_name: ar
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 291173732
num_examples: 5724064
download_size: 159369497
dataset_size: 291173732
- config_name: arc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4473283
num_examples: 150006
download_size: 2722619
dataset_size: 4473283
- config_name: arn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13879729
num_examples: 433912
download_size: 9715431
dataset_size: 13879729
- config_name: arq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4346991
num_examples: 146004
download_size: 2636972
dataset_size: 4346991
- config_name: ary
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5358568
num_examples: 171568
download_size: 3313402
dataset_size: 5358568
- config_name: arz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 81806333
num_examples: 1669699
download_size: 49423508
dataset_size: 81806333
- config_name: as
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 21658610
num_examples: 450074
download_size: 9641626
dataset_size: 21658610
- config_name: ase
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252943
num_examples: 143986
download_size: 2568106
dataset_size: 4252943
- config_name: ast
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 1385628786
num_examples: 20696237
download_size: 955908362
dataset_size: 1385628786
- config_name: atj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12996229
num_examples: 411639
download_size: 9057557
dataset_size: 12996229
- config_name: av
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4722934
num_examples: 153781
download_size: 2880103
dataset_size: 4722934
- config_name: avk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13194485
num_examples: 414598
download_size: 9200917
dataset_size: 13194485
- config_name: awa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8599312
num_examples: 242320
download_size: 4411751
dataset_size: 8599312
- config_name: ay
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14269432
num_examples: 443521
download_size: 10029939
dataset_size: 14269432
- config_name: az
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 21049248
num_examples: 516732
download_size: 14117527
dataset_size: 21049248
- config_name: azb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 30781587
num_examples: 607562
download_size: 16028687
dataset_size: 30781587
- config_name: ba
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 11525351
num_examples: 261509
download_size: 6733777
dataset_size: 11525351
- config_name: ban
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13674052
num_examples: 426706
download_size: 9513747
dataset_size: 13674052
- config_name: ban-bali
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 50961
num_examples: 748
download_size: 25817
dataset_size: 50961
- config_name: bar
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 54783034
num_examples: 1566120
download_size: 40389830
dataset_size: 54783034
- config_name: bbc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12820895
num_examples: 406960
download_size: 8917054
dataset_size: 12820895
- config_name: bcc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8017228
num_examples: 241977
download_size: 4344579
dataset_size: 8017228
- config_name: be
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 30978832
num_examples: 564184
download_size: 17461174
dataset_size: 30978832
- config_name: be-tarask
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18931909
num_examples: 374396
download_size: 10871239
dataset_size: 18931909
- config_name: bg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 200628708
num_examples: 4383953
download_size: 137745533
dataset_size: 200628708
- config_name: bgn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 7999280
num_examples: 241566
download_size: 4331249
dataset_size: 7999280
- config_name: bi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14040026
num_examples: 438382
download_size: 9867032
dataset_size: 14040026
- config_name: bjn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8375348
num_examples: 254558
download_size: 5722334
dataset_size: 8375348
- config_name: bm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18145787
num_examples: 549694
download_size: 13129193
dataset_size: 18145787
- config_name: bn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 815803977
num_examples: 9767284
download_size: 261147329
dataset_size: 815803977
- config_name: bo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 11671330
num_examples: 278307
download_size: 5669602
dataset_size: 11671330
- config_name: bpy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15497749
num_examples: 347458
download_size: 6991190
dataset_size: 15497749
- config_name: bqi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8017455
num_examples: 241984
download_size: 4345123
dataset_size: 8017455
- config_name: br
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 58304963
num_examples: 1653800
download_size: 42722031
dataset_size: 58304963
- config_name: brh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5328437
num_examples: 171504
download_size: 3376189
dataset_size: 5328437
- config_name: bs
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 30441466
num_examples: 858190
download_size: 21606575
dataset_size: 30441466
- config_name: btm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252525
num_examples: 143980
download_size: 2567218
dataset_size: 4252525
- config_name: bto
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12841721
num_examples: 407470
download_size: 8934218
dataset_size: 12841721
- config_name: bug
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 7595464
num_examples: 235268
download_size: 5129941
dataset_size: 7595464
- config_name: bxr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4713699
num_examples: 153707
download_size: 2869313
dataset_size: 4713699
- config_name: ca
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 408509932
num_examples: 9936886
download_size: 288474980
dataset_size: 408509932
- config_name: cbk-zam
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14108232
num_examples: 440345
download_size: 9920793
dataset_size: 14108232
- config_name: cdo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 6503254
num_examples: 201362
download_size: 4137841
dataset_size: 6503254
- config_name: ce
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 28093148
num_examples: 607767
download_size: 16367596
dataset_size: 28093148
- config_name: ceb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 332947091
num_examples: 7769402
download_size: 219525737
dataset_size: 332947091
- config_name: ch
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13983906
num_examples: 436785
download_size: 9817385
dataset_size: 13983906
- config_name: cho
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13950786
num_examples: 435869
download_size: 9791296
dataset_size: 13950786
- config_name: chr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5386793
num_examples: 172855
download_size: 3419676
dataset_size: 5386793
- config_name: chy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13994916
num_examples: 437007
download_size: 9830465
dataset_size: 13994916
- config_name: ckb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 23343034
num_examples: 511183
download_size: 11459344
dataset_size: 23343034
- config_name: co
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 47080480
num_examples: 1346929
download_size: 34551346
dataset_size: 47080480
- config_name: cps
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12849864
num_examples: 407695
download_size: 8941921
dataset_size: 12849864
- config_name: cr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5516556
num_examples: 176667
download_size: 3532952
dataset_size: 5516556
- config_name: crh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 10864382
num_examples: 336709
download_size: 7542853
dataset_size: 10864382
- config_name: crh-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4419064
num_examples: 148046
download_size: 2688683
dataset_size: 4419064
- config_name: crh-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14201429
num_examples: 442905
download_size: 9986290
dataset_size: 14201429
- config_name: cs
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 140189244
num_examples: 3384048
download_size: 97516751
dataset_size: 140189244
- config_name: csb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20177120
num_examples: 619275
download_size: 14528772
dataset_size: 20177120
- config_name: cv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8047221
num_examples: 215611
download_size: 4857718
dataset_size: 8047221
- config_name: cy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 89241808
num_examples: 2244550
download_size: 62686006
dataset_size: 89241808
- config_name: da
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 130931077
num_examples: 3448894
download_size: 98202417
dataset_size: 130931077
- config_name: dag
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 2664957
num_examples: 78534
download_size: 2052615
dataset_size: 2664957
- config_name: de
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 765398522
num_examples: 17531361
download_size: 527642124
dataset_size: 765398522
- config_name: de-at
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 53043722
num_examples: 1515373
download_size: 38761571
dataset_size: 53043722
- config_name: de-ch
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 53480908
num_examples: 1528137
download_size: 39349412
dataset_size: 53480908
- config_name: de-formal
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4256391
num_examples: 144061
download_size: 2571862
dataset_size: 4256391
- config_name: din
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12819746
num_examples: 406591
download_size: 8922303
dataset_size: 12819746
- config_name: diq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 7570161
num_examples: 232674
download_size: 5057742
dataset_size: 7570161
- config_name: dsb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16135830
num_examples: 491423
download_size: 11412316
dataset_size: 16135830
- config_name: dtp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13867373
num_examples: 433733
download_size: 9720699
dataset_size: 13867373
- config_name: dty
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8839082
num_examples: 246026
download_size: 4551845
dataset_size: 8839082
- config_name: dua
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 2631
num_examples: 87
download_size: 3877
dataset_size: 2631
- config_name: dv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 81396462
num_examples: 2103276
download_size: 45332104
dataset_size: 81396462
- config_name: dz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8590239
num_examples: 242196
download_size: 4406353
dataset_size: 8590239
- config_name: ee
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14377017
num_examples: 447208
download_size: 10136064
dataset_size: 14377017
- config_name: egl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13068224
num_examples: 413551
download_size: 9121776
dataset_size: 13068224
- config_name: el
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 32978562
num_examples: 592016
download_size: 19577876
dataset_size: 32978562
- config_name: eml
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14768563
num_examples: 458847
download_size: 10453636
dataset_size: 14768563
- config_name: en
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 6327454281
num_examples: 81801560
download_size: 4224231068
dataset_size: 6327454281
- config_name: en-ca
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 73305274
num_examples: 1909970
download_size: 53060194
dataset_size: 73305274
- config_name: en-gb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 115978412
num_examples: 2520405
download_size: 78924421
dataset_size: 115978412
- config_name: en-us
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14815
num_examples: 332
download_size: 9953
dataset_size: 14815
- config_name: eo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 256196064
num_examples: 6285304
download_size: 177219679
dataset_size: 256196064
- config_name: es
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 730214298
num_examples: 17233968
download_size: 514588069
dataset_size: 730214298
- config_name: es-419
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4355180
num_examples: 146476
download_size: 2659218
dataset_size: 4355180
- config_name: es-formal
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4280933
num_examples: 144717
download_size: 2592085
dataset_size: 4280933
- config_name: et
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 65123623
num_examples: 1820762
download_size: 48197302
dataset_size: 65123623
- config_name: eu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 290282374
num_examples: 7109758
download_size: 197889378
dataset_size: 290282374
- config_name: ext
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 223257222
num_examples: 5359047
download_size: 147078789
dataset_size: 223257222
- config_name: fa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 123727757
num_examples: 2142642
download_size: 65952114
dataset_size: 123727757
- config_name: ff
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14116652
num_examples: 440614
download_size: 9920388
dataset_size: 14116652
- config_name: fi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 286539944
num_examples: 6905698
download_size: 209916638
dataset_size: 286539944
- config_name: fit
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20217258
num_examples: 620391
download_size: 14566702
dataset_size: 20217258
- config_name: fj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14159041
num_examples: 441745
download_size: 9956108
dataset_size: 14159041
- config_name: fkv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4328482
num_examples: 145988
download_size: 2619845
dataset_size: 4328482
- config_name: fo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 24474476
num_examples: 731732
download_size: 17876981
dataset_size: 24474476
- config_name: fr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 774128723
num_examples: 17908351
download_size: 534489308
dataset_size: 774128723
- config_name: frc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 17896106
num_examples: 547258
download_size: 12953740
dataset_size: 17896106
- config_name: frp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 40902510
num_examples: 1191134
download_size: 29778105
dataset_size: 40902510
- config_name: frr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16979214
num_examples: 515350
download_size: 12069637
dataset_size: 16979214
- config_name: fur
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42077410
num_examples: 1221071
download_size: 30714082
dataset_size: 42077410
- config_name: ga
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 471527543
num_examples: 11524282
download_size: 320967189
dataset_size: 471527543
- config_name: gag
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14149375
num_examples: 440732
download_size: 9940551
dataset_size: 14149375
- config_name: gan
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 31572161
num_examples: 905186
download_size: 18909564
dataset_size: 31572161
- config_name: gan-hans
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 31004794
num_examples: 889875
download_size: 18566811
dataset_size: 31004794
- config_name: gan-hant
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4374444
num_examples: 147098
download_size: 2657182
dataset_size: 4374444
- config_name: gcr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4311409
num_examples: 145829
download_size: 2618211
dataset_size: 4311409
- config_name: gd
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 49316935
num_examples: 1429457
download_size: 36220978
dataset_size: 49316935
- config_name: gl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 289484839
num_examples: 7052226
download_size: 197315151
dataset_size: 289484839
- config_name: glk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8327018
num_examples: 249115
download_size: 4538325
dataset_size: 8327018
- config_name: gn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14212974
num_examples: 442765
download_size: 10004863
dataset_size: 14212974
- config_name: gom
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4584575
num_examples: 150273
download_size: 2780570
dataset_size: 4584575
- config_name: gom-deva
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8585678
num_examples: 242131
download_size: 4400578
dataset_size: 8585678
- config_name: gom-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12783006
num_examples: 405302
download_size: 8897342
dataset_size: 12783006
- config_name: gor
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14667616
num_examples: 454512
download_size: 10319196
dataset_size: 14667616
- config_name: got
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5432139
num_examples: 172951
download_size: 3435531
dataset_size: 5432139
- config_name: grc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4494817
num_examples: 149631
download_size: 2746170
dataset_size: 4494817
- config_name: gu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 23788894
num_examples: 486140
download_size: 10779200
dataset_size: 23788894
- config_name: guc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 1419
num_examples: 38
download_size: 3054
dataset_size: 1419
- config_name: guw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 118
num_examples: 4
download_size: 1864
dataset_size: 118
- config_name: gv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20683485
num_examples: 631005
download_size: 14894590
dataset_size: 20683485
- config_name: ha
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14716168
num_examples: 455836
download_size: 10421790
dataset_size: 14716168
- config_name: hak
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 6128644
num_examples: 193036
download_size: 3991729
dataset_size: 6128644
- config_name: haw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14158084
num_examples: 441511
download_size: 9952975
dataset_size: 14158084
- config_name: he
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 43629050
num_examples: 884809
download_size: 27221301
dataset_size: 43629050
- config_name: hi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 37237187
num_examples: 668964
download_size: 17804873
dataset_size: 37237187
- config_name: hif
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14457954
num_examples: 449009
download_size: 10166264
dataset_size: 14457954
- config_name: hif-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14519845
num_examples: 454037
download_size: 10240704
dataset_size: 14519845
- config_name: hil
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12928914
num_examples: 409962
download_size: 9009705
dataset_size: 12928914
- config_name: ho
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13950504
num_examples: 435857
download_size: 9790849
dataset_size: 13950504
- config_name: hr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 61272623
num_examples: 1720527
download_size: 45307411
dataset_size: 61272623
- config_name: hrx
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12869295
num_examples: 407823
download_size: 8964114
dataset_size: 12869295
- config_name: hsb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 23720349
num_examples: 707100
download_size: 17145693
dataset_size: 23720349
- config_name: ht
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16835529
num_examples: 509955
download_size: 11880404
dataset_size: 16835529
- config_name: hu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 85054175
num_examples: 2200589
download_size: 64143342
dataset_size: 85054175
- config_name: hu-formal
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252810
num_examples: 143986
download_size: 2567582
dataset_size: 4252810
- config_name: hy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 39339286
num_examples: 773925
download_size: 22108994
dataset_size: 39339286
- config_name: hyw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5443608
num_examples: 166902
download_size: 3238370
dataset_size: 5443608
- config_name: hz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13948574
num_examples: 435804
download_size: 9788697
dataset_size: 13948574
- config_name: ia
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 229143237
num_examples: 5616433
download_size: 155877454
dataset_size: 229143237
- config_name: id
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 95220928
num_examples: 2512331
download_size: 69525046
dataset_size: 95220928
- config_name: ie
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 225725262
num_examples: 5533032
download_size: 153371930
dataset_size: 225725262
- config_name: ig
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20109388
num_examples: 617044
download_size: 14475407
dataset_size: 20109388
- config_name: ii
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4310418
num_examples: 145332
download_size: 2609723
dataset_size: 4310418
- config_name: ik
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13989609
num_examples: 436958
download_size: 9823174
dataset_size: 13989609
- config_name: ike-cans
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4352278
num_examples: 146355
download_size: 2645174
dataset_size: 4352278
- config_name: ike-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13851135
num_examples: 432932
download_size: 9714057
dataset_size: 13851135
- config_name: ilo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15955483
num_examples: 480555
download_size: 11141942
dataset_size: 15955483
- config_name: inh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4634360
num_examples: 152226
download_size: 2831580
dataset_size: 4634360
- config_name: io
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 233656822
num_examples: 5757440
download_size: 159720058
dataset_size: 233656822
- config_name: is
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 51679396
num_examples: 1483610
download_size: 37965494
dataset_size: 51679396
- config_name: it
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 536601426
num_examples: 12631487
download_size: 375025347
dataset_size: 536601426
- config_name: iu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5360588
num_examples: 172215
download_size: 3402239
dataset_size: 5360588
- config_name: ja
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 140641579
num_examples: 2917962
download_size: 92145329
dataset_size: 140641579
- config_name: jam
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18849751
num_examples: 571777
download_size: 13684422
dataset_size: 18849751
- config_name: jbo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14301985
num_examples: 446512
download_size: 9994516
dataset_size: 14301985
- config_name: jv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 27232302
num_examples: 794181
download_size: 19651565
dataset_size: 27232302
- config_name: ka
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 24073345
num_examples: 399546
download_size: 11679979
dataset_size: 24073345
- config_name: kaa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14082184
num_examples: 439411
download_size: 9902820
dataset_size: 14082184
- config_name: kab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18459676
num_examples: 557857
download_size: 13384218
dataset_size: 18459676
- config_name: kbd
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4594409
num_examples: 149733
download_size: 2759503
dataset_size: 4594409
- config_name: kbd-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4417661
num_examples: 148017
download_size: 2687531
dataset_size: 4417661
- config_name: kbp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12873178
num_examples: 408039
download_size: 8965474
dataset_size: 12873178
- config_name: kea
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12793700
num_examples: 405901
download_size: 8896866
dataset_size: 12793700
- config_name: kg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 40949149
num_examples: 1193499
download_size: 29766747
dataset_size: 40949149
- config_name: khw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4308653
num_examples: 145279
download_size: 2608581
dataset_size: 4308653
- config_name: ki
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14056900
num_examples: 439015
download_size: 9875534
dataset_size: 14056900
- config_name: kj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13881723
num_examples: 433861
download_size: 9733715
dataset_size: 13881723
- config_name: kjp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8504302
num_examples: 240339
download_size: 4341523
dataset_size: 8504302
- config_name: kk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 19216115
num_examples: 428880
download_size: 11577682
dataset_size: 19216115
- config_name: kk-arab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 7241749
num_examples: 211731
download_size: 4487032
dataset_size: 7241749
- config_name: kk-kz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4937945
num_examples: 160027
download_size: 3062906
dataset_size: 4937945
- config_name: kk-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 22197825
num_examples: 677162
download_size: 16072332
dataset_size: 22197825
- config_name: kk-tr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20060635
num_examples: 616521
download_size: 14438929
dataset_size: 20060635
- config_name: ko
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 60335212
num_examples: 1364440
download_size: 39186630
dataset_size: 60335212
- config_name: ko-kp
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4338717
num_examples: 146150
download_size: 2630925
dataset_size: 4338717
- config_name: koi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4737590
num_examples: 155082
download_size: 2894674
dataset_size: 4737590
- config_name: kr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13886057
num_examples: 433990
download_size: 9737602
dataset_size: 13886057
- config_name: krc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4646136
num_examples: 151026
download_size: 2785454
dataset_size: 4646136
- config_name: kri
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12798530
num_examples: 406032
download_size: 8902330
dataset_size: 12798530
- config_name: krj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13850324
num_examples: 433444
download_size: 9703460
dataset_size: 13850324
- config_name: krl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12788020
num_examples: 405729
download_size: 8893337
dataset_size: 12788020
- config_name: ks
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4390604
num_examples: 147033
download_size: 2671069
dataset_size: 4390604
- config_name: ks-deva
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8567518
num_examples: 241832
download_size: 4387687
dataset_size: 8567518
- config_name: ksh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20394712
num_examples: 624523
download_size: 14698860
dataset_size: 20394712
- config_name: ku
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8037777
num_examples: 239515
download_size: 5306097
dataset_size: 8037777
- config_name: ku-arab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4577826
num_examples: 151290
download_size: 2796159
dataset_size: 4577826
- config_name: ku-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14683841
num_examples: 458802
download_size: 10371977
dataset_size: 14683841
- config_name: kum
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252739
num_examples: 143985
download_size: 2567503
dataset_size: 4252739
- config_name: kv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4946978
num_examples: 158888
download_size: 2997865
dataset_size: 4946978
- config_name: kw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20245535
num_examples: 621432
download_size: 14581378
dataset_size: 20245535
- config_name: ky
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8909613
num_examples: 235165
download_size: 5462115
dataset_size: 8909613
- config_name: la
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 299766395
num_examples: 7085082
download_size: 201477460
dataset_size: 299766395
- config_name: lad
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20336417
num_examples: 622775
download_size: 14653199
dataset_size: 20336417
- config_name: lb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 56473066
num_examples: 1601093
download_size: 41410732
dataset_size: 56473066
- config_name: lbe
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4501470
num_examples: 149898
download_size: 2744786
dataset_size: 4501470
- config_name: lez
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4890798
num_examples: 155936
download_size: 2959653
dataset_size: 4890798
- config_name: lfn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14709210
num_examples: 456719
download_size: 10408539
dataset_size: 14709210
- config_name: lg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13979286
num_examples: 436009
download_size: 9802779
dataset_size: 13979286
- config_name: li
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 43476868
num_examples: 1253970
download_size: 31750932
dataset_size: 43476868
- config_name: lij
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42327066
num_examples: 1227346
download_size: 30898971
dataset_size: 42327066
- config_name: liv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12781331
num_examples: 405236
download_size: 8895889
dataset_size: 12781331
- config_name: lki
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8039166
num_examples: 242526
download_size: 4363703
dataset_size: 8039166
- config_name: lld
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 90305
num_examples: 2634
download_size: 69672
dataset_size: 90305
- config_name: lmo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18287638
num_examples: 545398
download_size: 13130119
dataset_size: 18287638
- config_name: ln
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14123637
num_examples: 439731
download_size: 9915851
dataset_size: 14123637
- config_name: lo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 9905189
num_examples: 271710
download_size: 5313218
dataset_size: 9905189
- config_name: loz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13695602
num_examples: 428723
download_size: 9581113
dataset_size: 13695602
- config_name: lt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 39902419
num_examples: 1096727
download_size: 29185765
dataset_size: 39902419
- config_name: ltg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13884707
num_examples: 433453
download_size: 9736637
dataset_size: 13884707
- config_name: lus
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13695197
num_examples: 428712
download_size: 9580538
dataset_size: 13695197
- config_name: luz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8459036
num_examples: 253454
download_size: 4687414
dataset_size: 8459036
- config_name: lv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 27242119
num_examples: 764753
download_size: 19676667
dataset_size: 27242119
- config_name: lzh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 25067538
num_examples: 685152
download_size: 14998856
dataset_size: 25067538
- config_name: mdf
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4634268
num_examples: 152141
download_size: 2820744
dataset_size: 4634268
- config_name: mg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 43863002
num_examples: 1271074
download_size: 32016826
dataset_size: 43863002
- config_name: mh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13775721
num_examples: 431162
download_size: 9644397
dataset_size: 13775721
- config_name: mi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20857040
num_examples: 637118
download_size: 15060301
dataset_size: 20857040
- config_name: min
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 53044258
num_examples: 1464128
download_size: 38587450
dataset_size: 53044258
- config_name: mk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 24087229
num_examples: 449241
download_size: 12217912
dataset_size: 24087229
- config_name: ml
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 189266798
num_examples: 2664923
download_size: 71344031
dataset_size: 189266798
- config_name: mn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 9311543
num_examples: 219695
download_size: 5272784
dataset_size: 9311543
- config_name: mni
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8696893
num_examples: 243616
download_size: 4470994
dataset_size: 8696893
- config_name: mnw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8861861
num_examples: 244906
download_size: 4517726
dataset_size: 8861861
- config_name: mo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5377009
num_examples: 172144
download_size: 3405661
dataset_size: 5377009
- config_name: mr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 26855182
num_examples: 526220
download_size: 12358679
dataset_size: 26855182
- config_name: mrh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 68
num_examples: 2
download_size: 1820
dataset_size: 68
- config_name: mrj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5007903
num_examples: 160889
download_size: 3073431
dataset_size: 5007903
- config_name: ms
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 64674328
num_examples: 1803714
download_size: 47165217
dataset_size: 64674328
- config_name: ms-arab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 136496
num_examples: 2961
download_size: 92316
dataset_size: 136496
- config_name: mt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 22632686
num_examples: 682867
download_size: 16352572
dataset_size: 22632686
- config_name: mus
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14013416
num_examples: 437688
download_size: 9835239
dataset_size: 14013416
- config_name: mwl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14493299
num_examples: 448926
download_size: 10225888
dataset_size: 14493299
- config_name: my
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16182182
num_examples: 345096
download_size: 7981905
dataset_size: 16182182
- config_name: mzn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 17973941
num_examples: 447870
download_size: 9174617
dataset_size: 17973941
- config_name: na
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13992666
num_examples: 436956
download_size: 9823328
dataset_size: 13992666
- config_name: nah
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14490294
num_examples: 449748
download_size: 10192501
dataset_size: 14490294
- config_name: nan-hani
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 191
num_examples: 6
download_size: 1925
dataset_size: 191
- config_name: nap
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42362346
num_examples: 1229161
download_size: 30918265
dataset_size: 42362346
- config_name: nb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 142554768
num_examples: 3688026
download_size: 105549981
dataset_size: 142554768
- config_name: nds
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 58766114
num_examples: 1666813
download_size: 43421948
dataset_size: 58766114
- config_name: nds-nl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 44121756
num_examples: 1273149
download_size: 32201410
dataset_size: 44121756
- config_name: ne
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 11925386
num_examples: 295006
download_size: 6265232
dataset_size: 11925386
- config_name: new
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16906308
num_examples: 350362
download_size: 7680329
dataset_size: 16906308
- config_name: ng
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13870754
num_examples: 433582
download_size: 9723795
dataset_size: 13870754
- config_name: nia
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20649
num_examples: 515
download_size: 16535
dataset_size: 20649
- config_name: niu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12794247
num_examples: 405902
download_size: 8897260
dataset_size: 12794247
- config_name: nl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5016576732
num_examples: 61931959
download_size: 3380404239
dataset_size: 5016576732
- config_name: nn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 99997815
num_examples: 2708994
download_size: 74736304
dataset_size: 99997815
- config_name: 'no'
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 2934
num_examples: 64
download_size: 4108
dataset_size: 2934
- config_name: nod
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4322068
num_examples: 145566
download_size: 2618106
dataset_size: 4322068
- config_name: nov
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14150434
num_examples: 440903
download_size: 9947798
dataset_size: 14150434
- config_name: nqo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8094271
num_examples: 243184
download_size: 4398836
dataset_size: 8094271
- config_name: nrm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 41330956
num_examples: 1203295
download_size: 30084065
dataset_size: 41330956
- config_name: nso
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14178321
num_examples: 443205
download_size: 9959708
dataset_size: 14178321
- config_name: nv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15351770
num_examples: 455188
download_size: 10472240
dataset_size: 15351770
- config_name: ny
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13989813
num_examples: 436764
download_size: 9821588
dataset_size: 13989813
- config_name: nys
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13092059
num_examples: 413241
download_size: 9153100
dataset_size: 13092059
- config_name: oc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 266612548
num_examples: 6569770
download_size: 180156462
dataset_size: 266612548
- config_name: olo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13200388
num_examples: 416935
download_size: 9214968
dataset_size: 13200388
- config_name: om
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5476389
num_examples: 175314
download_size: 3496637
dataset_size: 5476389
- config_name: or
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 22798709
num_examples: 470237
download_size: 10322832
dataset_size: 22798709
- config_name: os
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5946062
num_examples: 177054
download_size: 3583703
dataset_size: 5946062
- config_name: ota
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8015024
num_examples: 241903
download_size: 4343478
dataset_size: 8015024
- config_name: pa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20505754
num_examples: 481522
download_size: 10552147
dataset_size: 20505754
- config_name: pam
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14527964
num_examples: 451253
download_size: 10242443
dataset_size: 14527964
- config_name: pap
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 54505401
num_examples: 1449881
download_size: 40415776
dataset_size: 54505401
- config_name: pcd
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42132826
num_examples: 1221362
download_size: 30766812
dataset_size: 42132826
- config_name: pdc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14435256
num_examples: 448055
download_size: 10178322
dataset_size: 14435256
- config_name: pdt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13994892
num_examples: 437200
download_size: 9819388
dataset_size: 13994892
- config_name: pfl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15461023
num_examples: 474198
download_size: 10893651
dataset_size: 15461023
- config_name: pi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8913354
num_examples: 250251
download_size: 4651392
dataset_size: 8913354
- config_name: pih
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13971081
num_examples: 436214
download_size: 9810653
dataset_size: 13971081
- config_name: pl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 426030491
num_examples: 10025139
download_size: 295767506
dataset_size: 426030491
- config_name: pms
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 51268512
num_examples: 1477043
download_size: 37698831
dataset_size: 51268512
- config_name: pnb
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16192682
num_examples: 409037
download_size: 9196626
dataset_size: 16192682
- config_name: pnt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4439173
num_examples: 148336
download_size: 2703117
dataset_size: 4439173
- config_name: prg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 17940420
num_examples: 544030
download_size: 12958482
dataset_size: 17940420
- config_name: ps
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8860902
num_examples: 259186
download_size: 4916502
dataset_size: 8860902
- config_name: pt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 491184040
num_examples: 11574568
download_size: 340831923
dataset_size: 491184040
- config_name: pt-br
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 318857431
num_examples: 7782980
download_size: 223442911
dataset_size: 318857431
- config_name: pwn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8500
num_examples: 269
download_size: 8738
dataset_size: 8500
- config_name: qu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15254702
num_examples: 468823
download_size: 10750388
dataset_size: 15254702
- config_name: quc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 32
num_examples: 1
download_size: 1772
dataset_size: 32
- config_name: qug
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13798264
num_examples: 431733
download_size: 9661685
dataset_size: 13798264
- config_name: rgn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 17001688
num_examples: 519871
download_size: 12258201
dataset_size: 17001688
- config_name: rif
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13792951
num_examples: 431588
download_size: 9657698
dataset_size: 13792951
- config_name: rm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 44450577
num_examples: 1284908
download_size: 32519630
dataset_size: 44450577
- config_name: rmc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 159
num_examples: 4
download_size: 1963
dataset_size: 159
- config_name: rmy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5610156
num_examples: 179191
download_size: 3608283
dataset_size: 5610156
- config_name: rn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13935534
num_examples: 435271
download_size: 9779486
dataset_size: 13935534
- config_name: ro
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 247469452
num_examples: 5878366
download_size: 177525205
dataset_size: 247469452
- config_name: roa-tara
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14425120
num_examples: 448972
download_size: 10152875
dataset_size: 14425120
- config_name: ru
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 405103215
num_examples: 7485811
download_size: 257215625
dataset_size: 405103215
- config_name: rue
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4953403
num_examples: 159530
download_size: 3037824
dataset_size: 4953403
- config_name: rup
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14459686
num_examples: 450345
download_size: 10198398
dataset_size: 14459686
- config_name: ruq-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4434290
num_examples: 148404
download_size: 2700920
dataset_size: 4434290
- config_name: ruq-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13783683
num_examples: 430978
download_size: 9656941
dataset_size: 13783683
- config_name: rw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14090196
num_examples: 439172
download_size: 9901257
dataset_size: 14090196
- config_name: rwr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8568706
num_examples: 241841
download_size: 4388475
dataset_size: 8568706
- config_name: ryu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 2852
num_examples: 82
download_size: 4237
dataset_size: 2852
- config_name: sa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 21404327
num_examples: 455674
download_size: 9692464
dataset_size: 21404327
- config_name: sat
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 10810040
num_examples: 284911
download_size: 5750917
dataset_size: 10810040
- config_name: sc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 47195572
num_examples: 1348137
download_size: 34521764
dataset_size: 47195572
- config_name: scn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 43458983
num_examples: 1259067
download_size: 31775157
dataset_size: 43458983
- config_name: sco
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 56960413
num_examples: 1611092
download_size: 41724559
dataset_size: 56960413
- config_name: sd
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14257513
num_examples: 363318
download_size: 7844047
dataset_size: 14257513
- config_name: sdc
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13975497
num_examples: 436913
download_size: 9800517
dataset_size: 13975497
- config_name: se
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 23962268
num_examples: 711439
download_size: 17409387
dataset_size: 23962268
- config_name: sei
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13827581
num_examples: 432520
download_size: 9684192
dataset_size: 13827581
- config_name: sg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13913524
num_examples: 434751
download_size: 9761739
dataset_size: 13913524
- config_name: sh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 30173635
num_examples: 746207
download_size: 20133594
dataset_size: 30173635
- config_name: shi-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13783218
num_examples: 430968
download_size: 9656828
dataset_size: 13783218
- config_name: shi-tfng
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4308577
num_examples: 145279
download_size: 2608525
dataset_size: 4308577
- config_name: shn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 10139002
num_examples: 260808
download_size: 4952168
dataset_size: 10139002
- config_name: shy-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4255322
num_examples: 144058
download_size: 2570625
dataset_size: 4255322
- config_name: si
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 7405400
num_examples: 189718
download_size: 4270591
dataset_size: 7405400
- config_name: sjd
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4300688
num_examples: 145047
download_size: 2604357
dataset_size: 4300688
- config_name: sje
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20970223
num_examples: 637639
download_size: 15120381
dataset_size: 20970223
- config_name: sju
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4315103
num_examples: 145655
download_size: 2620763
dataset_size: 4315103
- config_name: sk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 75586366
num_examples: 2050873
download_size: 54951330
dataset_size: 75586366
- config_name: skr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4274062
num_examples: 144443
download_size: 2585286
dataset_size: 4274062
- config_name: sl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 157883240
num_examples: 4112048
download_size: 118047353
dataset_size: 157883240
- config_name: sli
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13909208
num_examples: 434986
download_size: 9745964
dataset_size: 13909208
- config_name: sm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13984823
num_examples: 436830
download_size: 9817472
dataset_size: 13984823
- config_name: sma
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20653595
num_examples: 630437
download_size: 14902319
dataset_size: 20653595
- config_name: smj
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 19640206
num_examples: 604326
download_size: 14133964
dataset_size: 19640206
- config_name: smn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 10902411
num_examples: 337543
download_size: 7576850
dataset_size: 10902411
- config_name: sms
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4462345
num_examples: 149355
download_size: 2741038
dataset_size: 4462345
- config_name: sn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20116601
num_examples: 618231
download_size: 14463728
dataset_size: 20116601
- config_name: sq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 304708913
num_examples: 7311820
download_size: 225592169
dataset_size: 304708913
- config_name: sr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 52787253
num_examples: 1018361
download_size: 31364006
dataset_size: 52787253
- config_name: sr-ec
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 9237541
num_examples: 248556
download_size: 5875548
dataset_size: 9237541
- config_name: sr-el
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 48848162
num_examples: 1418824
download_size: 35859120
dataset_size: 48848162
- config_name: srq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12796525
num_examples: 405957
download_size: 8899493
dataset_size: 12796525
- config_name: ss
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13823630
num_examples: 432423
download_size: 9682165
dataset_size: 13823630
- config_name: st
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13938937
num_examples: 435419
download_size: 9785161
dataset_size: 13938937
- config_name: stq
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14484394
num_examples: 449885
download_size: 10228446
dataset_size: 14484394
- config_name: su
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20025826
num_examples: 583096
download_size: 14042822
dataset_size: 20025826
- config_name: sv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 339074900
num_examples: 8115455
download_size: 236022796
dataset_size: 339074900
- config_name: sw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 50612064
num_examples: 1465385
download_size: 37096369
dataset_size: 50612064
- config_name: szl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16772062
num_examples: 500107
download_size: 11868254
dataset_size: 16772062
- config_name: szy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4332021
num_examples: 146136
download_size: 2633271
dataset_size: 4332021
- config_name: ta
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 31251824
num_examples: 546558
download_size: 15157673
dataset_size: 31251824
- config_name: tay
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4345269
num_examples: 146938
download_size: 2632535
dataset_size: 4345269
- config_name: tcy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 8723594
num_examples: 244350
download_size: 4487471
dataset_size: 8723594
- config_name: te
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 27587665
num_examples: 569615
download_size: 13669398
dataset_size: 27587665
- config_name: tet
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15092299
num_examples: 466244
download_size: 10702917
dataset_size: 15092299
- config_name: tg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 12643125
num_examples: 304625
download_size: 7622522
dataset_size: 12643125
- config_name: tg-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4504034
num_examples: 149533
download_size: 2755000
dataset_size: 4504034
- config_name: tg-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 19845835
num_examples: 610020
download_size: 14264492
dataset_size: 19845835
- config_name: th
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 32693750
num_examples: 537447
download_size: 15849247
dataset_size: 32693750
- config_name: ti
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4366995
num_examples: 146479
download_size: 2648869
dataset_size: 4366995
- config_name: tk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5797050
num_examples: 184302
download_size: 3728802
dataset_size: 5797050
- config_name: tl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13661554
num_examples: 387377
download_size: 9456413
dataset_size: 13661554
- config_name: tly
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4309748
num_examples: 145312
download_size: 2609307
dataset_size: 4309748
- config_name: tly-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 35
num_examples: 1
download_size: 1793
dataset_size: 35
- config_name: tn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13936132
num_examples: 435219
download_size: 9780279
dataset_size: 13936132
- config_name: to
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13980327
num_examples: 436460
download_size: 9810650
dataset_size: 13980327
- config_name: tpi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14169019
num_examples: 442133
download_size: 9961827
dataset_size: 14169019
- config_name: tr
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 72134544
num_examples: 1770267
download_size: 51032484
dataset_size: 72134544
- config_name: tru
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5322844
num_examples: 171327
download_size: 3371105
dataset_size: 5322844
- config_name: trv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 94285
num_examples: 3109
download_size: 65138
dataset_size: 94285
- config_name: ts
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13943481
num_examples: 435408
download_size: 9783789
dataset_size: 13943481
- config_name: tt
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 24182976
num_examples: 548502
download_size: 14868166
dataset_size: 24182976
- config_name: tt-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4943914
num_examples: 158198
download_size: 3048932
dataset_size: 4943914
- config_name: tt-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13842972
num_examples: 432513
download_size: 9702714
dataset_size: 13842972
- config_name: tum
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13924159
num_examples: 435110
download_size: 9770501
dataset_size: 13924159
- config_name: tw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13830508
num_examples: 432669
download_size: 9688164
dataset_size: 13830508
- config_name: ty
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 16816401
num_examples: 507332
download_size: 12098154
dataset_size: 16816401
- config_name: tyv
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4583082
num_examples: 149929
download_size: 2779632
dataset_size: 4583082
- config_name: tzm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4253588
num_examples: 144002
download_size: 2569067
dataset_size: 4253588
- config_name: udm
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4854947
num_examples: 156300
download_size: 2958444
dataset_size: 4854947
- config_name: ug-arab
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4316690
num_examples: 145443
download_size: 2614962
dataset_size: 4316690
- config_name: ug-latn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13786474
num_examples: 431056
download_size: 9659723
dataset_size: 13786474
- config_name: uk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 251058352
num_examples: 5108733
download_size: 168140976
dataset_size: 251058352
- config_name: ur
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 57063750
num_examples: 987011
download_size: 28328459
dataset_size: 57063750
- config_name: uz
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 11731793
num_examples: 344615
download_size: 8102734
dataset_size: 11731793
- config_name: uz-cyrl
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4252574
num_examples: 143981
download_size: 2567325
dataset_size: 4252574
- config_name: ve
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 13932174
num_examples: 435216
download_size: 9777266
dataset_size: 13932174
- config_name: vec
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 52081230
num_examples: 1466867
download_size: 37307805
dataset_size: 52081230
- config_name: vep
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 6174898
num_examples: 192298
download_size: 3994582
dataset_size: 6174898
- config_name: vi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 246835524
num_examples: 5743737
download_size: 172949263
dataset_size: 246835524
- config_name: vls
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42789297
num_examples: 1239359
download_size: 31228294
dataset_size: 42789297
- config_name: vmf
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 18352990
num_examples: 555205
download_size: 13289296
dataset_size: 18352990
- config_name: vo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 228352533
num_examples: 5610875
download_size: 155496988
dataset_size: 228352533
- config_name: vot
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5406190
num_examples: 173486
download_size: 3439433
dataset_size: 5406190
- config_name: wa
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 49235347
num_examples: 1426584
download_size: 36167816
dataset_size: 49235347
- config_name: war
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 190306474
num_examples: 4449062
download_size: 133786270
dataset_size: 190306474
- config_name: wls
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4033
num_examples: 104
download_size: 5150
dataset_size: 4033
- config_name: wo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 40961626
num_examples: 1193626
download_size: 29778666
dataset_size: 40961626
- config_name: wuu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 40570130
num_examples: 1127741
download_size: 24209117
dataset_size: 40570130
- config_name: wya
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 28
num_examples: 1
download_size: 1740
dataset_size: 28
- config_name: xal
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4475344
num_examples: 149984
download_size: 2722459
dataset_size: 4475344
- config_name: xh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 20036194
num_examples: 615514
download_size: 14405310
dataset_size: 20036194
- config_name: xmf
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5943645
num_examples: 169507
download_size: 3418593
dataset_size: 5943645
- config_name: xsy
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4262789
num_examples: 144305
download_size: 2573349
dataset_size: 4262789
- config_name: yav
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4070
num_examples: 102
download_size: 4718
dataset_size: 4070
- config_name: yi
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 5495313
num_examples: 170277
download_size: 3373820
dataset_size: 5495313
- config_name: yo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 25424749
num_examples: 724345
download_size: 18086773
dataset_size: 25424749
- config_name: za
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15159230
num_examples: 365892
download_size: 7774767
dataset_size: 15159230
- config_name: zea
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 14538518
num_examples: 451577
download_size: 10262897
dataset_size: 14538518
- config_name: zgh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 4253917
num_examples: 144006
download_size: 2569373
dataset_size: 4253917
- config_name: zh
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 264353677
num_examples: 5424320
download_size: 174420118
dataset_size: 264353677
- config_name: zh-cn
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 42868611
num_examples: 1158755
download_size: 27243799
dataset_size: 42868611
- config_name: zh-hans
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 57233156
num_examples: 1483225
download_size: 36583522
dataset_size: 57233156
- config_name: zh-hant
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 53502814
num_examples: 1356560
download_size: 36755083
dataset_size: 53502814
- config_name: zh-hk
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 15325323
num_examples: 408391
download_size: 10455809
dataset_size: 15325323
- config_name: zh-mo
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 6568267
num_examples: 180950
download_size: 3547260
dataset_size: 6568267
- config_name: zh-my
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 32637498
num_examples: 916876
download_size: 19289581
dataset_size: 32637498
- config_name: zh-sg
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 35325327
num_examples: 979652
download_size: 21150070
dataset_size: 35325327
- config_name: zh-tw
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 17500668
num_examples: 443057
download_size: 11121104
dataset_size: 17500668
- config_name: zh-yue
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 1352
num_examples: 30
download_size: 2963
dataset_size: 1352
- config_name: zu
features:
- name: wikidata_id
dtype: string
- name: lastrevid
dtype: int64
- name: label
dtype: string
splits:
- name: label
num_bytes: 47349379
num_examples: 1380550
download_size: 34649660
dataset_size: 47349379
configs:
- config_name: aa
data_files:
- split: label
path: aa/label-*
- config_name: ab
data_files:
- split: label
path: ab/label-*
- config_name: abs
data_files:
- split: label
path: abs/label-*
- config_name: ace
data_files:
- split: label
path: ace/label-*
- config_name: ady
data_files:
- split: label
path: ady/label-*
- config_name: ady-cyrl
data_files:
- split: label
path: ady-cyrl/label-*
- config_name: aeb
data_files:
- split: label
path: aeb/label-*
- config_name: aeb-arab
data_files:
- split: label
path: aeb-arab/label-*
- config_name: aeb-latn
data_files:
- split: label
path: aeb-latn/label-*
- config_name: af
data_files:
- split: label
path: af/label-*
- config_name: agq
data_files:
- split: label
path: agq/label-*
- config_name: ak
data_files:
- split: label
path: ak/label-*
- config_name: aln
data_files:
- split: label
path: aln/label-*
- config_name: als
data_files:
- split: label
path: als/label-*
- config_name: alt
data_files:
- split: label
path: alt/label-*
- config_name: am
data_files:
- split: label
path: am/label-*
- config_name: ami
data_files:
- split: label
path: ami/label-*
- config_name: an
data_files:
- split: label
path: an/label-*
- config_name: ang
data_files:
- split: label
path: ang/label-*
- config_name: anp
data_files:
- split: label
path: anp/label-*
- config_name: ar
data_files:
- split: label
path: ar/label-*
- config_name: arc
data_files:
- split: label
path: arc/label-*
- config_name: arn
data_files:
- split: label
path: arn/label-*
- config_name: arq
data_files:
- split: label
path: arq/label-*
- config_name: ary
data_files:
- split: label
path: ary/label-*
- config_name: arz
data_files:
- split: label
path: arz/label-*
- config_name: as
data_files:
- split: label
path: as/label-*
- config_name: ase
data_files:
- split: label
path: ase/label-*
- config_name: ast
data_files:
- split: label
path: ast/label-*
- config_name: atj
data_files:
- split: label
path: atj/label-*
- config_name: av
data_files:
- split: label
path: av/label-*
- config_name: avk
data_files:
- split: label
path: avk/label-*
- config_name: awa
data_files:
- split: label
path: awa/label-*
- config_name: ay
data_files:
- split: label
path: ay/label-*
- config_name: az
data_files:
- split: label
path: az/label-*
- config_name: azb
data_files:
- split: label
path: azb/label-*
- config_name: ba
data_files:
- split: label
path: ba/label-*
- config_name: ban
data_files:
- split: label
path: ban/label-*
- config_name: ban-bali
data_files:
- split: label
path: ban-bali/label-*
- config_name: bar
data_files:
- split: label
path: bar/label-*
- config_name: bbc
data_files:
- split: label
path: bbc/label-*
- config_name: bcc
data_files:
- split: label
path: bcc/label-*
- config_name: be
data_files:
- split: label
path: be/label-*
- config_name: be-tarask
data_files:
- split: label
path: be-tarask/label-*
- config_name: bg
data_files:
- split: label
path: bg/label-*
- config_name: bgn
data_files:
- split: label
path: bgn/label-*
- config_name: bi
data_files:
- split: label
path: bi/label-*
- config_name: bjn
data_files:
- split: label
path: bjn/label-*
- config_name: bm
data_files:
- split: label
path: bm/label-*
- config_name: bn
data_files:
- split: label
path: bn/label-*
- config_name: bo
data_files:
- split: label
path: bo/label-*
- config_name: bpy
data_files:
- split: label
path: bpy/label-*
- config_name: bqi
data_files:
- split: label
path: bqi/label-*
- config_name: br
data_files:
- split: label
path: br/label-*
- config_name: brh
data_files:
- split: label
path: brh/label-*
- config_name: bs
data_files:
- split: label
path: bs/label-*
- config_name: btm
data_files:
- split: label
path: btm/label-*
- config_name: bto
data_files:
- split: label
path: bto/label-*
- config_name: bug
data_files:
- split: label
path: bug/label-*
- config_name: bxr
data_files:
- split: label
path: bxr/label-*
- config_name: ca
data_files:
- split: label
path: ca/label-*
- config_name: cbk-zam
data_files:
- split: label
path: cbk-zam/label-*
- config_name: cdo
data_files:
- split: label
path: cdo/label-*
- config_name: ce
data_files:
- split: label
path: ce/label-*
- config_name: ceb
data_files:
- split: label
path: ceb/label-*
- config_name: ch
data_files:
- split: label
path: ch/label-*
- config_name: cho
data_files:
- split: label
path: cho/label-*
- config_name: chr
data_files:
- split: label
path: chr/label-*
- config_name: chy
data_files:
- split: label
path: chy/label-*
- config_name: ckb
data_files:
- split: label
path: ckb/label-*
- config_name: co
data_files:
- split: label
path: co/label-*
- config_name: cps
data_files:
- split: label
path: cps/label-*
- config_name: cr
data_files:
- split: label
path: cr/label-*
- config_name: crh
data_files:
- split: label
path: crh/label-*
- config_name: crh-cyrl
data_files:
- split: label
path: crh-cyrl/label-*
- config_name: crh-latn
data_files:
- split: label
path: crh-latn/label-*
- config_name: cs
data_files:
- split: label
path: cs/label-*
- config_name: csb
data_files:
- split: label
path: csb/label-*
- config_name: cv
data_files:
- split: label
path: cv/label-*
- config_name: cy
data_files:
- split: label
path: cy/label-*
- config_name: da
data_files:
- split: label
path: da/label-*
- config_name: dag
data_files:
- split: label
path: dag/label-*
- config_name: de
data_files:
- split: label
path: de/label-*
- config_name: de-at
data_files:
- split: label
path: de-at/label-*
- config_name: de-ch
data_files:
- split: label
path: de-ch/label-*
- config_name: de-formal
data_files:
- split: label
path: de-formal/label-*
- config_name: din
data_files:
- split: label
path: din/label-*
- config_name: diq
data_files:
- split: label
path: diq/label-*
- config_name: dsb
data_files:
- split: label
path: dsb/label-*
- config_name: dtp
data_files:
- split: label
path: dtp/label-*
- config_name: dty
data_files:
- split: label
path: dty/label-*
- config_name: dua
data_files:
- split: label
path: dua/label-*
- config_name: dv
data_files:
- split: label
path: dv/label-*
- config_name: dz
data_files:
- split: label
path: dz/label-*
- config_name: ee
data_files:
- split: label
path: ee/label-*
- config_name: egl
data_files:
- split: label
path: egl/label-*
- config_name: el
data_files:
- split: label
path: el/label-*
- config_name: eml
data_files:
- split: label
path: eml/label-*
- config_name: en
data_files:
- split: label
path: en/label-*
default: true
- config_name: en-ca
data_files:
- split: label
path: en-ca/label-*
- config_name: en-gb
data_files:
- split: label
path: en-gb/label-*
- config_name: en-us
data_files:
- split: label
path: en-us/label-*
- config_name: eo
data_files:
- split: label
path: eo/label-*
- config_name: es
data_files:
- split: label
path: es/label-*
- config_name: es-419
data_files:
- split: label
path: es-419/label-*
- config_name: es-formal
data_files:
- split: label
path: es-formal/label-*
- config_name: et
data_files:
- split: label
path: et/label-*
- config_name: eu
data_files:
- split: label
path: eu/label-*
- config_name: ext
data_files:
- split: label
path: ext/label-*
- config_name: fa
data_files:
- split: label
path: fa/label-*
- config_name: ff
data_files:
- split: label
path: ff/label-*
- config_name: fi
data_files:
- split: label
path: fi/label-*
- config_name: fit
data_files:
- split: label
path: fit/label-*
- config_name: fj
data_files:
- split: label
path: fj/label-*
- config_name: fkv
data_files:
- split: label
path: fkv/label-*
- config_name: fo
data_files:
- split: label
path: fo/label-*
- config_name: fr
data_files:
- split: label
path: fr/label-*
- config_name: frc
data_files:
- split: label
path: frc/label-*
- config_name: frp
data_files:
- split: label
path: frp/label-*
- config_name: frr
data_files:
- split: label
path: frr/label-*
- config_name: fur
data_files:
- split: label
path: fur/label-*
- config_name: ga
data_files:
- split: label
path: ga/label-*
- config_name: gag
data_files:
- split: label
path: gag/label-*
- config_name: gan
data_files:
- split: label
path: gan/label-*
- config_name: gan-hans
data_files:
- split: label
path: gan-hans/label-*
- config_name: gan-hant
data_files:
- split: label
path: gan-hant/label-*
- config_name: gcr
data_files:
- split: label
path: gcr/label-*
- config_name: gd
data_files:
- split: label
path: gd/label-*
- config_name: gl
data_files:
- split: label
path: gl/label-*
- config_name: glk
data_files:
- split: label
path: glk/label-*
- config_name: gn
data_files:
- split: label
path: gn/label-*
- config_name: gom
data_files:
- split: label
path: gom/label-*
- config_name: gom-deva
data_files:
- split: label
path: gom-deva/label-*
- config_name: gom-latn
data_files:
- split: label
path: gom-latn/label-*
- config_name: gor
data_files:
- split: label
path: gor/label-*
- config_name: got
data_files:
- split: label
path: got/label-*
- config_name: grc
data_files:
- split: label
path: grc/label-*
- config_name: gu
data_files:
- split: label
path: gu/label-*
- config_name: guc
data_files:
- split: label
path: guc/label-*
- config_name: guw
data_files:
- split: label
path: guw/label-*
- config_name: gv
data_files:
- split: label
path: gv/label-*
- config_name: ha
data_files:
- split: label
path: ha/label-*
- config_name: hak
data_files:
- split: label
path: hak/label-*
- config_name: haw
data_files:
- split: label
path: haw/label-*
- config_name: he
data_files:
- split: label
path: he/label-*
- config_name: hi
data_files:
- split: label
path: hi/label-*
- config_name: hif
data_files:
- split: label
path: hif/label-*
- config_name: hif-latn
data_files:
- split: label
path: hif-latn/label-*
- config_name: hil
data_files:
- split: label
path: hil/label-*
- config_name: ho
data_files:
- split: label
path: ho/label-*
- config_name: hr
data_files:
- split: label
path: hr/label-*
- config_name: hrx
data_files:
- split: label
path: hrx/label-*
- config_name: hsb
data_files:
- split: label
path: hsb/label-*
- config_name: ht
data_files:
- split: label
path: ht/label-*
- config_name: hu
data_files:
- split: label
path: hu/label-*
- config_name: hu-formal
data_files:
- split: label
path: hu-formal/label-*
- config_name: hy
data_files:
- split: label
path: hy/label-*
- config_name: hyw
data_files:
- split: label
path: hyw/label-*
- config_name: hz
data_files:
- split: label
path: hz/label-*
- config_name: ia
data_files:
- split: label
path: ia/label-*
- config_name: id
data_files:
- split: label
path: id/label-*
- config_name: ie
data_files:
- split: label
path: ie/label-*
- config_name: ig
data_files:
- split: label
path: ig/label-*
- config_name: ii
data_files:
- split: label
path: ii/label-*
- config_name: ik
data_files:
- split: label
path: ik/label-*
- config_name: ike-cans
data_files:
- split: label
path: ike-cans/label-*
- config_name: ike-latn
data_files:
- split: label
path: ike-latn/label-*
- config_name: ilo
data_files:
- split: label
path: ilo/label-*
- config_name: inh
data_files:
- split: label
path: inh/label-*
- config_name: io
data_files:
- split: label
path: io/label-*
- config_name: is
data_files:
- split: label
path: is/label-*
- config_name: it
data_files:
- split: label
path: it/label-*
- config_name: iu
data_files:
- split: label
path: iu/label-*
- config_name: ja
data_files:
- split: label
path: ja/label-*
- config_name: jam
data_files:
- split: label
path: jam/label-*
- config_name: jbo
data_files:
- split: label
path: jbo/label-*
- config_name: jv
data_files:
- split: label
path: jv/label-*
- config_name: ka
data_files:
- split: label
path: ka/label-*
- config_name: kaa
data_files:
- split: label
path: kaa/label-*
- config_name: kab
data_files:
- split: label
path: kab/label-*
- config_name: kbd
data_files:
- split: label
path: kbd/label-*
- config_name: kbd-cyrl
data_files:
- split: label
path: kbd-cyrl/label-*
- config_name: kbp
data_files:
- split: label
path: kbp/label-*
- config_name: kea
data_files:
- split: label
path: kea/label-*
- config_name: kg
data_files:
- split: label
path: kg/label-*
- config_name: khw
data_files:
- split: label
path: khw/label-*
- config_name: ki
data_files:
- split: label
path: ki/label-*
- config_name: kj
data_files:
- split: label
path: kj/label-*
- config_name: kjp
data_files:
- split: label
path: kjp/label-*
- config_name: kk
data_files:
- split: label
path: kk/label-*
- config_name: kk-arab
data_files:
- split: label
path: kk-arab/label-*
- config_name: kk-kz
data_files:
- split: label
path: kk-kz/label-*
- config_name: kk-latn
data_files:
- split: label
path: kk-latn/label-*
- config_name: kk-tr
data_files:
- split: label
path: kk-tr/label-*
- config_name: ko
data_files:
- split: label
path: ko/label-*
- config_name: ko-kp
data_files:
- split: label
path: ko-kp/label-*
- config_name: koi
data_files:
- split: label
path: koi/label-*
- config_name: kr
data_files:
- split: label
path: kr/label-*
- config_name: krc
data_files:
- split: label
path: krc/label-*
- config_name: kri
data_files:
- split: label
path: kri/label-*
- config_name: krj
data_files:
- split: label
path: krj/label-*
- config_name: krl
data_files:
- split: label
path: krl/label-*
- config_name: ks
data_files:
- split: label
path: ks/label-*
- config_name: ks-deva
data_files:
- split: label
path: ks-deva/label-*
- config_name: ksh
data_files:
- split: label
path: ksh/label-*
- config_name: ku
data_files:
- split: label
path: ku/label-*
- config_name: ku-arab
data_files:
- split: label
path: ku-arab/label-*
- config_name: ku-latn
data_files:
- split: label
path: ku-latn/label-*
- config_name: kum
data_files:
- split: label
path: kum/label-*
- config_name: kv
data_files:
- split: label
path: kv/label-*
- config_name: kw
data_files:
- split: label
path: kw/label-*
- config_name: ky
data_files:
- split: label
path: ky/label-*
- config_name: la
data_files:
- split: label
path: la/label-*
- config_name: lad
data_files:
- split: label
path: lad/label-*
- config_name: lb
data_files:
- split: label
path: lb/label-*
- config_name: lbe
data_files:
- split: label
path: lbe/label-*
- config_name: lez
data_files:
- split: label
path: lez/label-*
- config_name: lfn
data_files:
- split: label
path: lfn/label-*
- config_name: lg
data_files:
- split: label
path: lg/label-*
- config_name: li
data_files:
- split: label
path: li/label-*
- config_name: lij
data_files:
- split: label
path: lij/label-*
- config_name: liv
data_files:
- split: label
path: liv/label-*
- config_name: lki
data_files:
- split: label
path: lki/label-*
- config_name: lld
data_files:
- split: label
path: lld/label-*
- config_name: lmo
data_files:
- split: label
path: lmo/label-*
- config_name: ln
data_files:
- split: label
path: ln/label-*
- config_name: lo
data_files:
- split: label
path: lo/label-*
- config_name: loz
data_files:
- split: label
path: loz/label-*
- config_name: lt
data_files:
- split: label
path: lt/label-*
- config_name: ltg
data_files:
- split: label
path: ltg/label-*
- config_name: lus
data_files:
- split: label
path: lus/label-*
- config_name: luz
data_files:
- split: label
path: luz/label-*
- config_name: lv
data_files:
- split: label
path: lv/label-*
- config_name: lzh
data_files:
- split: label
path: lzh/label-*
- config_name: mdf
data_files:
- split: label
path: mdf/label-*
- config_name: mg
data_files:
- split: label
path: mg/label-*
- config_name: mh
data_files:
- split: label
path: mh/label-*
- config_name: mi
data_files:
- split: label
path: mi/label-*
- config_name: min
data_files:
- split: label
path: min/label-*
- config_name: mk
data_files:
- split: label
path: mk/label-*
- config_name: ml
data_files:
- split: label
path: ml/label-*
- config_name: mn
data_files:
- split: label
path: mn/label-*
- config_name: mni
data_files:
- split: label
path: mni/label-*
- config_name: mnw
data_files:
- split: label
path: mnw/label-*
- config_name: mo
data_files:
- split: label
path: mo/label-*
- config_name: mr
data_files:
- split: label
path: mr/label-*
- config_name: mrh
data_files:
- split: label
path: mrh/label-*
- config_name: mrj
data_files:
- split: label
path: mrj/label-*
- config_name: ms
data_files:
- split: label
path: ms/label-*
- config_name: ms-arab
data_files:
- split: label
path: ms-arab/label-*
- config_name: mt
data_files:
- split: label
path: mt/label-*
- config_name: mus
data_files:
- split: label
path: mus/label-*
- config_name: mwl
data_files:
- split: label
path: mwl/label-*
- config_name: my
data_files:
- split: label
path: my/label-*
- config_name: mzn
data_files:
- split: label
path: mzn/label-*
- config_name: na
data_files:
- split: label
path: na/label-*
- config_name: nah
data_files:
- split: label
path: nah/label-*
- config_name: nan-hani
data_files:
- split: label
path: nan-hani/label-*
- config_name: nap
data_files:
- split: label
path: nap/label-*
- config_name: nb
data_files:
- split: label
path: nb/label-*
- config_name: nds
data_files:
- split: label
path: nds/label-*
- config_name: nds-nl
data_files:
- split: label
path: nds-nl/label-*
- config_name: ne
data_files:
- split: label
path: ne/label-*
- config_name: new
data_files:
- split: label
path: new/label-*
- config_name: ng
data_files:
- split: label
path: ng/label-*
- config_name: nia
data_files:
- split: label
path: nia/label-*
- config_name: niu
data_files:
- split: label
path: niu/label-*
- config_name: nl
data_files:
- split: label
path: nl/label-*
- config_name: nn
data_files:
- split: label
path: nn/label-*
- config_name: 'no'
data_files:
- split: label
path: no/label-*
- config_name: nod
data_files:
- split: label
path: nod/label-*
- config_name: nov
data_files:
- split: label
path: nov/label-*
- config_name: nqo
data_files:
- split: label
path: nqo/label-*
- config_name: nrm
data_files:
- split: label
path: nrm/label-*
- config_name: nso
data_files:
- split: label
path: nso/label-*
- config_name: nv
data_files:
- split: label
path: nv/label-*
- config_name: ny
data_files:
- split: label
path: ny/label-*
- config_name: nys
data_files:
- split: label
path: nys/label-*
- config_name: oc
data_files:
- split: label
path: oc/label-*
- config_name: olo
data_files:
- split: label
path: olo/label-*
- config_name: om
data_files:
- split: label
path: om/label-*
- config_name: or
data_files:
- split: label
path: or/label-*
- config_name: os
data_files:
- split: label
path: os/label-*
- config_name: ota
data_files:
- split: label
path: ota/label-*
- config_name: pa
data_files:
- split: label
path: pa/label-*
- config_name: pam
data_files:
- split: label
path: pam/label-*
- config_name: pap
data_files:
- split: label
path: pap/label-*
- config_name: pcd
data_files:
- split: label
path: pcd/label-*
- config_name: pdc
data_files:
- split: label
path: pdc/label-*
- config_name: pdt
data_files:
- split: label
path: pdt/label-*
- config_name: pfl
data_files:
- split: label
path: pfl/label-*
- config_name: pi
data_files:
- split: label
path: pi/label-*
- config_name: pih
data_files:
- split: label
path: pih/label-*
- config_name: pl
data_files:
- split: label
path: pl/label-*
- config_name: pms
data_files:
- split: label
path: pms/label-*
- config_name: pnb
data_files:
- split: label
path: pnb/label-*
- config_name: pnt
data_files:
- split: label
path: pnt/label-*
- config_name: prg
data_files:
- split: label
path: prg/label-*
- config_name: ps
data_files:
- split: label
path: ps/label-*
- config_name: pt
data_files:
- split: label
path: pt/label-*
- config_name: pt-br
data_files:
- split: label
path: pt-br/label-*
- config_name: pwn
data_files:
- split: label
path: pwn/label-*
- config_name: qu
data_files:
- split: label
path: qu/label-*
- config_name: quc
data_files:
- split: label
path: quc/label-*
- config_name: qug
data_files:
- split: label
path: qug/label-*
- config_name: rgn
data_files:
- split: label
path: rgn/label-*
- config_name: rif
data_files:
- split: label
path: rif/label-*
- config_name: rm
data_files:
- split: label
path: rm/label-*
- config_name: rmc
data_files:
- split: label
path: rmc/label-*
- config_name: rmy
data_files:
- split: label
path: rmy/label-*
- config_name: rn
data_files:
- split: label
path: rn/label-*
- config_name: ro
data_files:
- split: label
path: ro/label-*
- config_name: roa-tara
data_files:
- split: label
path: roa-tara/label-*
- config_name: ru
data_files:
- split: label
path: ru/label-*
- config_name: rue
data_files:
- split: label
path: rue/label-*
- config_name: rup
data_files:
- split: label
path: rup/label-*
- config_name: ruq-cyrl
data_files:
- split: label
path: ruq-cyrl/label-*
- config_name: ruq-latn
data_files:
- split: label
path: ruq-latn/label-*
- config_name: rw
data_files:
- split: label
path: rw/label-*
- config_name: rwr
data_files:
- split: label
path: rwr/label-*
- config_name: ryu
data_files:
- split: label
path: ryu/label-*
- config_name: sa
data_files:
- split: label
path: sa/label-*
- config_name: sat
data_files:
- split: label
path: sat/label-*
- config_name: sc
data_files:
- split: label
path: sc/label-*
- config_name: scn
data_files:
- split: label
path: scn/label-*
- config_name: sco
data_files:
- split: label
path: sco/label-*
- config_name: sd
data_files:
- split: label
path: sd/label-*
- config_name: sdc
data_files:
- split: label
path: sdc/label-*
- config_name: se
data_files:
- split: label
path: se/label-*
- config_name: sei
data_files:
- split: label
path: sei/label-*
- config_name: sg
data_files:
- split: label
path: sg/label-*
- config_name: sh
data_files:
- split: label
path: sh/label-*
- config_name: shi-latn
data_files:
- split: label
path: shi-latn/label-*
- config_name: shi-tfng
data_files:
- split: label
path: shi-tfng/label-*
- config_name: shn
data_files:
- split: label
path: shn/label-*
- config_name: shy-latn
data_files:
- split: label
path: shy-latn/label-*
- config_name: si
data_files:
- split: label
path: si/label-*
- config_name: sjd
data_files:
- split: label
path: sjd/label-*
- config_name: sje
data_files:
- split: label
path: sje/label-*
- config_name: sju
data_files:
- split: label
path: sju/label-*
- config_name: sk
data_files:
- split: label
path: sk/label-*
- config_name: skr
data_files:
- split: label
path: skr/label-*
- config_name: sl
data_files:
- split: label
path: sl/label-*
- config_name: sli
data_files:
- split: label
path: sli/label-*
- config_name: sm
data_files:
- split: label
path: sm/label-*
- config_name: sma
data_files:
- split: label
path: sma/label-*
- config_name: smj
data_files:
- split: label
path: smj/label-*
- config_name: smn
data_files:
- split: label
path: smn/label-*
- config_name: sms
data_files:
- split: label
path: sms/label-*
- config_name: sn
data_files:
- split: label
path: sn/label-*
- config_name: sq
data_files:
- split: label
path: sq/label-*
- config_name: sr
data_files:
- split: label
path: sr/label-*
- config_name: sr-ec
data_files:
- split: label
path: sr-ec/label-*
- config_name: sr-el
data_files:
- split: label
path: sr-el/label-*
- config_name: srq
data_files:
- split: label
path: srq/label-*
- config_name: ss
data_files:
- split: label
path: ss/label-*
- config_name: st
data_files:
- split: label
path: st/label-*
- config_name: stq
data_files:
- split: label
path: stq/label-*
- config_name: su
data_files:
- split: label
path: su/label-*
- config_name: sv
data_files:
- split: label
path: sv/label-*
- config_name: sw
data_files:
- split: label
path: sw/label-*
- config_name: szl
data_files:
- split: label
path: szl/label-*
- config_name: szy
data_files:
- split: label
path: szy/label-*
- config_name: ta
data_files:
- split: label
path: ta/label-*
- config_name: tay
data_files:
- split: label
path: tay/label-*
- config_name: tcy
data_files:
- split: label
path: tcy/label-*
- config_name: te
data_files:
- split: label
path: te/label-*
- config_name: tet
data_files:
- split: label
path: tet/label-*
- config_name: tg
data_files:
- split: label
path: tg/label-*
- config_name: tg-cyrl
data_files:
- split: label
path: tg-cyrl/label-*
- config_name: tg-latn
data_files:
- split: label
path: tg-latn/label-*
- config_name: th
data_files:
- split: label
path: th/label-*
- config_name: ti
data_files:
- split: label
path: ti/label-*
- config_name: tk
data_files:
- split: label
path: tk/label-*
- config_name: tl
data_files:
- split: label
path: tl/label-*
- config_name: tly
data_files:
- split: label
path: tly/label-*
- config_name: tly-cyrl
data_files:
- split: label
path: tly-cyrl/label-*
- config_name: tn
data_files:
- split: label
path: tn/label-*
- config_name: to
data_files:
- split: label
path: to/label-*
- config_name: tpi
data_files:
- split: label
path: tpi/label-*
- config_name: tr
data_files:
- split: label
path: tr/label-*
- config_name: tru
data_files:
- split: label
path: tru/label-*
- config_name: trv
data_files:
- split: label
path: trv/label-*
- config_name: ts
data_files:
- split: label
path: ts/label-*
- config_name: tt
data_files:
- split: label
path: tt/label-*
- config_name: tt-cyrl
data_files:
- split: label
path: tt-cyrl/label-*
- config_name: tt-latn
data_files:
- split: label
path: tt-latn/label-*
- config_name: tum
data_files:
- split: label
path: tum/label-*
- config_name: tw
data_files:
- split: label
path: tw/label-*
- config_name: ty
data_files:
- split: label
path: ty/label-*
- config_name: tyv
data_files:
- split: label
path: tyv/label-*
- config_name: tzm
data_files:
- split: label
path: tzm/label-*
- config_name: udm
data_files:
- split: label
path: udm/label-*
- config_name: ug-arab
data_files:
- split: label
path: ug-arab/label-*
- config_name: ug-latn
data_files:
- split: label
path: ug-latn/label-*
- config_name: uk
data_files:
- split: label
path: uk/label-*
- config_name: ur
data_files:
- split: label
path: ur/label-*
- config_name: uz
data_files:
- split: label
path: uz/label-*
- config_name: uz-cyrl
data_files:
- split: label
path: uz-cyrl/label-*
- config_name: ve
data_files:
- split: label
path: ve/label-*
- config_name: vec
data_files:
- split: label
path: vec/label-*
- config_name: vep
data_files:
- split: label
path: vep/label-*
- config_name: vi
data_files:
- split: label
path: vi/label-*
- config_name: vls
data_files:
- split: label
path: vls/label-*
- config_name: vmf
data_files:
- split: label
path: vmf/label-*
- config_name: vo
data_files:
- split: label
path: vo/label-*
- config_name: vot
data_files:
- split: label
path: vot/label-*
- config_name: wa
data_files:
- split: label
path: wa/label-*
- config_name: war
data_files:
- split: label
path: war/label-*
- config_name: wls
data_files:
- split: label
path: wls/label-*
- config_name: wo
data_files:
- split: label
path: wo/label-*
- config_name: wuu
data_files:
- split: label
path: wuu/label-*
- config_name: wya
data_files:
- split: label
path: wya/label-*
- config_name: xal
data_files:
- split: label
path: xal/label-*
- config_name: xh
data_files:
- split: label
path: xh/label-*
- config_name: xmf
data_files:
- split: label
path: xmf/label-*
- config_name: xsy
data_files:
- split: label
path: xsy/label-*
- config_name: yav
data_files:
- split: label
path: yav/label-*
- config_name: yi
data_files:
- split: label
path: yi/label-*
- config_name: yo
data_files:
- split: label
path: yo/label-*
- config_name: za
data_files:
- split: label
path: za/label-*
- config_name: zea
data_files:
- split: label
path: zea/label-*
- config_name: zgh
data_files:
- split: label
path: zgh/label-*
- config_name: zh
data_files:
- split: label
path: zh/label-*
- config_name: zh-cn
data_files:
- split: label
path: zh-cn/label-*
- config_name: zh-hans
data_files:
- split: label
path: zh-hans/label-*
- config_name: zh-hant
data_files:
- split: label
path: zh-hant/label-*
- config_name: zh-hk
data_files:
- split: label
path: zh-hk/label-*
- config_name: zh-mo
data_files:
- split: label
path: zh-mo/label-*
- config_name: zh-my
data_files:
- split: label
path: zh-my/label-*
- config_name: zh-sg
data_files:
- split: label
path: zh-sg/label-*
- config_name: zh-tw
data_files:
- split: label
path: zh-tw/label-*
- config_name: zh-yue
data_files:
- split: label
path: zh-yue/label-*
- config_name: zu
data_files:
- split: label
path: zu/label-*
task_categories:
- translation
- text2text-generation
language:
- en
- fr
- de
- ja
- zh
- hi
- ar
- bn
- ru
- es
---
# Wikidata Labels
Large parallel corpus for machine translation
- Entity label data extracted from Wikidata (2022-01-03), filtered for item entities only
- Only download the languages you need with `datasets>=2.14.0`
- Similar dataset: https://huggingface.co/datasets/wmt/wikititles (18 Wikipedia title pairs instead of all Wikidata entities)
## Dataset Details
### Dataset Sources
- Wikidata JSON dump (wikidata-20220103-all.json.gz) https://www.wikidata.org/wiki/Wikidata:Database_download
## Uses
You can generate parallel text examples from this dataset as follows:
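```python
from datasets import load_dataset
import pandas as pd


def parallel_labels(lang_codes: list, how="inner", repo_id="rayliuca/wikidata_entity_label", merge_config=None, datasets_config=None) -> pd.DataFrame:
    # Avoid mutable default arguments; None means "no extra options".
    merge_config = merge_config or {}
    datasets_config = datasets_config or {}
    out_df = None
    for lc in lang_codes:
        # Each language is its own config, so only the requested subsets are downloaded.
        dataset = load_dataset(repo_id, lc, **datasets_config)
        # Rename "label" to the language code so the columns stay distinct after merging.
        dataset_df = dataset['label'].to_pandas().rename(columns={"label": lc}).drop(columns=['lastrevid'])
        if out_df is None:
            out_df = dataset_df
        else:
            # Join on the shared entity ID.
            out_df = out_df.merge(
                dataset_df,
                on='wikidata_id',
                how=how,
                **merge_config
            )
    return out_df


# Note: the "en" subset is >4GB
parallel_labels(['en', 'fr', 'ja', 'zh']).head()
```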
### Output
| | wikidata_id | en | fr | ja | zh |
|---:|:--------------|:------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:---------------------------------------|:---------------------------------------------|
| 0 | Q109739412 | SARS-CoV-2 Omicron variant | variant Omicron du SARS-CoV-2 | SARSコロナウイルス2-オミクロン株 | 嚴重急性呼吸道症候群冠狀病毒2型Omicron變異株 |
| 1 | Q108460606 | Ulughbegsaurus | Ulughbegsaurus | ウルグベグサウルス | 兀魯伯龍屬 |
| 2 | Q108556886 | AUKUS | AUKUS | AUKUS | AUKUS |
| 3 | Q106496152 | Claude Joseph | Claude Joseph | クロード・ジョゼフ | 克洛德·约瑟夫 |
| 4 | Q105519361 | The World's Finest Assassin Gets Reincarnated in a Different World as an Aristocrat | The World's Finest Assassin Gets Reincarnated in Another World as an Aristocrat | 世界最高の暗殺者、異世界貴族に転生する | 世界頂尖的暗殺者轉生為異世界貴族 |
Note: the example table above shows a quirk of the Wikidata labels. The French Wikipedia page [The World's Finest Assassin Gets Reincarnated in Another World as an Aristocrat](https://fr.wikipedia.org/wiki/The_World%27s_Finest_Assassin_Gets_Reincarnated_in_Another_World_as_an_Aristocrat) uses an English title. While this could be a disadvantage when training for direct translation, it also shows what native speakers actually call the entity, rather than a literal translation.
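For translation training, the wide table above usually has to be reshaped into source/target pairs. The following is a minimal sketch of one way to do that, not part of the original card: the helper name `row_to_pairs` and the output keys (`src_lang`, `tgt_lang`, `src`, `tgt`) are assumptions chosen for illustration. It works on plain dictionaries, so it only needs the standard library.

```python
from itertools import permutations


def row_to_pairs(row, lang_codes):
    """Turn one wide row (one label per language) into directed translation pairs."""
    pairs = []
    for src, tgt in permutations(lang_codes, 2):
        src_text, tgt_text = row.get(src), row.get(tgt)
        # Skip pairs where either side is missing or not a string
        # (for example NaN cells produced by an outer merge).
        if not isinstance(src_text, str) or not isinstance(tgt_text, str):
            continue
        if not src_text or not tgt_text:
            continue
        pairs.append({
            "wikidata_id": row["wikidata_id"],
            "src_lang": src,
            "tgt_lang": tgt,
            "src": src_text,
            "tgt": tgt_text,
        })
    return pairs


# One row copied from the example output table above.
row = {
    "wikidata_id": "Q106496152",
    "en": "Claude Joseph",
    "fr": "Claude Joseph",
    "ja": "クロード・ジョゼフ",
    "zh": "克洛德·约瑟夫",
}
pairs = row_to_pairs(row, ["en", "ja", "zh"])
```

With a DataFrame returned by `parallel_labels`, the rows can be obtained with `df.to_dict("records")` and passed to the helper one at a time.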
## Dataset Structure
Each language has its own subset (a.k.a. config), so with `datasets>=2.14.0` you only have to download the languages you need.
Each subset has these fields:
- wikidata_id
- lastrevid
- label
## Dataset Creation
#### Data Collection and Processing
- Filtered for item entities only
- Ignored the descriptions as those texts are not very parallel
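The two processing steps above can be sketched on a single entity record. This is an illustrative reconstruction, not the author's actual pipeline: it assumes the standard Wikidata JSON entity layout (`type`, `id`, `lastrevid`, and `labels` keyed by language code), and the function name `extract_labels` and the sample values are made up for the example.

```python
def extract_labels(entity):
    """Return {lang_code: row} for an item entity, or {} for any other entity type."""
    # Keep item entities only; properties, lexemes, etc. are dropped.
    if entity.get("type") != "item":
        return {}
    rows = {}
    # Only "labels" is read; "descriptions" is deliberately ignored.
    for lang, label in entity.get("labels", {}).items():
        rows[lang] = {
            "wikidata_id": entity["id"],
            "lastrevid": entity["lastrevid"],
            "label": label["value"],
        }
    return rows


# Hand-written sample records; the lastrevid values are placeholders.
sample_item = {
    "type": "item",
    "id": "Q108556886",
    "lastrevid": 123456789,
    "labels": {
        "en": {"language": "en", "value": "AUKUS"},
        "ja": {"language": "ja", "value": "AUKUS"},
    },
    "descriptions": {"en": {"language": "en", "value": "placeholder description"}},
}
sample_property = {
    "type": "property",
    "id": "P31",
    "lastrevid": 1,
    "labels": {"en": {"language": "en", "value": "instance of"}},
}

item_rows = extract_labels(sample_item)
property_rows = extract_labels(sample_property)
```

Each returned row has the same three fields as the dataset subsets (`wikidata_id`, `lastrevid`, `label`), grouped by the language code that would become the config name.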
## Bias, Risks, and Limitations
- Might be slightly outdated (2022)
- Popular languages have more entries
- Labels are not guaranteed to be literal translations (see examples above)
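Because labels are not guaranteed to be translations, some users may want to drop rows whose labels are identical in both languages before training. The sketch below shows one simple heuristic under that assumption; the function name `drop_identical_pairs` is hypothetical, and whether such rows should be removed at all depends on the use case (names such as "AUKUS" are legitimately the same in many languages).

```python
def drop_identical_pairs(rows, src, tgt):
    """Keep only rows whose source and target labels differ (ignoring case and outer whitespace)."""
    kept = []
    for row in rows:
        if row[src].strip().casefold() == row[tgt].strip().casefold():
            continue
        kept.append(row)
    return kept


# Rows copied from the example output table above (en and fr columns only).
rows = [
    {"wikidata_id": "Q108556886", "en": "AUKUS", "fr": "AUKUS"},
    {"wikidata_id": "Q109739412", "en": "SARS-CoV-2 Omicron variant", "fr": "variant Omicron du SARS-CoV-2"},
    {"wikidata_id": "Q106496152", "en": "Claude Joseph", "fr": "Claude Joseph"},
]
filtered = drop_identical_pairs(rows, "en", "fr")
```

This exact-match check would not catch the Q105519361 row from the example table, where the French label is English text that differs slightly from the English label; detecting such cases would need language identification, which is outside the scope of this sketch.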
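Provider:
rayliuca
Summary of original information
Dataset overview
The dataset contains multiple configurations, each corresponding to a different language or dialect. Each configuration includes the following features and split information:
Features
- wikidata_id: string
- lastrevid: 64-bit integer
- label: string
Split information
- label: reports the number of bytes and the number of examples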
Configuration details
Every configuration listed here has the features wikidata_id, lastrevid, and label, and a single split named label. Sizes are in bytes.

| Configuration | Bytes | Examples | Download size | Dataset size |
|:--------------|------:|---------:|--------------:|-------------:|
| aa | 13986211 | 436895 | 9821312 | 13986211 |
| ab | 5012532 | 159908 | 3013706 | 5012532 |
| abs | 4252728 | 143986 | 2567450 | 4252728 |
| ace | 19105673 | 574712 | 13573374 | 19105673 |
| ady | 4444259 | 148627 | 2705754 | 4444259 |
| ady-cyrl | 4412556 | 147884 | 2682170 | 4412556 |
| aeb | 4305734 | 145198 | 2606368 | 4305734 |
| aeb-arab | 4467930 | 148796 | 2722169 | 4467930 |
| aeb-latn | 12770359 | 404946 | 8886489 | 12770359 |
| af | 58561042 | 1643153 | 42539052 | 58561042 |
| agq | 1317 | 33 | 2906 | 1317 |
| ak | 14198715 | 443037 | 9991525 | 14198715 |
| aln | 13811116 | 432089 | 9673418 | 13811116 |
| als | 20691 | 543 | 17540 | 20691 |
| alt | 108390 | 1814 | 59046 | 108390 |
| am | 5231176 | 163038 | 3187164 | 5231176 |
| ami | 21519 | 686 | 16640 | 21519 |
| an | 240345072 | 5921087 | 164895205 | 240345072 |
| ang | 14275715 | 443461 | 10063758 | 14275715 |
| anp | 8558258 | 241612 | 4381360 | 8558258 |
| ar | 291173732 | 5724064 | 159369497 | 291173732 |
| arc | 4473283 | 150006 | 2722619 | 4473283 |
| arn | 13879729 | 433912 | 9715431 | 13879729 |
| arq | 4346991 | 146004 | 2636972 | 4346991 |
| ary | 5358568 | 171568 | 3313402 | 5358568 |
| arz | 81806333 | 1669699 | 49423508 | 81806333 |
| as | 21658610 | 450074 | 9641626 | 21658610 |
| ase | 4252943 | 143986 | 2568106 | 4252943 |
| ast | 1385628786 | 20696237 | 955908362 | 1385628786 |
| atj | 12996229 | 411639 | 9057557 | 12996229 |
| av | 4722934 | 153781 | 2880103 | 4722934 |
| avk | 13194485 | 414598 | 9200917 | 13194485 |
| awa | 8599312 | 242320 | 4411751 | 8599312 |
| ay | 14269432 | 443521 | 10029939 | 14269432 |
| az | 21049248 | 516732 | 14117527 | 21049248 |
| azb | 30781587 | 607562 | 16028687 | 30781587 |
| ba | 11525351 | 261509 | 6733777 | 11525351 |
| ban | 13674052 | 426706 | 9513747 | 13674052 |
| ban-bali | 50961 | 748 | 25817 | 50961 |
| bar | 54783034 | 1566120 | 40389830 | 54783034 |
| bbc | 12820895 | 406960 | 8917054 | 12820895 |



