MihaiPopa-1/OmniSurgical-1.0
Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/MihaiPopa-1/OmniSurgical-1.0
Description:
---
language:
- abk
- abq
- abs
- acm
- adh
- adi
- ady
- aeb
- afr
- agx
- aii
- aim
- ain
- ajz
- akb
- aln
- als
- alt
- amh
- anp
- aoz
- apc
- apt
- arb
- arg
- arq
- ars
- ary
- arz
- asm
- ast
- atb
- ava
- awa
- ayp
- ayr
- azb
- azj
- bak
- bam
- ban
- bar
- bas
- bbc
- bbk
- bcl
- bdq
- bel
- ben
- bew
- bho
- bhp
- bis
- biu
- bjn
- bod
- bos
- brh
- brx
- bts
- btx
- bug
- bul
- bwi
- bxr
- cat
- cbk
- ccp
- ceb
- ces
- cfm
- cha
- che
- chr
- chu
- chv
- cjs
- ckb
- ckt
- cmn
- cnh
- cnw
- cos
- crh
- crj
- crk
- crl
- crs
- csb
- csw
- csy
- ctd
- cym
- czt
- dak
- dan
- dar
- deu
- dik
- diu
- div
- dje
- dks
- dln
- dng
- dnw
- doi
- dru
- dsb
- dtp
- dty
- dzo
- ekk
- ell
- enl
- enm
- epo
- ess
- eus
- eve
- ewo
- ext
- fao
- fas
- ffm
- fij
- fil
- fin
- fit
- fkv
- fmu
- fra
- fro
- frp
- fry
- fuf
- fur
- fuv
- gag
- gaz
- gcf
- gla
- gle
- glg
- glk
- glv
- gmh
- gnb
- goh
- gom
- gos
- grc
- gsw
- gug
- guj
- guz
- hac
- hae
- hak
- hat
- hau
- haw
- hbo
- heb
- her
- hif
- hil
- hin
- hmr
- hne
- hns
- hrv
- hrx
- hsb
- hun
- hwc
- hye
- hyw
- iba
- ibg
- ibo
- ife
- ike
- ikt
- ilo
- ina
- ind
- inh
- isl
- ita
- ivv
- jav
- jpn
- jun
- kaa
- kab
- kac
- kak
- kal
- kam
- kan
- kas
- kat
- kaz
- kbd
- kca
- kdh
- kdr
- kea
- kei
- kgp
- kha
- khk
- khm
- kik
- kin
- kir
- kiu
- kjb
- kjh
- kmr
- knc
- koi
- kor
- kos
- kpv
- krj
- krl
- kru
- ksh
- ksw
- ktj
- ktz
- kua
- kum
- kwn
- kyu
- kzj
- lad
- lao
- lat
- lbe
- ldn
- lew
- lez
- lfn
- lim
- lin
- lis
- lit
- lki
- lld
- lmk
- lnd
- lrc
- ltg
- ltz
- lud
- lug
- luo
- lus
- lvs
- lwg
- lzh
- mag
- mah
- mai
- mak
- mal
- mar
- mas
- mbf
- mdf
- mer
- mfe
- mfg
- mfy
- mhi
- mhr
- mhy
- min
- mip
- mjw
- mkd
- mlt
- mni
- mnk
- mns
- mnw
- moh
- mph
- mqy
- mri
- mrj
- mrw
- mtg
- mui
- mup
- mus
- mvp
- mwf
- mwl
- mww
- mya
- myv
- myx
- mzh
- nah
- nan
- nap
- naq
- nbu
- nde
- ndo
- nds
- new
- nio
- njn
- njo
- nld
- nmf
- nmz
- nno
- nob
- nog
- non
- npi
- npo
- nrf
- nri
- nrm
- nse
- nus
- nya
- nyn
- nzm
- obo
- oci
- ojb
- olo
- orv
- ory
- oss
- ota
- oto
- otw
- pam
- pan
- pap
- pbt
- pcd
- pck
- pcm
- pfl
- plt
- pmq
- pmx
- pnb
- pnt
- pol
- por
- pov
- ppk
- pps
- prg
- pui
- pxm
- quc
- qul
- qup
- qus
- quz
- raw
- rcf
- rel
- rhg
- ria
- rjs
- rmc
- rml
- rmn
- rmy
- rnl
- roh
- ron
- rtm
- rue
- run
- rus
- sah
- san
- sat
- sck
- scn
- sda
- sdc
- sdh
- ses
- sgc
- sgh
- sid
- sin
- sju
- skr
- slk
- slv
- sma
- sme
- smj
- smn
- smo
- sms
- smt
- sna
- snd
- som
- sot
- spa
- srd
- srp
- ssw
- sun
- swe
- swg
- swh
- syc
- syl
- szl
- tab
- tam
- taq
- tat
- tcy
- tcz
- tel
- tet
- tgk
- tha
- thl
- tig
- tir
- tkl
- tkr
- tlh
- tly
- tok
- ton
- tpi
- tpw
- trc
- trp
- trs
- ttj
- tuk
- tur
- tuv
- twx
- tyv
- tzl
- tzm
- udm
- uig
- ukr
- urd
- uzn
- uzs
- vap
- vie
- vot
- vro
- war
- way
- wba
- wbm
- wes
- whk
- wlx
- wol
- wsg
- wwa
- xal
- xho
- xmm
- xmv
- xog
- yaz
- ydd
- yor
- yrk
- yrl
- yua
- yue
- zea
- zgh
- zom
- zsm
- zul
task_categories:
- text-generation
- translation
datasets:
- HuggingFaceFW/finetranslations
license: apache-2.0
size_categories:
- 10K<n<100K
---
# OmniSurgical 1.0
OmniSurgical is a dataset for training your very own massively multilingual machine translation models by fine-tuning existing LLMs!
# Formats
The dataset comes in 2 formats: JSONL and JSONZ (zipped JSONL).
The names speak for themselves: `OmniSurgical_120_Clean.jsonz` is the processed file, and `train.jsonz` is a shuffled version of the same file, used to fine-tune existing LLMs (I fine-tuned Qwen 3 0.6B for this!).
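The card does not say which compression "zipped" refers to, so here is a minimal sketch of a reader that handles plain JSONL, gzip, or a zip archive by checking the file's leading bytes. The function names `read_jsonl_records` and `_parse_lines` are made up for this example, and the record fields are left untouched because the card does not document them.

```python
import gzip
import io
import json
import zipfile


def _parse_lines(lines):
    """Parse an iterable of text lines, skipping blank ones."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_jsonl_records(path):
    """Yield one parsed JSON object per line from a plain, gzip, or zip file.

    Assumption: a ".jsonz" file is JSONL wrapped in gzip or zip. The format
    is detected from the magic bytes, not from the file extension.
    """
    with open(path, "rb") as fh:
        magic = fh.read(4)

    if magic[:2] == b"\x1f\x8b":  # gzip magic bytes
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            yield from _parse_lines(fh)
    elif magic == b"PK\x03\x04":  # zip local file header
        with zipfile.ZipFile(path) as archive:
            for name in archive.namelist():
                with archive.open(name) as member:
                    text = io.TextIOWrapper(member, encoding="utf-8")
                    yield from _parse_lines(text)
    else:  # fall back to plain UTF-8 JSONL
        with open(path, "r", encoding="utf-8") as fh:
            yield from _parse_lines(fh)
```

If the guess about the container format holds, `list(read_jsonl_records("train.jsonz"))` should return the training rows as dictionaries.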
# Data Used
I used only 120 sentences per language from [HF's FineTranslations](https://huggingface.co/datasets/HuggingFaceFW/finetranslations) dataset, which means 60 sentences per language pair!
The original dataset was translated back into English using [Gemma 3 27B](https://huggingface.co/google/gemma-3-27b-it).
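The per-language cap and the shuffled training file described above could be reproduced roughly as follows. This is a sketch and not the script used to build OmniSurgical: the `lang` field name, the seed, and both function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_per_language(rows, per_language=120, seed=0, lang_key="lang"):
    """Keep at most `per_language` randomly chosen rows for each language.

    Assumption: each row is a dict carrying its language code under
    `lang_key` (a hypothetical field name).
    """
    rng = random.Random(seed)
    by_lang = defaultdict(list)
    for row in rows:
        by_lang[row[lang_key]].append(row)

    sampled = []
    for lang in sorted(by_lang):  # sorted so the result is reproducible
        group = by_lang[lang]
        sampled.extend(rng.sample(group, min(per_language, len(group))))
    return sampled


def shuffled_copy(rows, seed=0):
    """Return a shuffled copy of `rows`, leaving the input list unchanged."""
    out = list(rows)
    random.Random(seed).shuffle(out)
    return out
```

The card metadata lists 497 language codes, so a cap of 120 rows per language gives at most 497 × 120 = 59,640 rows, which fits the declared `10K<n<100K` size category.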
Provider:
MihaiPopa-1



