PrevDistro - Preverb Distributions in Hungarian

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6349409
Description:
PrevDistro (Preverb Distributions) is an open-source dataset containing 41.5 million corpus occurrences of 49 preverb-verb construction types. It consists of the following columns:

1 sid: ID
2 constype: construction type
3 subtype: construction subtype
4 prevpos: preverb position
5 prev: preverb
6 verb: verb lemma
7 intervening: intervening words (as lemmas)
8 actform: actual form (the same content as in column 10, but in lowercase)
9 left: left context
10 kwic: keyword in context
11 right: right context
12 docid: document ID from the Hungarian Gigaword Corpus
13 title: document title
14 style: document style (e.g. official, press, ...)
15 region: document region (e.g. Transylvania, Subcarpathia, ...)
16 year: year of publication (a single document may list several years)

The first row is the header. If a cell's value is unspecified, it is marked with an underscore (_).
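The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not the dataset authors' tooling: the description does not state the field delimiter or the file name, so a tab-separated layout is assumed here, and the helper name `read_prevdistro` as well as the three sample rows are made up purely for illustration. It reads the header row, turns underscore cells into `None`, and counts preverb-verb pairs.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
COLUMNS = [
    "sid", "constype", "subtype", "prevpos", "prev", "verb",
    "intervening", "actform", "left", "kwic", "right",
    "docid", "title", "style", "region", "year",
]


def read_prevdistro(stream, delimiter="\t"):
    """Yield one dict per data row, mapping unspecified cells ("_") to None.

    The tab delimiter is an assumption; pass another one if the real
    files use a different separator.
    """
    reader = csv.DictReader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {key: (None if value == "_" else value) for key, value in row.items()}


# Made-up sample rows (NOT taken from the dataset), header row first.
SAMPLE_ROWS = [
    COLUMNS,
    ["1", "A", "_", "pre", "meg", "csinál", "_", "megcsinálta",
     "ő", "Megcsinálta", "a házit", "doc1", "_", "press", "_", "2010"],
    ["2", "A", "_", "pre", "meg", "csinál", "_", "megcsinálja",
     "majd", "megcsinálja", "holnap", "doc2", "_", "official", "_", "2012"],
    ["3", "A", "_", "pre", "el", "megy", "_", "elment",
     "tegnap", "elment", "haza", "doc3", "_", "press", "_", "2011"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in SAMPLE_ROWS) + "\n"

rows = list(read_prevdistro(io.StringIO(SAMPLE)))

# Frequency of each (preverb, verb lemma) pair in the sample.
pair_counts = Counter((row["prev"], row["verb"]) for row in rows)

for (prev, verb), count in pair_counts.most_common():
    print(prev, verb, count)
```

For the real data, the same function can be applied to an open file handle instead of the in-memory sample. Because the dataset holds 41.5 million occurrences, iterating row by row as the generator does avoids loading everything into memory at once.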
Created:
2022-03-13