Labeled Dataset of Wikidata Edit History Changes

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19764414
Description:
This dataset contains manually labeled changes extracted from Wikidata's edit history, used to train and evaluate a classifier for change type classification (see ML-based Change Type Classification in Wikidata).

Note that we provide 2 files:

- wikidata_edit_history_labeled_changes.csv: contains labeled changes only for the datatypes quantity, time, entity, string.
- wikidata_edit_history_labeled_changes_globecoordinate.csv: contains labeled changes only for the datatype globecoordinate. In this case, we labeled latitude and longitude changes separately; therefore, there are 2 label columns, one for latitude and one for longitude (label_latitude, label_longitude).

Each row corresponds to a single change and includes the following columns:

| Column | Datatype | Description |
| --- | --- | --- |
| revision_id | bigint | Wikidata's revision id of the change |
| entity_id | int | Numeric part of the Q-id of the entity |
| entity_label | string | Label of the entity |
| value_id | string | Identifier of the statement value. A property can have multiple values. |
| change_target | string | Can be "rank" (change in the rank), "" (change in a value) or a language code (e.g., "en"). The latter corresponds to the language for a multilingual text. |
| property_id | int | Numeric part of the P-id of the property |
| property_label | string | Label of the property |
| old_value | string (json) | Old value for the statement/rank |
| old_value_label | string | Entity label of the old value. Only applicable for changes where datatype in (wikibase-item, wikibase-entityid, wikibase-property, wikibase-lexeme, wikibase-sense, wikibase-form) |
| new_value | string (json) | New value for the statement/rank |
| new_value_label | string | Entity label of the new value. Only applicable for changes where datatype in (wikibase-item, wikibase-entityid, wikibase-property, wikibase-lexeme, wikibase-sense, wikibase-form) |
| datatype | string | Datatype of the values changing. Can be one of: quantity, time, globecoordinate, monolingualtext, string, external-id, url, commonsMedia, geo-shape, tabular-data, math, musical-notation, wikibase-item, wikibase-entityid, wikibase-property, wikibase-lexeme, wikibase-sense, wikibase-form. |
| action | string | Edit type. Can be "UPDATE", "DELETE" or "CREATE" |
| target | string | Target of the change. Can be "PROPERTY_VALUE" |
| entity_types_31 | string | List of entity labels of the values for the property instance of (P31) for the entity undergoing the change. |
| entity_description | string | Description of the entity undergoing the change |
| new_value_description | string | Description of the new value. Only applicable for changes where datatype in (wikibase-item, wikibase-entityid, wikibase-property, wikibase-lexeme, wikibase-sense, wikibase-form) |
| old_value_description | string | Description of the old value. Only applicable for changes where datatype in (wikibase-item, wikibase-entityid, wikibase-property, wikibase-lexeme, wikibase-sense, wikibase-form) |
| entity_types_279 | string | List of entity labels of the values for the property subclass of (P279) for the entity undergoing the change. |
| label | string | Label of the change type |

Labels correspond to the change types defined below:

| Label | Change Type | Description |
| --- | --- | --- |
| refinement | Refinement | A property value is replaced by a more specific or precise value, without changing the statement's meaning. The refinement may add more contextual information or rephrase a text to convey the same meaning more clearly, increase numerical precision, or provide a more specific classification while remaining semantically compatible with the original value. |
| unrefinement | Unrefinement | A property value is replaced by a less specific or precise value, without changing the statement's meaning. The unrefinement may remove contextual information, decrease numerical precision, or generalize to a broader classification while remaining semantically compatible with the original value. |
| textual change | Textual change | A property value of type text is modified to correct or introduce language errors, such as spelling, typos, or grammar, without altering sentence structure or the statement's meaning. |
| link_change | Link change | An entity reference is replaced by another one with a similar or identical label but representing a different concept. |
| re_formatting | Re-formatting | A property value's representation is modified on a surface level, without altering its underlying meaning. This change type can vary depending on the datatype. For text values, re-formatting covers changes to visual presentation, such as spacing, capitalization, hyphenation, and other typographical elements. For quantity, re-formatting covers changes in numerical precision that do not alter the value (e.g., adding or removing trailing zeros). |
| property_value_update | Value update | A property value is replaced with a semantically different value, altering the statement's meaning. This includes corrections of incorrect values and updates reflecting real-world changes. Additionally, for time, quantity and globecoordinate changes, this also includes sign changes (e.g., -1 -> +1). |
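As a rough illustration of how the columns described above might be read, the sketch below parses a CSV with Python's standard library, decodes the JSON-encoded old_value and new_value cells, and counts rows per label. It assumes the files have a header row with the listed column names. The helper names (parse_json_or_raw, read_labeled_changes, label_counts) are hypothetical, and the two sample rows are made up for the example; they are not taken from the dataset, and the real JSON shape of the value columns may differ.

```python
import csv
import io
import json
from collections import Counter

# Columns described as "string (json)" in the column list.
JSON_COLUMNS = ("old_value", "new_value")


def parse_json_or_raw(text):
    """Decode a JSON cell; return None if empty, or the raw text if it is not valid JSON."""
    if text is None or text == "":
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def read_labeled_changes(handle):
    """Read rows from an open CSV file object, decoding the JSON columns."""
    rows = []
    for row in csv.DictReader(handle):
        for column in JSON_COLUMNS:
            if column in row:
                row[column] = parse_json_or_raw(row[column])
        rows.append(row)
    return rows


def label_counts(rows, label_column="label"):
    """Count rows per label (label_latitude / label_longitude for the globecoordinate file)."""
    return Counter(row[label_column] for row in rows)


# Made-up two-row sample standing in for the real file (illustrative values only).
SAMPLE = io.StringIO(
    'revision_id,entity_id,property_id,datatype,action,old_value,new_value,label\n'
    '1,42,1082,quantity,UPDATE,"""+10""","""+10.0""",re_formatting\n'
    '2,42,1082,quantity,UPDATE,"""+10""","""+12""",property_value_update\n'
)

rows = read_labeled_changes(SAMPLE)
counts = label_counts(rows)
```

For the downloaded files, the same function could be called on a handle opened with open("wikidata_edit_history_labeled_changes.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption.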
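The two numeric rules stated in the label definitions (trailing zeros count as re-formatting, sign changes count as a value update) can be sketched as a small check. This is only a rule-of-thumb illustration, not the classifier trained on this dataset. The function name quantity_change_hint is hypothetical, and it assumes the quantity amounts are available as plain decimal strings such as "+10.0".

```python
from decimal import Decimal, InvalidOperation


def quantity_change_hint(old_amount, new_amount):
    """Return a coarse hint for a quantity change given two amount strings.

    "re_formatting" when the numeric value is unchanged but the text differs,
    "property_value_update" when the sign flips, and None otherwise
    (refinement, unrefinement and value update cannot be told apart
    from the numbers alone).
    """
    try:
        old, new = Decimal(old_amount), Decimal(new_amount)
    except (InvalidOperation, TypeError):
        return None  # not a plain decimal string
    if old == new:
        # Same value: only the representation changed (e.g. trailing zeros).
        return "re_formatting" if old_amount != new_amount else None
    if old != 0 and new != 0 and old.is_signed() != new.is_signed():
        # Sign change, e.g. -1 -> +1.
        return "property_value_update"
    return None
```

A call such as quantity_change_hint("+10", "+10.0") falls in the first branch, while quantity_change_hint("-1", "+1") falls in the second. Changes that alter the magnitude return None because deciding between the remaining labels needs semantic context.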
Provider:
Zenodo
Created:
2026-04-25