Sequence Tagging of FA-KES Dataset

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/3534273
Description:
We used the BIOE sequence tagging strategy, as used in OpenTag [1], to associate every word in the dataset with a label called a ‘tag’. A tag consists of one of the letters B, I, O, or E (standing for the beginning, inside, outside, or end of an attribute, respectively), followed by a ‘-’ sign and three letters that represent the type of information that was initially extracted. The ‘O’ tag stands alone and marks words outside the scope of any attribute.

Each token was labeled with one of the following tags: ‘B-LOC’, ‘I-LOC’, ‘E-LOC’, ‘B-CIV’, ‘I-CIV’, ‘E-CIV’, ‘B-NCV’, ‘I-NCV’, ‘E-NCV’, ‘B-WMN’, ‘I-WMN’, ‘E-WMN’, ‘B-CHD’, ‘I-CHD’, ‘E-CHD’, ‘B-ACT’, ‘I-ACT’, ‘E-ACT’, ‘B-COD’, ‘I-COD’, ‘E-COD’, ‘B-DAT’, ‘I-DAT’, ‘E-DAT’, or ‘O’. The three-letter codes stand for: LOC, the incident location; CIV, the number of civilians dead; NCV, the number of non-civilians dead; WMN, the number of women targeted; CHD, the number of children killed; ACT, the actor/authority responsible for the incident; COD, the cause of death; and DAT, the date of the incident. The labeling was done by a parser that we created to automatically assign each word the appropriate tag.

We created three subsets of the FA-KES dataset. The first consists of the articles' titles, the second of the articles' titles concatenated with the articles' first paragraphs, and the third of the articles' titles along with their full contents. In each of the three CSV files, the first column holds the article number in the dataset, the second column holds the sequence of words for the article, and the third column holds the tags linked to the tokens of the second column.

[1] G. Zheng, S. Mukherjee, X. L. Dong, and F. Li, “OpenTag: Open attribute value extraction from product profiles,” CoRR, vol. abs/1806.01264, 2018.
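
The tagging scheme described above can be illustrated with a minimal Python sketch. This is not the authors' parser, whose implementation is not given here. The function name `bioe_tags`, the input format (a token list plus a mapping from attribute code to the tokens of its extracted value), and the example sentence are all hypothetical. The sketch also assumes that only the first exact match of each value is tagged and that a single-token value receives a B- tag; the description does not say how such cases were handled.

```python
from typing import Dict, List

# Attribute codes listed in the dataset description.
ATTRIBUTE_CODES = {"LOC", "CIV", "NCV", "WMN", "CHD", "ACT", "COD", "DAT"}


def bioe_tags(tokens: List[str], attributes: Dict[str, List[str]]) -> List[str]:
    """Return one BIOE tag per token (hypothetical helper, not the authors' parser).

    `attributes` maps an attribute code (e.g. "LOC") to the token sequence
    of its extracted value. Tokens not covered by any value are tagged "O".
    """
    tags = ["O"] * len(tokens)
    for code, value in attributes.items():
        if code not in ATTRIBUTE_CODES:
            raise ValueError(f"unknown attribute code: {code}")
        n = len(value)
        if n == 0:
            continue
        # Assumption: tag only the first exact, not-yet-tagged occurrence.
        for start in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[start:start + n])
            if tokens[start:start + n] == value and window_free:
                tags[start] = f"B-{code}"
                for i in range(start + 1, start + n - 1):
                    tags[i] = f"I-{code}"
                # Assumption: a single-token value keeps its B- tag.
                if n > 1:
                    tags[start + n - 1] = f"E-{code}"
                break
    return tags


# Made-up example sentence, not taken from the FA-KES dataset.
tokens = "Airstrike kills 12 civilians in eastern Aleppo".split()
attributes = {"COD": ["Airstrike"], "CIV": ["12"], "LOC": ["eastern", "Aleppo"]}
print(list(zip(tokens, bioe_tags(tokens, attributes))))
```

The function returns exactly one tag per token, which matches the layout of the CSV files, where the third column holds the tags for the words in the second column.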
Created:
2020-10-18