wjbmattingly/gliner-bird-diet-synthetic
GLiNER Bird Diet Synthetic Dataset
Overview
- Task category: Named Entity Recognition (NER)
- Language: English
- Tags: science, birds, ornithology, NER
- Dataset size: 1K<n<10K
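Dataset Description
- Files: train.jsonl, eval.jsonl, test.jsonl
- Fields:
  - ner: Entities extracted from the bird diet descriptions using a GLiNER model. In the example annotation, each entry has the form [start token index, end token index, label].
  - tokenized_text: Text describing the dietary habits and feeding patterns of birds in detail, synthesized with the qwen/qwen2-7b-instruct model and stored as a list of tokens.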
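The three files are JSON Lines, so each line should parse as one record carrying the two fields above. A minimal sketch of reading one split with the Python standard library follows; the helper name `read_jsonl` is hypothetical, and the local path assumes the file has already been downloaded into the working directory.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Assumption: train.jsonl sits in the current working directory.
    train_path = Path("train.jsonl")
    if train_path.exists():
        train = read_jsonl(train_path)
        if train:
            print(len(train), "records; fields:", sorted(train[0]))
```

Reading line by line keeps the sketch free of extra dependencies; a dataset-loading library would work equally well if one is already in use.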
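NER Label Definitions
- plant food: Identifies specific plant-based foods mentioned in the text.
- animal food: Classifies mentions of animal-based foods.
- group behavior: Describes any social or group behavior associated with feeding.
- group species: Records other species that take part in feeding alongside the bird.
- eating time: Specifies when feeding typically occurs.
- eating location: Identifies the geographic or environmental location of feeding activity.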
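The label names are written in lowercase in the definitions, while the example annotation in this card stores them in uppercase (for instance "EATING TIME"). The sketch below shows one way to map span labels onto the six defined names and count them per record. The names `LABELS`, `canonical_label`, and `label_counts` are hypothetical helpers, and case-insensitive matching is an assumption, not something the dataset card specifies.

```python
# Label names as listed in the definitions.
LABELS = (
    "plant food",
    "animal food",
    "group behavior",
    "group species",
    "eating time",
    "eating location",
)


def canonical_label(label):
    """Return the lowercase form of a label, or raise if it is not in LABELS."""
    # Assumption: labels differ from the definitions only in case and spacing.
    normalized = " ".join(label.lower().split())
    if normalized not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return normalized


def label_counts(record):
    """Count spans per canonical label in one record's `ner` list."""
    counts = {}
    for _start, _end, label in record["ner"]:
        key = canonical_label(label)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Raising on an unknown label, instead of silently passing it through, makes stray or misspelled labels visible early when scanning a split.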
Example Annotation
```json
{
  "tokenized_text": ["Surviving", "in", "the", "vast", "grasslands", "of", "the", "Serengeti", ",", "the", "Lappet", "-", "faced", "Vulture", "primarily", "consumes", "carrion", ",", "focusing", "on", "remains", "of", "wildebeest", "and", "zebra", ".", "It", "occasionally", "forms", "a", "group", "with", "vultures", "of", "various", "species", "to", "efficiently", "locate", "and", "guard", "carcasses", ".", "The", "vulture", "is", "an", "active", "hunter", "during", "the", "daytime", ",", "specifically", "targeting", "amphibians", "and", "small", "rodents", ".", "Its", "feeding", "habits", "are", "predominantly", "observed", "in", "the", "open", "plains", "of", "Africa", "."],
  "ner": [[51, 51, "EATING TIME"], [55, 55, "ANIMAL FOOD"], [57, 58, "ANIMAL FOOD"], [68, 71, "EATING LOCATION"]]
}
```
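In the example, each `ner` entry appears to be a zero-based, inclusive token span: [51, 51] covers the single token "daytime" and [57, 58] covers "small rodents". The sketch below turns such spans back into text. The helper `decode_entities` is hypothetical, and the record is an excerpt of the example's third sentence with the indices re-based to start at 0, used here only for illustration.

```python
def decode_entities(record):
    """Return (text, label) pairs for each span in a record.

    Assumes spans are [start, end, label] with zero-based, inclusive
    token indices, as the example annotation suggests.
    """
    tokens = record["tokenized_text"]
    entities = []
    for start, end, label in record["ner"]:
        # end is inclusive, so slice up to end + 1
        entities.append((" ".join(tokens[start:end + 1]), label))
    return entities


# Excerpt of the example record (original tokens 43 to 59), indices re-based.
excerpt = {
    "tokenized_text": [
        "The", "vulture", "is", "an", "active", "hunter", "during", "the",
        "daytime", ",", "specifically", "targeting", "amphibians", "and",
        "small", "rodents", ".",
    ],
    "ner": [[8, 8, "EATING TIME"], [12, 12, "ANIMAL FOOD"], [14, 15, "ANIMAL FOOD"]],
}

for text, label in decode_entities(excerpt):
    print(f"{label}: {text}")
# EATING TIME: daytime
# ANIMAL FOOD: amphibians
# ANIMAL FOOD: small rodents
```

Joining tokens with a single space is a simplification; it does not restore the original spacing around punctuation or hyphens, as in "Lappet - faced".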



