
The CLEF-IP 2009 Test Collection

Source: DataCite Commons. Updated 2024-08-27; indexed 2024-07-13.
Download link:
https://researchdata.tuwien.ac.at/records/a2svx-p1y38
Description:
CLEF-IP: Cross-Language Evaluation Forum - Intellectual Property

The CLEF-IP track was launched in 2009 to investigate IR techniques for patent retrieval, as part of the CLEF 2009 evaluation campaign. The track uses a collection of more than 1 million patent documents derived from EPO (European Patent Office) sources. The collection contains documents in English, French and German, with at least 100,000 documents in each language.

The task is to find patent documents that constitute prior art. The topics are complete patent documents, which participants can process to extract queries. In addition to the Main task, CLEF-IP 2009 provided three language tasks (English, German, French), in which the topics were in one of these three languages. Relevance judgements were produced by two methods: automatically, using patent citations from seed patents; and manually, for a small number of queries for which search results were to be reviewed by intellectual property experts.

Files

Document Collection: The CLEF-IP 2009 collection of documents consists of XML files. There are 1.9 million XML files, corresponding to approximately 1 million individual patents filed between 1985 and 2000. A DTD file for the XML format is provided as well.

Topics and Answers (Qrels): Both the training and the test topic sets also contain the relevance assessments for the topics. For each task of the CLEF-IP 09 track, four topic test sets of different sizes are provided: XLarge, Large, Medium, Small.

Guidelines: Contains a detailed explanation of how to work with the four tasks from the corpus.
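As a quick sanity check on the document collection described above, the following minimal Python sketch walks a directory of XML files and tallies a language attribute on each root element. It is an illustration under assumptions, not part of the dataset's tooling: the function name `count_languages` is hypothetical, and the attribute name `lang` is an assumption that should be checked against the DTD shipped with the collection.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_languages(root_dir, lang_attr="lang"):
    """Tally the `lang_attr` value found on the root element of every
    XML file under `root_dir` (searched recursively).

    `lang_attr` is an assumed attribute name; verify it against the DTD.
    """
    counts = Counter()
    for path in sorted(Path(root_dir).rglob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            # Keep going on malformed files, but record them.
            counts["<unparseable>"] += 1
            continue
        lang = root.get(lang_attr)
        # Normalize case so "en" and "EN" land in the same bucket.
        counts[lang.upper() if lang else "<missing>"] += 1
    return counts
```

Calling `count_languages` on the directory where the collection was unpacked returns a `Counter` keyed by language code. Parsing 1.9 million files this way is slow, so trying it on a single subdirectory first is probably sensible.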
Provider:
TU Wien
Created:
2021-11-30