Arabic and English Spatial Entity Dataset for Animal Disease Surveillance Extracted with PADI-web
Source: DataCite Commons | Updated: 2026-02-12 | Indexed: 2025-05-10
Download link:
https://dataverse.cirad.fr/citation?persistentId=doi:10.18167/DVN1/ZA1YUO
Description:
As part of the “Arabic Corpus and Entities Dealing with Animal Disease Surveillance Extracted with PADI-web” dataset (<a href="https://doi.org/10.18167/DVN1/2B4WLR">https://doi.org/10.18167/DVN1/2B4WLR</a>), we built a new dataset containing 328 spatial entities in Arabic (275 absolute spatial entities and 53 relative spatial entities), their manually validated English translations, and their automatic translations produced by three tools (DeepL, Microsoft Azure, and Reverso). The dataset has been updated twice: in September 2025, two columns were added (GeoNames ID and GeoNames Feature Class, which allow spatial entities to be matched to the GeoNames gazetteer), and in February 2026, the number of relative spatial entities was increased.
<br><br>
The dataset is organised as a table with fourteen columns:
<br><br>
<ul>
<li><strong>ID</strong>: The unique identifier of each article (from the PADI-web database)</li>
<li><strong>Arabic Location</strong>: The spatial entities in Arabic, manually extracted from 53 articles collected via PADI-web</li>
<li><strong>English Location</strong>: The manual translation of spatial entities into English, based on existing field sources such as Google Maps and the GeoNames database</li>
<li><strong>GeoNames ID</strong>: The unique ID from the GeoNames database (2022 version of GeoNames: https://www.geonames.org/) corresponding to each spatial entity (empty if no match in GeoNames)</li>
<li><strong>GeoNames Feature Class</strong>: The feature class corresponding to the GeoNames ID (empty if no match in GeoNames)</li>
<li><strong>Type</strong>: A manually assigned type of spatial entity (country, city, region, village, etc.).</li>
<li><strong>Category</strong>: The classification of spatial entities into two categories: absolute spatial entities (ASE) and relative spatial entities (RSE).</li>
<li><strong>Arabic Phrases</strong>: The sentence, in Arabic, from which the spatial entity was extracted.</li>
<li><strong>Translation DeepL</strong>: The translation of the location by DeepL.</li>
<li><strong>Translation Microsoft Azure</strong>: The translation of the location by Microsoft Azure.</li>
<li><strong>Translation Reverso</strong>: The translation of the location by Reverso.</li>
<li><strong>English Sentences Translated by DeepL</strong>: The translation of the sentence by DeepL.</li>
<li><strong>English Sentences Translated by Microsoft Azure</strong>: The translation of the sentence by Microsoft Azure.</li>
<li><strong>English Sentences Translated by Reverso</strong>: The translation of the sentence by Reverso.</li>
</ul>
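The columns listed above can be explored with a few lines of Python. The following is a minimal sketch, not the authors' tooling: it assumes the table has been exported as a UTF-8 CSV file whose header row uses the column names exactly as given above, and that the Category column holds the labels ASE and RSE. The case-insensitive exact-match rate is only an illustrative way of comparing each tool's output with the manual translation; it is not an evaluation method described in the dataset.

```python
import csv
from collections import Counter

# Column names as listed in the dataset description (assumed to be the CSV header).
COLUMNS = [
    "ID", "Arabic Location", "English Location", "GeoNames ID",
    "GeoNames Feature Class", "Type", "Category", "Arabic Phrases",
    "Translation DeepL", "Translation Microsoft Azure", "Translation Reverso",
    "English Sentences Translated by DeepL",
    "English Sentences Translated by Microsoft Azure",
    "English Sentences Translated by Reverso",
]

# Columns holding the automatic translations of each spatial entity.
TOOL_COLUMNS = [
    "Translation DeepL",
    "Translation Microsoft Azure",
    "Translation Reverso",
]


def read_rows(handle):
    """Read all rows from an open CSV handle, checking the expected columns exist."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def count_categories(rows):
    """Count entities per Category value (assumed labels: ASE, RSE)."""
    return Counter(row["Category"].strip() for row in rows)


def exact_match_rate(rows, tool_column):
    """Share of rows where a tool's translation equals the manual one, ignoring case."""
    if not rows:
        return 0.0
    hits = sum(
        row[tool_column].strip().casefold()
        == row["English Location"].strip().casefold()
        for row in rows
    )
    return hits / len(rows)
```

With a local export (the file name here is hypothetical), the table could be loaded with `open("spatial_entities.csv", encoding="utf-8", newline="")` and passed to `read_rows`; `count_categories` then summarises the ASE/RSE split, and `exact_match_rate` can be called once per entry in `TOOL_COLUMNS`.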
<br><br>
<strong>Absolute spatial entities</strong> are direct references to precise, locatable geographic spaces, i.e. entities that can be located on a map or in a geographic database (e.g. cities such as Safi, countries such as Morocco, Egypt, etc.).
<br>
<strong>Relative spatial entities</strong> are entities defined in relation to at least one other spatial entity, using spatial indicators of a topological nature (for example, “الطود شرق” (El-Tod East), “ناحية تلات” (Talat district), etc.).
Provider:
CIRAD Dataverse
Created:
2025-04-08



