Brazilian Transparency Portal Contracts and Procurements Crawled from the public API
Source: DataCite Commons. Updated: 2020-08-25. Indexed: 2024-08-17.
Download link:
https://figshare.com/articles/Brazilian_Transparency_Portal_Contracts_and_Procurements_Crawled_from_the_public_API/11888916/1
Description:
This is a crawl of the Web API provided by the Brazilian Federal Government for open budget data under the transparency portal (http://www.portaltransparencia.gov.br).
The crawl includes three datasets: procurements (licitações), contracts (contratos), and government organizations (Organizações SIAFI). An introduction to the APIs is provided at http://www.portaltransparencia.gov.br/api-de-dados, and Swagger documentation is available at http://www.transparencia.gov.br/swagger-ui.html.
Two additional undocumented APIs were also crawled:
- (/criterios/contratos/fornecedor/autocomplete) Surrogate IDs from CNPJ (a fiscal organization identifier);
- (/pessoa-juridica/{id}/participante-licitacao/resultado) Participation of contractors in procurements (contractors are identified by their surrogate ID, not their CNPJ).
These undocumented APIs were only crawled for contractors that had contracts with organization number 26246 (Federal University of Santa Catarina).

The crawl includes data up to January 31st, 2020. The aforementioned datasets are updated monthly.

The software used to perform this crawl can be found at https://bitbucket.org/alexishuf/compsac-2020-experiments. The crawl-all.sh script performs the full crawl (this requires 4 hours or more). More details of the crawling procedures can be found in the EXPERIMENTS.md file.
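The two undocumented endpoints form a two-step lookup: a CNPJ is first resolved to a surrogate ID, and that ID is then substituted into the `{id}` segment of the participation path. The following is a minimal Python sketch of how such URLs could be assembled. It is not taken from the crawl software. The assumption that both paths are served under the portal host, the query parameter name `q`, and the helper names `autocomplete_url` and `participation_url` are all hypothetical.

```python
from urllib.parse import urlencode

# Portal address from the description; serving the undocumented
# paths under this host is an assumption.
BASE_URL = "http://www.portaltransparencia.gov.br"

# Endpoint paths as listed in the description.
AUTOCOMPLETE_PATH = "/criterios/contratos/fornecedor/autocomplete"
PARTICIPATION_PATH = "/pessoa-juridica/{id}/participante-licitacao/resultado"


def autocomplete_url(cnpj: str, param: str = "q") -> str:
    """Build a URL for resolving a CNPJ to a surrogate ID.

    The parameter name ("q") is a hypothetical placeholder; the real
    name is not stated in the description.
    """
    return BASE_URL + AUTOCOMPLETE_PATH + "?" + urlencode({param: cnpj})


def participation_url(surrogate_id: int) -> str:
    """Build a URL for a contractor's procurement participation.

    The contractor is identified by its surrogate ID, not its CNPJ.
    """
    return BASE_URL + PARTICIPATION_PATH.format(id=surrogate_id)
```

The functions only build strings and perform no network requests, so the actual parameter names and response formats still need to be checked against the EXPERIMENTS.md file in the repository.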
Provider:
figshare
Created:
2020-04-05



