BreadboardLabs/CurioTreeData
Source: Hugging Face. Updated 2023-11-28; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/BreadboardLabs/CurioTreeData
Resource description:
---
license: cc-by-nc-4.0
tags:
- climate
- trees
- images
size_categories:
- 1M<n<10M
---
# The Curio Tree Dataset
This dataset contains much of the tree inventory, image and story data that was collected on the [Curio platform](https://www.youtube.com/@curio-xyz7991/videos) before it was sunset. The data was extracted from a number of database tables and includes:
- The inventory details of 2.5 million trees from locations across the globe (location, species, diameter at breast height (DBH), height, vitality etc., where available)
- 27,288 images of trees that were uploaded onto the platform by our community and linked to individual trees and their species information etc.
- Notes (stories), tags and conversations linked to trees.
### Dataset Description
Curio was an environmental education and outreach platform that was predominantly focused on urban forestry. It connected the various stakeholders involved in the management of urban forestry with the public and, importantly, made all data uploaded via its web and mobile apps publicly available. The platform was live from March 2016 until August 2023, when the maintenance overheads made its ongoing availability infeasible. Curio was supported in its early stages by two European Space Agency projects, [New Commons](https://business.esa.int/projects/new-commons) and [Curio Canopy](https://business.esa.int/projects/curio-canopy). A sense of the platform and how it worked can be found via the videos on its supporting [YouTube channel](https://www.youtube.com/@curio-xyz7991/videos).
This repository contains much of the tree inventory, images and stories data that was collected on the platform via our community, projects we helped support and open data tree inventories we uploaded onto the platform. We are keen to make this data available for research purposes in the hope it might be of benefit to others and to further the efforts of our community.
We have endeavoured to name as many as possible of the great projects and data sources that were hosted on the Curio platform in the attribution section below. If there are any omissions or errors, please contact us.
A related project involved generating a high resolution map of tree canopy cover for the Greater London Authority. Details of that project and dataset can be found on the [London Datastore Curio Canopy page](https://data.london.gov.uk/dataset/curio-canopy).
- **Curated by:** Breadboard Labs
- **License:** cc-by-nc-4.0
### Dataset Sources and Attribution
Many people picked up the app and contributed to the data that was collected. Curio was also used to support many great projects and initiatives. We have endeavoured to mention many of those projects below along with the open data tree inventories we uploaded onto the platform.
#### Collaborative projects supported by Curio
- [Morton Arboretum](https://mortonarb.org/) - [Chicago Regional Tree Initiative](https://chicagorti.org/programs/)
- [Dublin City Council’s Parks, Biodiversity and Landscape Services](https://www.dublincity.ie/residential/parks) & [School of Geography at University College Dublin](https://www.ucd.ie/geography) - [Tree Mapping Dublin](https://mappinggreendublin.com/)
- [Sacramento Tree Foundation](https://sactree.org/) - [Save the Elms Program](https://sactree.org/programs/monitoring-elms/)
- [Cambridge City Council](https://www.cambridge.gov.uk/) - [Cambridge City Canopy Programme](https://www.cambridge.gov.uk/cambridge-canopy-project)
- [Municipality of Oslo Agency for Urban Environment](https://www.visitoslo.com/en/product/?tlp=593685) - Inventory and ecosystem services report hosting
- [Friends of Brunswick Park](http://www.friendsofbrunswickpark.co.uk/)
- [Exeter Trees](https://www.exetertrees.uk)
- [Wembley Park Limited](https://wembleypark.com/)
- [Washington Square Park Eco Projects](https://www.wspecoprojects.org/)
- [Coláiste Bríde Enniscorthy](https://www.colaistebride.ie/)
- [Enniscorthy Vocational College](https://www.enniscorthycc.ie/)
- [Mountshannon Arboretum](https://www.mountshannonarboretum.com/) - Forester Bernard Carey initiated the Mountshannon i-Tree project, in conjunction with UCD and UK-based consultancy Treeconomics.
- [Sidmouth Arboretum](http://sidmoutharboretum.org.uk/)
- [East Devon District Council](https://eastdevon.gov.uk/)
- [SLU](https://www.slu.se/en/) - Alnarp - Skåne Tree Inventory and support for and involvement in the New Commons and Curio Canopy projects
- [Malmö Stad](https://malmo.se/) - Malmö Tree Inventory and support for and involvement in the New Commons and Curio Canopy projects
- [Göteborgs Stad](https://goteborg.se/)
- [Halmstad](https://www.halmstad.se/)
- [Hvilan](https://www.hvilanutbildning.se/)
- [Familjebostader](https://familjebostader.com/om-oss/)
#### Open Data Sources Attribution
- The Greater London Authority Datastore - [Local Authority Maintained Trees](https://data.london.gov.uk/dataset/local-authority-maintained-trees)
- NYC OpenData - [2015 Street Tree Census - Tree Data](https://data.cityofnewyork.us/Environment/2015-Street-Tree-Census-Tree-Data/uvpi-gqnh)
- Open Data BDN - [Street trees of the city of Barcelona](https://opendata-ajuntament.barcelona.cat/data/dataset/arbrat-viari)
- Open Data Bristol - [Trees](https://opendata.bristol.gov.uk/datasets/7a99218a4bf347ff948f0e5882406a8c)
- Open Data NI - [Belfast City Trees](https://admin.opendatani.gov.uk/dataset/belfast-trees)
- Denver Open data - [Tree Inventory](https://denvergov.org/opendata/dataset/city-and-county-of-denver-tree-inventory)
- Open Data DK - [City of Copenhagen Trees](https://www.opendata.dk/city-of-copenhagen/trae-basis-kommunale-traeer)
- Palo Alto Open Data - [Palo Alto Trees](https://data.cityofpaloalto.org/dataviews/73226/palo-alto-trees/)
- Fingal County Council Open Data - [Fingal County Council Trees](https://data.fingal.ie/maps/1e5f9db62e53443d946c15a1a06fd98b_0/explore)
- Data SA - [City of Adelaide Street Trees](https://data.sa.gov.au/data/dataset/street-trees)
- Open Data Boulder Colorado - [Tree Inventory Open Data](https://open-data.bouldercolorado.gov/datasets/dbbae8bdb0a44d17934243b88e85ef2b)
- Biodiversity Ireland - [Heritage Trees Ireland](https://maps.biodiversityireland.ie/Dataset/27)
- Birmingham City Council Trees
## Uses
<!-- Address questions around how the dataset is intended to be used. -->
The data is free to be used for research purposes, subject to the cc-by-nc-4.0 licence and suitable attribution. Please see the citation section below.
Some potential uses might include:
- Investigations into urban tree biodiversity.
- The development of algorithms for extracting tree attributes via photos or streetview imagery.
- A tree species detection app.
- The detection of trees via satellite imagery.
- Species identification via hyperspectral imagery of trees.
It is worth noting that for most use cases, cleaning, analysis and processing of the data will be necessary. The completeness of the tree inventory data varies greatly, and users were not directed in any way in terms of how to frame the photos they took and uploaded via the Curio app.
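Since completeness varies so much, a useful first step is often to measure how many records actually carry each attribute. The following is a minimal sketch of such an audit using only the Python standard library. The inline sample and its column names (`dbh`, `height`, `vitality`) are assumptions made for illustration; only `tree_species_id` is named in this card, so check the actual TaggedTrees file for the real headers.

```python
import csv
import io

# Hypothetical sample rows; the real TaggedTrees column names may differ.
SAMPLE = """id,latitude,longitude,tree_species_id,dbh,height,vitality
1,53.3498,-6.2603,10,35,12.5,good
2,51.5072,-0.1276,,,,
3,40.7128,-74.0060,11,52,,
4,59.9139,10.7522,,18,6.0,fair
"""


def completeness(rows, fields):
    """Return the fraction of rows with a non-empty value for each field."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if (row.get(field) or "").strip()) / total
        for field in fields
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = completeness(rows, ["tree_species_id", "dbh", "height", "vitality"])
for field, share in report.items():
    print(f"{field}: {share:.0%} populated")
```

Running the same function per source inventory, rather than over the whole table, would show which inventories are usable for a given task.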
## Dataset Structure
<!-- This section provides a description of the dataset fields, and additional information about the dataset structure such as criteria used to create the splits, relationships between data points, etc. -->
### TaggedTrees
Number of data points: 2,593,139
The details of an individual tree, including its location, species, diameter at breast height (DBH), vitality etc., when available.
### Images
Number of data points: 27,288
The details of images that were uploaded to the platform. Each record gives the path to the uploaded image file, which can be found in the uploads directory. Details of what the image was attached to are also included; this was usually a "Story" that was in turn attached to a tree.
### Uploads:
The set of images referenced in the images data file. The set was quite large even when zipped, so it was broken up into 10 GB chunks. Download each of the chunks and then run unzip on the uploads.zip file.
A folder containing downsized versions of the images, based on a fixed width, has also been included: `resized-uploads-width1200.zip`.
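For readers who prefer to inspect an archive from Python before extracting it, here is a minimal sketch using the standard `zipfile` module. It assumes the archive is a single, complete zip file. Python's `zipfile` does not handle multi-disk archives, so the chunked `uploads.zip` may first need to be handled with the command-line tools as described above. The helper names and the `resized-uploads` output directory are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath


def count_by_extension(archive_path):
    """Count archive members by lower-cased file extension, skipping directories."""
    with zipfile.ZipFile(archive_path) as archive:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in archive.infolist()
            if not info.is_dir()
        )


def extract_all(archive_path, destination):
    """Extract every member of the archive into destination and return that path."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(destination)
    return destination


if __name__ == "__main__":
    # File name taken from this card; the output directory is an assumption.
    archive = Path("resized-uploads-width1200.zip")
    if archive.exists():
        print(count_by_extension(archive))
        extract_all(archive, "resized-uploads")
```

Counting members by extension first gives a quick check that the number of image files is in line with the 27,288 image records before committing disk space to extraction.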
### Stories:
The details of a story that was attached to a tree.
### Notes:
The text included in a story/note about a tree.
### Conversations & Comments:
Comments grouped by conversations linked to a particular Story
### TreeSpecies
The tree species dictionary we built to support the platform. Each TaggedTree has a tree_species_id that references an entry in this dictionary when populated.
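The link between TaggedTrees and TreeSpecies can be resolved with a simple lookup. The sketch below uses only the standard library and inline sample data. Apart from `tree_species_id`, every column name here (`id`, `scientific_name`, and so on) is an assumption for illustration, as is the CSV format.

```python
import csv
import io

# Hypothetical miniature extracts; real files and columns may differ.
TAGGED_TREES_CSV = """id,latitude,longitude,tree_species_id
1,53.3498,-6.2603,10
2,51.5072,-0.1276,
3,40.7128,-74.0060,11
"""
TREE_SPECIES_CSV = """id,scientific_name
10,Tilia cordata
11,Platanus x hispanica
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_species(trees, species):
    """Add a scientific_name to each tree, or None when no species is set."""
    by_id = {row["id"]: row for row in species}
    joined = []
    for tree in trees:
        # An empty or unknown tree_species_id yields no match.
        match = by_id.get(tree["tree_species_id"])
        name = match["scientific_name"] if match else None
        joined.append({**tree, "scientific_name": name})
    return joined


trees = attach_species(load_rows(TAGGED_TREES_CSV), load_rows(TREE_SPECIES_CSV))
for tree in trees:
    print(tree["id"], tree["scientific_name"])
```

Keeping unmatched trees with a `None` species, rather than dropping them, preserves the location data for records where the species was never recorded.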
### TreeSpeciesAliases
The local names, across multiple languages, that can be used to describe a species of tree contained in the TreeSpecies dictionary.
### Tags and Taggings
Trees could be tagged with details such as diseased, monitored, newly_planted, apples, overhead cables etc. Anything at all, really, that could later be used to filter, group or identify trees of interest, as well as describe their state.
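Filtering trees by tag amounts to following the links from a tag to the trees that carry it. The sketch below assumes that Taggings is a link table pairing a tag with a tree. That structure, and the field names `tag_id`, `tagged_tree_id` and `name`, are assumptions not confirmed by this card.

```python
from collections import defaultdict

# Hypothetical miniature records; the real column names are assumptions.
tags = [
    {"id": 1, "name": "diseased"},
    {"id": 2, "name": "newly_planted"},
]
taggings = [
    {"tag_id": 1, "tagged_tree_id": 101},
    {"tag_id": 2, "tagged_tree_id": 101},
    {"tag_id": 2, "tagged_tree_id": 102},
]


def trees_by_tag(tags, taggings):
    """Map each tag name to the sorted list of tree ids carrying that tag."""
    names = {tag["id"]: tag["name"] for tag in tags}
    grouped = defaultdict(set)
    for link in taggings:
        name = names.get(link["tag_id"])
        if name is not None:  # ignore links to tags missing from Tags
            grouped[name].add(link["tagged_tree_id"])
    return {name: sorted(ids) for name, ids in grouped.items()}


index = trees_by_tag(tags, taggings)
print(index.get("newly_planted", []))
```

Because tags were free-form, normalising case and spelling before grouping is likely to be worthwhile on the real data.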
## Dataset Creation
### Curation Rationale
<!-- Motivation for the creation of this dataset. -->
The goal of the Curio platform was to educate, engage and democratise access to environmental information. Making the data collected on the platform available in this form is seen as an extension of that mission.
#### Data Collection and Processing
<!-- This section describes the data collection and processing process such as data selection criteria, filtering and normalization methods, tools and libraries used, etc. -->
All data was collected via the Curio app by its community. Where inventory data was uploaded in bulk, we preprocessed the data to ensure details such as species information were mapped to the species dictionary we defined, which has been included in this release.
Before making the data available on this platform we decided to run face detection and blur any obvious, detectable faces found in the images that have been included.
<!-- #### Who are the source data producers? -->
<!-- This section describes the people or systems who originally created the data. It should also include self-reported demographic or identity information for the source data creators if this information is available. -->
<!-- #### Personal and Sensitive Information -->
<!-- State whether the dataset contains data that might be considered personal, sensitive, or private (e.g., data that reveals addresses, uniquely identifiable names or aliases, racial or ethnic origins, sexual orientations, religious beliefs, political opinions, financial or health data, etc.). If efforts were made to anonymize the data, describe the anonymization process. -->
<!-- ## Bias, Risks, and Limitations -->
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
## Citation
<!-- If there is a paper or blog post introducing the dataset, the APA and Bibtex information for that should go in this section. -->
@misc{CurioTreeData,
title = {The Curio Tree Dataset},
author = {Conor Nugent and Paul Hickey},
year = {2023},
publisher = {HuggingFace},
journal = {HuggingFace repository},
howpublished = {\url{https://huggingface.co/datasets/BreadboardLabs/CurioTreeData}},
}
## Dataset Card Authors
Conor Nugent and Paul Hickey
## Dataset Card Contact
[Conor Nugent](https://www.linkedin.com/in/conor-nugent-5b02458/?originalSubdomain=ie)
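Provider:
BreadboardLabs
Summary of original information
The Curio Tree Dataset
Dataset overview
This dataset contains the tree inventory, image and story data collected on the Curio platform before it was shut down. The data includes:
- Inventory details of 2.5 million trees from around the world (location, species, diameter at breast height (DBH), height, vitality etc., where available)
- 27,288 tree images uploaded to the platform by the community, linked to individual trees and their species information
- Notes (stories), tags and conversations linked to trees
Dataset description
Curio was an environmental education and outreach platform focused on urban forestry. It connected the stakeholders involved in urban forestry management with the public and made all data uploaded via its web and mobile apps publicly available. The platform ran from March 2016 to August 2023.
Dataset contents
- TaggedTrees: 2,593,139 records with the details of individual trees, such as location, species, diameter at breast height (DBH) and vitality (where available)
- Images: 27,288 records with the details of images uploaded to the platform, including the upload path and the associated story
- Uploads: the set of images referenced in the images data file, split into 10 GB zipped chunks
- Stories: the details of stories attached to trees
- Notes: the text of a story/note about a tree
- Conversations & Comments: comments grouped by conversation and linked to a particular story
- TreeSpecies: the tree species dictionary that supported the platform; each TaggedTree has a tree_species_id referencing an entry in this dictionary (when populated)
- TreeSpeciesAliases: local names, in multiple languages, for the species in the TreeSpecies dictionary
- Tags and Taggings: details that trees could be tagged with, such as diseased, monitored, newly planted, apples, overhead cables etc.
Dataset creation
Data collection and processing
All data was collected by the community via the Curio app. Inventory data uploaded in bulk was preprocessed to ensure species information was mapped to the defined species dictionary, and face detection and blurring were applied to the images before release.
Licence
The data is free to use for research purposes, subject to the cc-by-nc-4.0 licence and suitable attribution.
Citation
@misc{CurioTreeData, title = {The Curio Tree Dataset}, author = {Conor Nugent and Paul Hickey}, year = {2023}, publisher = {HuggingFace}, journal = {HuggingFace repository}, howpublished = {\url{https://huggingface.co/datasets/BreadboardLabs/CurioTreeData}}, }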
The above content was collected and summarized by 遇见数据集.