
U.S. Community Water Systems Service Boundaries, v2.0.0

DataONE | Updated 2022-07-05 | Indexed 2024-06-08
Download link:
https://search.dataone.org/view/sha256:e5524c93346cc0e3fe5489e2a471b9911abce0810b3ae654a98319087aa94df6
Description:
This is a layer of water service boundaries for 44,786 community water systems that deliver tap water to 307.1 million people in the US. The layer covers 91% of active community water systems and 97% of the population they reportedly serve. It is based on multiple data sources and a methodology developed by SimpleLab and collaborators called the Tiered, Explicit, Match, and Model approach, or TEMM for short. The name reflects how the nationwide data layer was developed.

TEMM is composed of three hierarchical tiers, arranged by data and model fidelity. First, we use explicit water service boundaries provided by states. These are spatial polygon data, typically provided at the state level. We call systems with explicit boundaries Tier 1. In the absence of explicit water service boundary data, we use a matching algorithm to match water systems to the boundary of a town or city (Census Place TIGER polygons). When a water system and a TIGER place match one-to-one, we label this Tier 2a. When multiple water systems match the same TIGER place, we label this Tier 2b. In v1.0.0, Tier 2b reflects overlapping boundaries for multiple systems. In v2.0.0, Tier 2b is removed through a "best match" algorithm that assigns one water system to one TIGER place. Finally, in the absence of an explicit water service boundary (Tier 1) or a TIGER place polygon match (Tier 2a), a statistical model trained on explicit water service boundary data (Tier 1) is used to estimate a reasonable radius around the provided water system centroid, and a circular water system boundary is modeled (Tier 3).

Several limitations to this data exist, and the layer should be used with these in mind. The assignment of a Census Place TIGER polygon to the "best match" water system in v2.0.0 requires further validation. Many systems that were not selected as the best match were consequently assigned to Tier 3.
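The tier hierarchy described above can be sketched in a few lines of Python. This is a minimal illustration, not the SimpleLab implementation: the `WaterSystem` fields, the `assign_tier` and `best_match` functions, and the numeric match score are all hypothetical names introduced here, and the description does not say how the v2.0.0 "best match" is actually scored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class WaterSystem:
    # All field names are hypothetical, chosen for illustration only.
    pwsid: str                     # water system identifier
    has_explicit_boundary: bool    # state-provided polygon is available
    matched_place: Optional[str]   # TIGER place id kept after best match, else None
    has_centroid: bool             # a lat/long centroid is available


def assign_tier(system: WaterSystem) -> Optional[str]:
    """Walk the tiers from highest to lowest fidelity and return the first that applies."""
    if system.has_explicit_boundary:
        return "Tier 1"
    if system.matched_place is not None:
        return "Tier 2a"
    if system.has_centroid:
        return "Tier 3"
    return None  # no boundary can be produced; the system is missing from the layer


def best_match(candidates: List[Tuple[str, str, float]]) -> Dict[str, str]:
    """Keep one water system per TIGER place.

    candidates holds (pwsid, place_id, score) tuples. The score is a
    placeholder assumption; the highest score per place wins.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for pwsid, place_id, score in candidates:
        if place_id not in best or score > best[place_id][1]:
            best[place_id] = (pwsid, score)
    return {place_id: pwsid for place_id, (pwsid, _) in best.items()}
```

Under this sketch, a system that loses the best match has `matched_place` set to `None` and falls through to Tier 3 if it has a centroid, which mirrors the v2.0.0 behavior described above.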
Tier 3 boundaries have modeled radii drawn around the lat/long centroid of a water system facility, but the underlying centroids are of variable quality. It is critical to evaluate the "geometry quality" column (included from the EPA ECHO data source) when looking at Tier 3 boundaries: fidelity is very low when the geometry quality is a county or state centroid, but we did not exclude these data from the layer. Future iterations plan to improve geometry quality for modeled systems. Missing water systems are typically those without a centroid, in a U.S. territory, or missing population and connection data. Finally, Tier 1 systems are assumed to be high fidelity, but rely on the accuracy of state data collection and maintenance. All data, methods, documentation, and contributions are open-source and available here: https://github.com/SimpleLab-Inc/wsb.
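As a rough illustration of the Tier 3 caveat, the sketch below builds a circular boundary around a centroid and flags records whose geometry quality indicates a county or state centroid. Everything named here is an assumption: the function names, the marker strings used to detect low-fidelity values, and the choice to treat a missing value as low fidelity. Check the actual column name and values in the layer before relying on it. The circle uses a flat-earth approximation that is only reasonable for small radii away from the poles.

```python
import math
from typing import List, Optional, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters

# Assumed substrings that mark a low-fidelity centroid; verify against the data.
LOW_FIDELITY_MARKERS = ("county centroid", "state centroid")


def is_low_fidelity(geometry_quality: Optional[str]) -> bool:
    """Return True when the geometry quality text suggests a county or state centroid."""
    if geometry_quality is None:
        return True  # conservative assumption: unknown quality is treated as low
    text = geometry_quality.strip().lower()
    return any(marker in text for marker in LOW_FIDELITY_MARKERS)


def circle_polygon(lat: float, lon: float, radius_m: float,
                   n_points: int = 32) -> List[Tuple[float, float]]:
    """Approximate a circle around (lat, lon) as a closed ring of (lon, lat) pairs."""
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)   # radius expressed in degrees of latitude
    dlon = dlat / math.cos(math.radians(lat))        # widen longitude step away from the equator
    ring = []
    for i in range(n_points):
        theta = 2 * math.pi * i / n_points
        ring.append((lon + dlon * math.cos(theta), lat + dlat * math.sin(theta)))
    ring.append(ring[0])  # close the ring
    return ring
```

A typical use would be to compute `is_low_fidelity` for every Tier 3 record and either exclude the flagged boundaries or report them separately, since a circle drawn around a county or state centroid says little about where the system actually delivers water.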
Created: 2023-12-30