Xcissa/climate-codex
Hugging Face · Updated 2024-11-14 · Indexed 2025-11-03
Download link:
https://hf-mirror.com/datasets/Xcissa/climate-codex
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
- feature-extraction
language:
- en
tags:
- climate-tech
- clean-tech
- sustainability
- fundraisers
- investment
- finance
- startups
- environmental-analysis
- economic-analysis
pretty_name: 'Climate Codex: Climate Technology Fundraisers Dataset'
size_categories:
- 1K<n<10K
---
## Climate Codex: Climate Technology Fundraisers Dataset
Climate Codex compiles climate technology fundraiser data from March 2020 to February 5, 2024. The data was collected from newsletters and blogs, including CTVC by Sightline Climate and Keep Cool; further sources such as Climatebase.org and Bloomberg Green may be added in the future. The dataset captures key investments, innovative startups, and significant fundraising events aimed at addressing pressing environmental challenges.
### Sources
- [CTVC by Sightline Climate](https://www.ctvc.co/tag/newsletter/)
- [Keep Cool](https://keepcool.co/)
- Potential future sources include:
  - [Climatebase.org](https://climatebase.org/stories)
  - [Bloomberg Green](https://www.bloomberg.com/green)
  - [4WARD.VC Climate Tech Newsletter Database](https://4ward.vc/newsletterdb/)
### Methodology
The dataset was prepared by scraping and cleaning data from the sources listed above, using tools such as Selenium and GPT-4 for data extraction and normalization. Currency values were converted to USD, dates were standardized, and locations were geocoded to enhance analysis. The final dataset includes columns for fundraising entity, amount raised, fundraising date, description, sector classification, location, and links to original announcements.
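The normalization steps described above (currency conversion and date standardization) can be illustrated with a minimal sketch. This is not the authors' pipeline: the exchange rates, the accepted amount suffixes, the list of input date formats, and the function names `to_usd` and `to_iso_date` are all assumptions made for illustration.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical exchange rates (USD per 1 unit of currency); not taken from the dataset.
USD_PER_UNIT = {"USD": Decimal("1"), "EUR": Decimal("1.08"), "GBP": Decimal("1.27")}

# Hypothetical date formats a newsletter entry might use.
DATE_FORMATS = ("%Y-%m-%d", "%B %d, %Y", "%d %b %Y")

# Suffixes assumed to abbreviate thousands, millions, and billions.
MULTIPLIERS = {"K": Decimal(10) ** 3, "M": Decimal(10) ** 6, "B": Decimal(10) ** 9}


def to_usd(amount: str, currency: str) -> Decimal:
    """Convert an amount string such as '2.5M' to whole US dollars."""
    text = amount.strip().upper().replace(",", "")
    factor = Decimal(1)
    if text and text[-1] in MULTIPLIERS:
        factor = MULTIPLIERS[text[-1]]
        text = text[:-1]
    value = Decimal(text) * factor * USD_PER_UNIT[currency]
    # Round to whole dollars; this rounding is one source of the
    # approximation noted under Limitations.
    return value.quantize(Decimal("1"))


def to_iso_date(raw: str) -> str:
    """Try each assumed format in turn and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")
```

Using `Decimal` rather than `float` keeps the conversion arithmetic exact until the final rounding step, which makes the rounding behavior explicit.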
### Data Fields
- **Fundraising Entity:** The organization or company raising funds.
- **Amount Raised:** Amount of funds raised, normalized to USD.
- **Date of Funding Reported:** Date when the funding event was published or reported.
- **Description:** Brief description of the fundraiser or the organization.
- **Sector:** Climate technology sector (e.g., Built Environment, Carbon Technology, Energy).
- **Location:** Headquarters or main address of the fundraising entity.
- **Telephone:** Contact number, if available.
- **Links:** Links to the announcement and source newsletters.
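As a sketch of how the fields listed above might map onto a typed record, the following defines a hypothetical `Fundraiser` dataclass and a `from_row` helper. It assumes that the column headers match the field labels exactly, that dates are ISO 8601 strings, that amounts are plain numbers, and that multiple links are separated by whitespace; none of these is confirmed by the dataset card.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fundraiser:
    """One fundraising event; attribute names are illustrative."""
    entity: str
    amount_usd: Optional[float]   # None when no amount was reported
    date_reported: date
    description: str
    sector: str
    location: str
    telephone: Optional[str]      # None when no number is available
    links: tuple[str, ...]


def from_row(row: dict[str, str]) -> Fundraiser:
    """Build a Fundraiser from a row keyed by the assumed column headers."""
    amount = row.get("Amount Raised", "").strip()
    phone = row.get("Telephone", "").strip()
    return Fundraiser(
        entity=row["Fundraising Entity"].strip(),
        amount_usd=float(amount) if amount else None,
        date_reported=date.fromisoformat(row["Date of Funding Reported"].strip()),
        description=row.get("Description", "").strip(),
        sector=row.get("Sector", "").strip(),
        location=row.get("Location", "").strip(),
        telephone=phone or None,
        links=tuple(row.get("Links", "").split()),
    )
```

Optional fields are modeled as `None` rather than empty strings so that missing amounts or phone numbers are not mistaken for real values during analysis.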
### Usage
This dataset is suited to analysis of climate technology investment, environmental sustainability, startup funding, and the economics of green technologies.
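A typical analysis of this kind is totaling the amount raised per sector and year. The sketch below runs on a small inline CSV whose rows are placeholders invented for illustration, not records from the dataset; the column headers are assumed to match the field labels under Data Fields, and `total_by_sector_and_year` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; these are not real records.
SAMPLE_CSV = """Fundraising Entity,Amount Raised,Date of Funding Reported,Sector
Example Co A,1000000,2021-04-12,Energy
Example Co B,2500000,2021-09-30,Energy
Example Co C,,2022-01-15,Carbon Technology
Example Co D,4000000,2022-06-01,Built Environment
"""


def total_by_sector_and_year(csv_text: str) -> dict[tuple[str, int], float]:
    """Sum USD amounts per (sector, year), skipping rows with no amount."""
    totals: dict[tuple[str, int], float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = row["Amount Raised"].strip()
        if not amount:
            continue  # undisclosed amounts are excluded, not counted as zero
        year = int(row["Date of Funding Reported"][:4])  # assumes ISO dates
        totals[(row["Sector"], year)] += float(amount)
    return dict(totals)


if __name__ == "__main__":
    for (sector, year), total in sorted(total_by_sector_and_year(SAMPLE_CSV).items()):
        print(f"{year} {sector}: ${total:,.0f}")
```

The same function could be applied to a local CSV export of the dataset by reading the file's text, provided the headers match the assumed names.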
### License
The dataset is available under the Apache License 2.0.
## Dataset Summary
The Climate Codex dataset is a snapshot of climate technology fundraisers, tracking investments in green technology sectors from March 2020 to February 5, 2024. It records each fundraising event, the amount raised, the organization involved, its geographic location, and the climate tech sector of the investment. By aggregating data from prominent climate-focused newsletters, the dataset provides insight into the evolving landscape of climate technology investment.
## Considerations for Using the Data
### Limitations
- Data sources are limited to specific newsletters, so the dataset may not represent the entire landscape of climate technology fundraisers.
- Currency conversions and geolocation data may not be fully accurate due to rounding and approximations.
### Ethical Considerations
- This dataset does not include personally identifiable information (PII) of individual investors or private entities but may contain company information publicly available online.
### Future Work
- Expanding the dataset to include more data sources and improving the granularity of sector categorization.
## How to Cite
If you use this dataset in your research, please cite it as follows:
Climate Codex: Climate Technology Fundraisers Dataset (2024). Compiled by Xcissa Innovations. Available at: https://huggingface.co/datasets/Xcissa/climate-codex
Provider:
Xcissa



