aatrocy/fortune500-esg-metrics-2021-2023

Source: Hugging Face. Updated 2025-12-08. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/aatrocy/fortune500-esg-metrics-2021-2023
Resource description:
---
license: cc-by-4.0
task_categories:
- tabular-classification
- tabular-regression
- time-series-forecasting
language:
- en
tags:
- esg
- sustainability
- climate
- finance
- corporate-governance
- environmental
- social-responsibility
- fortune-500
- carbon-emissions
- renewable-energy
pretty_name: Fortune 500 ESG Metrics Dataset (2021-2023)
size_categories:
- 1M<n<10M
dataset_info:
  features:
  - name: name
    dtype: string
  - name: year
    dtype: int64
  - name: metric_name
    dtype: string
  - name: value
    dtype: string
  - name: units
    dtype: string
  - name: additional_notes
    dtype: string
  splits:
  - name: train
    num_bytes: 1210000000
    num_examples: 500000
  download_size: 1130000000
  dataset_size: 1210000000
configs:
- config_name: default
  data_files:
  - split: train
    path: Fortune500_ESG_Metrics_2021-2023.csv
---

# Fortune 500 ESG Metrics Dataset (2021-2023)

<div align="center">

![ESG](https://img.shields.io/badge/ESG-Environmental%20Social%20Governance-green)
![Companies](https://img.shields.io/badge/Companies-500%2B-blue)
![Years](https://img.shields.io/badge/Years-2021--2023-orange)
![License](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey)

</div>

## 🌍 Dataset Description

This comprehensive dataset contains Environmental, Social, and Governance (ESG) metrics from Fortune 500 companies spanning 2021-2023. It represents one of the most extensive collections of corporate sustainability data publicly available, compiled from official corporate reports, sustainability disclosures, and ESG filings.

### 🎯 Key Features

- **📊 Extensive Coverage**: Fortune 500 companies
- **📅 Multi-Year Data**: Complete data for 2021, 2022, and 2023
- **🔍 Detailed Metrics**: Hundreds of ESG indicators per company
- **📏 Standardized Format**: Consistent structure across all companies
- **📝 Rich Metadata**: Includes units and additional notes for context

## 📁 Dataset Structure

### Schema

| Column | Type | Description |
|--------|------|-------------|
| `name` | string | The specific metric or indicator name as reported |
| `year` | integer | Reporting year (2021, 2022, or 2023) |
| `metric_name` | string | Standardized metric identifier for cross-company comparison |
| `value` | string | The reported value (numeric or categorical) |
| `units` | string | Unit of measurement (e.g., MWh, tCO2e, %, count) |
| `additional_notes` | string | Additional context, methodology notes, or clarifications |

### 📊 Data Sample

```json
{
  "name": "Total Energy Consumption",
  "year": 2021,
  "metric_name": "energy_consumption_total",
  "value": "1234567",
  "units": "MWh",
  "additional_notes": "Includes all global facilities"
}
```

## 🏢 Companies Included

The dataset covers major corporations across various industries:

### Technology

- Apple, Microsoft, Google, Amazon, Meta, IBM, Oracle, Salesforce

### Financial Services

- JPMorgan Chase, Bank of America, Wells Fargo, Goldman Sachs, Morgan Stanley

### Healthcare & Pharmaceuticals

- Johnson & Johnson, Pfizer, Abbott Laboratories, Merck, CVS Health

### Consumer Goods

- Walmart, Target, Procter & Gamble, Coca-Cola, PepsiCo

### Energy & Utilities

- ExxonMobil, Chevron, NextEra Energy, Duke Energy

### Manufacturing & Industrial

- General Electric, Boeing, Caterpillar, 3M, Honeywell

### And 450+ more Fortune 500 companies...

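Because the schema above stores `value` as a string that may hold either numeric or categorical content, it can help to separate the two before any aggregation. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `split_value_column`, the derived columns `value_num` and `is_numeric`, and the handling of thousands separators are all assumptions made for illustration, and the toy rows are invented apart from the first one, which mirrors the sample record.

```python
import pandas as pd


def split_value_column(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with `value_num` (NaN when not numeric) and `is_numeric`."""
    out = df.copy()
    # Assumption: numbers may be written with thousands separators such as "1,250.5"
    cleaned = out["value"].astype(str).str.replace(",", "", regex=False).str.strip()
    # Entries that cannot be parsed as numbers (e.g. "Yes") become NaN
    out["value_num"] = pd.to_numeric(cleaned, errors="coerce")
    out["is_numeric"] = out["value_num"].notna()
    return out


# Toy rows for illustration only; the first mirrors the sample record above,
# the other two are hypothetical and not taken from the dataset.
toy = pd.DataFrame(
    {
        "name": ["Total Energy Consumption", "Hypothetical Metric A", "Hypothetical Metric B"],
        "year": [2021, 2021, 2021],
        "metric_name": ["energy_consumption_total", "hypothetical_a", "hypothetical_b"],
        "value": ["1234567", "1,250.5", "Yes"],
        "units": ["MWh", "MWh", ""],
        "additional_notes": ["Includes all global facilities", "", ""],
    }
)

result = split_value_column(toy)
print(result[["metric_name", "value", "value_num", "is_numeric"]])
```

Keeping the original `value` column untouched means categorical entries remain available for separate analysis, while `value_num` can be used for means, pivots, and year-over-year comparisons.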
## 📈 Metrics Categories

### 🌱 Environmental Metrics

- **Energy**: Consumption, renewable energy usage, energy intensity
- **Emissions**: Scope 1, 2, and 3 GHG emissions, emission reduction targets
- **Water**: Usage, recycling, conservation efforts
- **Waste**: Generation, recycling rates, hazardous waste management
- **Biodiversity**: Land use, conservation initiatives

### 👥 Social Metrics

- **Workforce**: Diversity statistics, employee turnover, training hours
- **Safety**: Injury rates, safety incidents, health programs
- **Community**: Investment, volunteer hours, local hiring
- **Supply Chain**: Supplier diversity, audits, labor practices

### 🏛️ Governance Metrics

- **Board**: Composition, diversity, independence
- **Ethics**: Code of conduct violations, whistleblower reports
- **Risk Management**: ESG risk assessment, climate risk disclosure
- **Transparency**: Reporting standards, external verification

## 🚀 Usage Examples

### Loading the Dataset

```python
import pandas as pd
from datasets import load_dataset

# Method 1: Using Hugging Face datasets library
dataset = load_dataset("GemiAI2025/fortune500-esg-metrics-2021-2023")
df = dataset['train'].to_pandas()

# Method 2: Direct download (after saving the CSV locally)
df = pd.read_csv("Fortune500_ESG_Metrics_2021-2023.csv")
```

### Basic Analysis

```python
# `value` is stored as a string; convert it so numeric aggregation works
# (non-numeric entries become NaN)
df['value'] = pd.to_numeric(df['value'], errors='coerce')

# View companies in dataset
# Assumes `name` values look like "<company>_<year>"; check this against your copy
df['company'] = df['name'].str.extract(r'(.+?)_\d{4}')[0]
companies = df['company'].dropna().unique()
print(f"Total companies: {len(companies)}")

# Analyze emissions data
emissions_data = df[df['metric_name'].str.contains('emission', case=False, na=False)]
avg_emissions = emissions_data.groupby('year')['value'].mean()

# Track renewable energy adoption
renewable_energy = df[df['metric_name'].str.contains('renewable', case=False, na=False)]
renewable_trend = renewable_energy.groupby('year')['value'].mean()
```

### Machine Learning Applications

```python
# Prepare data for ESG score prediction
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Feature engineering for ML models
# Requires the `company` column and numeric `value` created in "Basic Analysis"
pivot_data = df.pivot_table(
    index=['company', 'year'],
    columns='metric_name',
    values='value',
    aggfunc='mean'
)

# Pick one metric as the prediction target; this column is only a placeholder
target_col = 'energy_consumption_total'
data = pivot_data.dropna(subset=[target_col])
features = data.drop(columns=[target_col]).fillna(0)  # simple imputation choice
targets = data[target_col]

# Use for sustainability prediction models
X_train, X_test, y_train, y_test = train_test_split(
    features, targets, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```

## 🎯 Use Cases

### 📊 Research & Analysis

- Academic research on corporate sustainability
- ESG performance benchmarking
- Sector-specific sustainability analysis
- Time-series analysis of ESG improvements

### 🤖 Machine Learning

- ESG score prediction models
- Sustainability risk assessment
- Anomaly detection in reporting
- Predictive analytics for future targets

### 💼 Business Applications

- Investment screening and due diligence
- Competitive analysis
- Supply chain sustainability assessment
- Regulatory compliance monitoring

### 📚 Educational

- Case studies for business schools
- Data science projects
- Sustainability course materials
- Research datasets for thesis work

## 📋 Data Collection Methodology

1. **Source Documents**: Data extracted from:
   - Annual Sustainability Reports
   - CDP (Carbon Disclosure Project) submissions
   - GRI (Global Reporting Initiative) reports
   - SEC ESG disclosures
   - Corporate integrated reports
2. **Standardization Process**:
   - Metric names standardized across companies
   - Units converted to common standards where possible
   - Temporal alignment for year-over-year comparison
3. **Quality Assurance**:
   - Cross-validation with multiple sources
   - Outlier detection and verification
   - Completeness checks

## ⚠️ Important Considerations

### Data Limitations

- **Reporting Standards**: Companies may use different methodologies
- **Coverage Gaps**: Not all companies report all metrics
- **Temporal Differences**: Fiscal years may vary between companies
- **Voluntary Disclosure**: Some metrics are not mandatory

### Recommended Preprocessing

```python
# Handle missing values appropriately (non-numeric entries become NaN)
df['value'] = pd.to_numeric(df['value'], errors='coerce')

# Standardize company names (assumes `name` values look like "<company>_<year>")
df['company'] = df['name'].str.extract(r'(.+?)_\d{4}')[0]

# Create year-over-year change metrics; sort first so changes follow year order
df = df.sort_values(['company', 'metric_name', 'year'])
df['yoy_change'] = df.groupby(['company', 'metric_name'])['value'].pct_change()
```

## 📖 Citation

If you use this dataset in your research or applications, please cite:

```bibtex
@dataset{fortune500_esg_metrics_2023,
  title = {Fortune 500 ESG Metrics Dataset (2021-2023)},
  author = {GemiAI2025},
  year = {2023},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/GemiAI2025/fortune500-esg-metrics-2021-2023}
}
```

## 📜 License

This dataset is released under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).
You are free to:

- **Share**: Copy and redistribute the material in any medium or format
- **Adapt**: Remix, transform, and build upon the material for any purpose, even commercially

## 🤝 Contributing

We welcome contributions to improve and expand this dataset:

- Report issues or inconsistencies
- Suggest additional metrics or companies
- Share derivative datasets or analyses

## 📞 Contact

- **Dataset Curator**: GemiAI2025
- **Hugging Face Profile**: [@GemiAI2025](https://huggingface.co/GemiAI2025)
- **Issues**: Please use the [discussion tab](https://huggingface.co/datasets/GemiAI2025/fortune500-esg-metrics-2021-2023/discussions)

## 🙏 Acknowledgments

This dataset compilation was made possible through the transparency efforts of Fortune 500 companies and their commitment to ESG disclosure. Special thanks to the open data community for inspiration and support.

---

<div align="center">

Made with 💚 for the sustainability and data science community

</div>
Provider:
aatrocy