CSR Report Sentences and Additional Datasets
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/kpd3d3gv45
Description:
Corporate Social Responsibility (CSR) Dataset: This dataset comprises 21,782 distinct sentences extracted from the sustainability reports of 100 publicly listed companies across 26 industry sectors. The sentences were collected from 2024 CSR disclosure documents. Each sentence carries metadata (source identifier, company information, industry sector, and page number) to support thematic pattern analysis of corporate sustainability communication strategies.
Yahoo Answers Dataset: This community-driven dataset contains question-answer pairs from two knowledge-intensive categories: "Science & Mathematics" and "Computers & Internet". The researchers extracted 2,000 balanced instances from the original dataset, which has 140,000 training and 6,000 testing samples per category. The two categories were chosen because they share overlapping vocabulary yet have distinct conceptual boundaries, a situation relevant to technical support and knowledge management systems.
IMDB Movie Review Dataset: This sentiment analysis dataset consists of 50,000 movie reviews split equally between positive and negative sentiment, from which 2,000 instances were extracted through balanced stratified sampling. The reviews are emotionally charged and colloquial, and they serve to validate opinion mining applications that must analyze subjective narrative text with complex emotional nuance and evaluative language.
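The balanced stratified sampling mentioned for the Yahoo Answers and IMDB subsets can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes "balanced" means an equal number of instances per class, and the function name `balanced_sample`, the record layout, the random seed, and the toy corpus are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(records, label_key, total, seed=0):
    """Draw `total` records with the same number taken from each label.

    Hypothetical helper: assumes "balanced" means equal counts per class.
    """
    # Group the records by their class label (the strata).
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    labels = sorted(by_label)
    if not labels:
        raise ValueError("no records to sample from")
    if total % len(labels) != 0:
        raise ValueError("total must divide evenly across the labels")
    per_label = total // len(labels)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for label in labels:
        pool = by_label[label]
        if len(pool) < per_label:
            raise ValueError(f"label {label!r} has only {len(pool)} records")
        # Sample without replacement inside each stratum.
        sample.extend(rng.sample(pool, per_label))

    rng.shuffle(sample)  # avoid leaving the output grouped by label
    return sample


# Toy stand-in for a 50,000-review corpus with two equally sized classes.
reviews = [
    {"text": f"review {i}", "label": "positive" if i % 2 == 0 else "negative"}
    for i in range(50_000)
]
subset = balanced_sample(reviews, "label", 2_000)
```

Sampling within each class separately, rather than from the pooled corpus, guarantees the class proportions of the subset exactly instead of only in expectation.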
Created:
2025-06-24



