A dataset to measure China biodiversity risk
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/ny5x3bkd56
Description:
Extinctions of biological populations are becoming more frequent and have important implications for related sectors. As a result, biodiversity-related risks have received increasing attention and are now regarded as a new class of risk factor. To understand the drivers of biodiversity risk, it is crucial to measure it at multiple levels, especially in developing countries. We use machine learning and text mining to measure the biodiversity risk of the Chinese market from 2000 to 2023 at three levels: macro (government), meso (industry), and micro (firm). The inputs are official media news texts, related fund holding data, and listed companies’ annual report texts, and the dataset provides one measure of biodiversity risk for each of the three levels. Unlike previous biodiversity risk measurements, it therefore reflects China's biodiversity risk from multiple perspectives. The data can also be grouped by categorical dimensions such as time, city, and industry, so it can be matched with the data used in most related studies. The macro-level data come from news published by Chinese mainstream media between 2013 and 2023; we apply a machine learning approach to text mining to obtain the biodiversity risk of 5,394 trading days. The meso-level data come from more than 40 funds related to conceptual themes such as ‘bioprotection’ listed between 2015 and 2023. The micro-level indicators are extracted from the annual reports of 5,606 firms listed on the Shanghai Stock Exchange, Shenzhen Stock Exchange, and Beijing Stock Exchange from 2000 to 2023.
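The description says the data can be grouped by dimensions such as time, city, and industry. A minimal sketch of that kind of grouping follows. The field names (`firm`, `year`, `industry`, `city`, `risk`), the sample values, and the helper `aggregate_risk` are all hypothetical and chosen only for illustration; the actual file layout and column names of the dataset are not stated here and may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical firm-year records. Field names and values are illustrative
# assumptions, not the dataset's real schema or data.
records = [
    {"firm": "A", "year": 2020, "industry": "Mining", "city": "Beijing", "risk": 0.25},
    {"firm": "B", "year": 2020, "industry": "Mining", "city": "Shanghai", "risk": 0.75},
    {"firm": "C", "year": 2020, "industry": "Agriculture", "city": "Beijing", "risk": 0.125},
    {"firm": "A", "year": 2021, "industry": "Mining", "city": "Beijing", "risk": 1.0},
]


def aggregate_risk(rows, keys):
    """Average the risk score over each combination of the given key fields."""
    groups = defaultdict(list)
    for row in rows:
        # Build the group label from the requested fields, e.g. ("Mining", 2020).
        groups[tuple(row[k] for k in keys)].append(row["risk"])
    return {group: mean(values) for group, values in groups.items()}


# Firm-level scores rolled up to industry-year and to city level.
industry_year = aggregate_risk(records, ["industry", "year"])
by_city = aggregate_risk(records, ["city"])
```

Aggregated keys such as (industry, year) could then serve as join keys when matching the measures to other data, assuming the other data carry the same identifiers.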
Created:
2024-12-04



