
Ci Technology DataSet

DataONE | Updated 2023-05-22 | Indexed 2024-06-08
Download link:
https://search.dataone.org/view/sha256:2291cc19ed4e348a344f58f656cf5b354bfd2b8a0a05d59b2799d9333ce795f4
Description:
Originally published by Harte-Hanks, the CiTDS dataset is now produced by Aberdeen Group, a subsidiary of Spiceworks Ziff Davis (SWZD). It is also referred to as CiTDB (Computer Intelligence Technology Database). CiTDS provides data on the digital investments of businesses across the globe. It includes two types of technology datasets: (i) hardware expenditures and (ii) product installs.
Hardware expenditure data is constructed through a combination of surveys and modeling. A survey is administered to a number of companies, and the survey data is used to develop a prediction model of expenditures as a function of firm characteristics. CiTDS uses this model to predict the expenditures of non-surveyed firms and reports them in the dataset. In contrast, CiTDS does not impute product install data, which comes entirely from web scraping and surveys. A confidence score from 1 to 3 is assigned to indicate how much the source of information can be trusted: 3 corresponds to a 90-100 percent install likelihood, 2 to 75-90 percent, and 1 to 65-75 percent.
CiTDS reports technology adoption at the site level, with each site carrying a unique DUNS identifier. One of these sites is identified as an “enterprise,” corresponding to the firm that owns the sites. Therefore, it is possible to analyze technology adoption at both the site (establishment) and enterprise (firm) levels.
CiTDS sources the site population from Dun and Bradstreet every year and drops sites that are not relevant to its clients. Due to this sample selection, the number of sites varies considerably from year to year: on average, 10-15 percent of sites enter and exit every year in the US data, and this share is higher in the EU data. The products included in the dataset show similar year-to-year turnover. Some products have become obsolete, and some new products are added every year.
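As a minimal sketch of how the confidence scores and the site/enterprise structure might be used together, the Python below maps each score to the likelihood band stated above and rolls site-level installs up to the enterprise level. The record layout and field names (`site_duns`, `enterprise_duns`, `product`, `confidence`) and the sample rows are hypothetical; they are not the actual CiTDS column names or data.

```python
from collections import defaultdict

# Install-likelihood bands (percent) per confidence score, as stated in the description.
LIKELIHOOD_BANDS = {3: (90, 100), 2: (75, 90), 1: (65, 75)}

# Hypothetical site-level install records; field names and values are illustrative only.
installs = [
    {"site_duns": "S1", "enterprise_duns": "E1", "product": "ProductA", "confidence": 3},
    {"site_duns": "S2", "enterprise_duns": "E1", "product": "ProductA", "confidence": 1},
    {"site_duns": "S2", "enterprise_duns": "E1", "product": "ProductB", "confidence": 2},
    {"site_duns": "S3", "enterprise_duns": "E2", "product": "ProductB", "confidence": 1},
]


def likelihood_band(score):
    """Return the (low, high) install-likelihood band in percent for a score."""
    return LIKELIHOOD_BANDS[score]


def enterprise_products(records, min_confidence=1):
    """Collect the products seen at any site of each enterprise,
    keeping only records at or above min_confidence."""
    adopted = defaultdict(set)
    for rec in records:
        if rec["confidence"] >= min_confidence:
            adopted[rec["enterprise_duns"]].add(rec["product"])
    return dict(adopted)


# Enterprise-level adoption using only installs with a score of 2 or 3.
by_enterprise = enterprise_products(installs, min_confidence=2)
```

Raising `min_confidence` trades coverage for reliability: an enterprise whose only installs carry a score of 1 drops out of the result entirely, which may matter when comparing adoption rates across years.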
There are two versions of the data: (i) Version 3, which covers 2016-2020, and (ii) Version 4, which covers 2020-2021. The quality of Version 4 is significantly better with respect to the information included about the technology products. In Version 3, product categories have missing values and are abbreviated in ways that are sometimes difficult to interpret. Version 4 does not have any major issues. Since both versions of the data are available for 2020, CiTDS provides a crosswalk between the versions. This makes it possible to use Version 4 product information for the products in Version 3, with the caveat that there is no crosswalk for products that exist in 2016-2019 but not in 2020. Finally, special attention should be paid to data from 2016, where the coverage is significantly different from 2017. From 2017 onwards, coverage is more consistent.
Years of Coverage:
APac: 2019-2021
Canada: 2015-2021
EMEA: 2019-2021
Europe: 2015-2018
Latin America: 2015, 2019-2021
United States: 2015-2021
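The Version 3 to Version 4 crosswalk described in this section could be applied roughly as follows. This is a sketch under stated assumptions: the product identifiers, the `category` field, and the dictionary-based crosswalk layout are all hypothetical, since the listing does not describe the actual file format. Version 3 products with no 2020 counterpart are kept and left without Version 4 information, matching the caveat in the description.

```python
# Hypothetical crosswalk: Version 3 product code -> Version 4 product code.
crosswalk = {"v3_A": "v4_101", "v3_B": "v4_102"}

# Hypothetical Version 4 product metadata keyed by Version 4 code.
v4_products = {
    "v4_101": {"category": "Storage"},
    "v4_102": {"category": "Networking"},
}

# Hypothetical Version 3 install rows.
v3_rows = [
    {"year": 2018, "product": "v3_A"},
    {"year": 2017, "product": "v3_C"},  # not present in 2020, so no crosswalk entry
]


def enrich_v3(rows, crosswalk, v4_products):
    """Attach the Version 4 category to each Version 3 row where a crosswalk
    entry exists; rows without a match get v4_category=None."""
    enriched = []
    for row in rows:
        v4_id = crosswalk.get(row["product"])
        info = v4_products.get(v4_id) if v4_id is not None else None
        category = info["category"] if info else None
        enriched.append({**row, "v4_category": category})
    return enriched


enriched_rows = enrich_v3(v3_rows, crosswalk, v4_products)
unmatched = [r["product"] for r in enriched_rows if r["v4_category"] is None]
```

Keeping unmatched rows, rather than dropping them, makes it easy to measure how much of the 2016-2019 data lacks Version 4 product information before deciding how to handle it.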

Created:
2024-02-27