Project PMD Immigrants in Chile

Name: Project PMD Immigrants in Chile
Creator: figshare
Published: 2020-09-02 07:54:45
License: no description available

Source: DataCite Commons. Updated: 2020-09-02. Indexed: 2024-07-25.

Download link:

https://figshare.com/articles/dataset/Project_PMD_Immigrants_in_Chile/5117152


Description:

Authors: Francisco Pulgar, Daniel Hernández

These data files are a basic analysis of the data published by the Departamento de Extranjería y Migración de Chile [1], combined with indicators published in the World Bank Development Indicators [2].

[1] http://www.extranjeria.gob.cl/estadisticas-migratorias/
[2] https://www.kaggle.com/worldbank/world-development-indicators

The files in this fileset include:

in.tar.bz2: the input files to analyze.

script.py: a Spark Python 3 script that analyzes the data.

immigrantsByCountry.csv: a CSV table with the number of immigrants and the country code.

indicatorsJoin.txt.gzip: a file created for debugging.

comparisons.csv: a CSV table with five columns: the maximum of the numbers in the last two columns, the code of an indicator X, the name of the indicator X, the number of immigrants coming from countries where X is greater than in Chile, and the number of immigrants coming from countries where X is less than in Chile. The sum of the last two columns is less than the total number of immigrants (1497519) when X is not available for some countries.

plot*.png: plots of indicators normalized by immigrants.

To run the script, first decompress the input data into a folder and create another folder for the output data. Then edit the script to point to the locations of these two folders.
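The comparison described for comparisons.csv can be illustrated with a minimal sketch. This is not the authors' script.py, which uses Spark; it is a plain-Python illustration under stated assumptions. The function name `compare_with_chile`, the dictionary layouts, the use of "CHL" as Chile's country code, and the toy figures are all hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

CHILE = "CHL"  # assumed country code for Chile; not confirmed by the dataset


def compare_with_chile(immigrants, indicators, names, reference=CHILE):
    """Build rows shaped like the comparisons.csv description.

    immigrants: {country_code: number_of_immigrants}
    indicators: {(country_code, indicator_code): value}
    names:      {indicator_code: indicator_name}
    Returns tuples (max, code, name, greater, lesser), largest max first.
    """
    # Value of each indicator in the reference country (Chile).
    ref = {ind: v for (c, ind), v in indicators.items() if c == reference}

    greater = defaultdict(int)  # immigrants from countries where X > Chile
    lesser = defaultdict(int)   # immigrants from countries where X < Chile
    for (country, ind), value in indicators.items():
        # Skip Chile itself, indicators Chile lacks, and countries
        # with no immigrant count.
        if country == reference or ind not in ref or country not in immigrants:
            continue
        if value > ref[ind]:
            greater[ind] += immigrants[country]
        elif value < ref[ind]:
            lesser[ind] += immigrants[country]

    rows = [
        (max(greater[ind], lesser[ind]), ind, names.get(ind, ""),
         greater[ind], lesser[ind])
        for ind in ref
    ]
    return sorted(rows, reverse=True)


# Toy data, invented for illustration only.
toy_immigrants = {"PER": 100, "ARG": 50, "BOL": 30}
toy_indicators = {
    ("CHL", "X1"): 10.0,
    ("PER", "X1"): 5.0,
    ("ARG", "X1"): 12.0,
    # No X1 value for BOL, so its 30 immigrants are counted in neither column.
}
toy_rows = compare_with_chile(toy_immigrants, toy_indicators, {"X1": "Indicator X1"})
```

In the toy data the last two columns sum to 150 while the total is 180, because the indicator is missing for one country. This mirrors the note that the two columns can sum to less than the total number of immigrants.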


Provider:

figshare

Created:

2017-06-18
