
Australian Employee Salary/Wages DATAbase by detailed occupation, location and year (2002-14); (plus Sole Traders)

DataCite Commons. Updated 2025-06-01; indexed 2024-07-27.
Download link:
https://figshare.com/articles/dataset/Australia_ATO_Salary_Wages_SoleTrader_Database_by_Occupation_and_Region_2002-14_/4522895/5
Description:
The ATO (Australian Tax Office) made a dataset openly available (see #links) showing all Australian salary and wages (2002, 2006, 2010, 2014) by detailed occupation (around 1,000) and over 100 SA4 regions. Sole Trader sales and earnings are also provided. This open data (csv) is now packaged into a database (*.sql) with 45 sample SQL queries (backupSQL[date]_public.txt).

See more description at the related Figshare #datavis record.

Versions:
V5: Following a #datascience course, I have made the main data (individual salary and wages) available as csv and Jupyter Notebook. Checksum matches #dataTotals. In 209,xxx rows. Also provided Jobs and SA4 (Locations) description files as csv. More details at: Where are jobs growing/shrinking? Figshare DOI: 4056282 (linked below). Noted 1% discrepancy ($6B) in 2010 wages total - to follow up.

#dataTotals - Salary and Wages
Year    Workers (M)    Earnings ($B)
2002    8.5            285
2006    9.4            372
2010    10.2           481
2014    10.3           584

#dataTotal - Sole Traders
Year    Workers (M)    Sales ($B)    Earnings ($B)
2002    0.9            61            13
2006    1.0            88            19
2010    1.1            112           26
2014    1.1            96            30

#links
See ATO request for data at ideascale link below.
See original csv open data set (CC-BY) at data.gov.au link below.
This database was used to create maps of change in regional employment - see Figshare link below (m9.figshare.4056282).

#package
This file package contains a database (analysing the open data) as an SQL package, plus sample SQL text interrogating the DB. DB name: test. There are 20 queries relating to Salary and Wages.

#analysis
The database was analysed and outputs provided on Nectar(.org.au) resources at: http://118.138.240.130 (offline). This was only resourced for max 1 year, from July 2016, so it expired in June 2017. Hence the filing here. The sample home page is provided here (and pdf), but not all the supporting files, which may be packaged and added later.
Nectar URL now offline - server files attached as package (html_backup[date].zip), including php scripts, html, csv, jpegs.

#install
IMPORT: DB SQL dump e.g. test_2016-12-20.sql (14.8 MB)
1. Started MAMP on OSX.
1.1 Go to PhpMyAdmin
2. New Database:
3. Import: Choose file: test_2016-12-20.sql -> Go (about 15-20 seconds on MacBookPro 16 GB, 2.3 GHz i5)
4. Four tables appeared: jobTitles 3,208 rows | salaryWages 209,697 rows | soleTrader 97,209 rows | stateNames 9 rows, plus views e.g. deltahair, Industrycodes, states
5. Run test query under **#; Sum of Salary by SA4 e.g. 101 $4.7B, 102 $6.9B

#sampleSQL
select sa4,
  (select sum(count) from salaryWages where year = '2014' and sa4 = sw.sa4) as thisYr14,
  (select sum(count) from salaryWages where year = '2010' and sa4 = sw.sa4) as thisYr10,
  (select sum(count) from salaryWages where year = '2006' and sa4 = sw.sa4) as thisYr06,
  (select sum(count) from salaryWages where year = '2002' and sa4 = sw.sa4) as thisYr02
from salaryWages sw
group by sa4
order by sa4
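
The #sampleSQL query pivots the per-year totals into one row per SA4 region using four correlated subqueries. As a minimal sketch of the same idea, the snippet below runs that query shape next to a single-pass conditional-aggregation variant and checks that both return the same rows. Several things here are assumptions rather than part of the package: it uses Python's built-in sqlite3 with an in-memory table instead of the MySQL/MAMP database, the table holds only the three columns named in the sample query (sa4, year, count), and the handful of rows are invented toy values, not ATO figures.

```python
import sqlite3

# Toy rows (sa4, year, count). Invented for illustration only; NOT ATO data.
toy_rows = [
    ("101", "2002", 10),
    ("101", "2014", 15),
    ("101", "2014", 5),
    ("102", "2002", 7),
    ("102", "2010", 9),
]

conn = sqlite3.connect(":memory:")
# "count" is quoted because it is also the name of an SQL function.
conn.execute('CREATE TABLE salaryWages (sa4 TEXT, year TEXT, "count" INTEGER)')
conn.executemany("INSERT INTO salaryWages VALUES (?, ?, ?)", toy_rows)

# Same shape as the sample query: one correlated subquery per year.
subquery_sql = """
SELECT sa4,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2014' AND sa4 = sw.sa4) AS thisYr14,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2010' AND sa4 = sw.sa4) AS thisYr10,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2006' AND sa4 = sw.sa4) AS thisYr06,
  (SELECT SUM("count") FROM salaryWages WHERE year = '2002' AND sa4 = sw.sa4) AS thisYr02
FROM salaryWages sw
GROUP BY sa4
ORDER BY sa4
"""

# Alternative: one scan of the table, using CASE inside SUM to pick each year.
# A CASE with no ELSE yields NULL, so a year with no rows sums to NULL,
# matching what the subquery version returns for an empty set.
pivot_sql = """
SELECT sa4,
  SUM(CASE WHEN year = '2014' THEN "count" END) AS thisYr14,
  SUM(CASE WHEN year = '2010' THEN "count" END) AS thisYr10,
  SUM(CASE WHEN year = '2006' THEN "count" END) AS thisYr06,
  SUM(CASE WHEN year = '2002' THEN "count" END) AS thisYr02
FROM salaryWages
GROUP BY sa4
ORDER BY sa4
"""

subquery_result = conn.execute(subquery_sql).fetchall()
pivot_result = conn.execute(pivot_sql).fetchall()

for row in pivot_result:
    print(row)
```

On the real salaryWages table (209,697 rows) the single-pass form may be faster, since it avoids re-reading the table once per year column, but that has not been measured here.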

Provider:
figshare
Created:
2019-03-04