
Ferris Wheel Operations Data Asset Development and Application Research Dataset

Tianjin Data Intellectual Property Registration Platform (天津市数据知识产权登记平台), updated 2024-12-16, indexed 2024-12-27
Download link:
https://dengji.tjippc.cn/xxgg_nr?id=c163e586-7876-492d-95ae-b2806a081c43
Description:
1. Data extraction and cleansing. Extraction rules are established for the ticketing system of the Tianjin Eye (天津之眼) Ferris wheel. Data from the `ticket` table covering July 2020 to October 2024 is selected as the base data, 8 core fields are retained, and the data is filtered and cleansed layer by layer:
1.1 Remove abnormal records such as test data and records with missing fields.
1.2 Correct and cleanse records according to the coding rules of the `ticket_no` field.
1.3 Validate and cleanse resident ID card numbers for conformity: length (18 characters), region code range (110000-659900), date of birth range (19000101-20241231), sequence code range (001-999), and check character range (0-9 and X). For visitors registered with a document other than a resident ID card (passport, Hong Kong and Macao travel permit, Mainland Travel Permit for Taiwan Residents, permanent residence permit, etc.), validate and cleanse according to the coding specification of that document.
1.4 Mask (desensitize) sensitive personal information: ID card numbers, other identity document numbers, and names.
1.5 Validate and cleanse the values of the ticket purchase time, visit date, and ticket inspection time.
1.6 Cross-validate the cleansed `ticket` table against other tables in the database that are related by business logic (the `customer` table, the `ticket_order` table, etc.) and against the Ferris wheel operating schedule, to ensure the data is usable.

2. Analysis. The cleansed `ticket` table data is analyzed along the following dimensions:
- Visitor region distribution (from the region code of the ID card number)
- Age distribution: under 12, [12, 18), [18, 24), [24, 30), [30, 36), [36, 42), [42, 48), [48, 56), 56 and over
- Gender distribution: male, female
- Visit time slot distribution: before 12:00, [12:00, 15:00), [15:00, 18:00), 18:00 and later
- Total daily visitor count

The results are stored by category as analysis sub-tables, which together with the `ticket` table form the dataset. The final outputs are a data quality assessment report for the Ferris wheel project and a report of recommendations for optimizing data governance, both intended to support operational decisions.
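The ID number checks in step 1.3 and the bucket boundaries in step 2 are specific enough to sketch in code. The following is a minimal Python illustration, not the provider's actual pipeline. All function names are hypothetical. The optional `verify_checksum` flag goes beyond the description, which only requires the check character to be in 0-9 or X: it applies the MOD 11-2 check character rule from GB 11643-1999. Computing age as of the visit date, and bucketing time slots by whatever timestamp is passed in, are also assumptions, since the description does not say which reference date or timestamp is used.

```python
from bisect import bisect_right
from datetime import date, time

# Weights and check characters from GB 11643-1999 (ISO 7064 MOD 11-2).
# Only used when verify_checksum=True, which the description does not require.
_WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
_CHECK_CHARS = "10X98765432"

# Bucket boundaries as listed in step 2 of the description.
_AGE_EDGES = (12, 18, 24, 30, 36, 42, 48, 56)
_AGE_LABELS = ("<12", "[12,18)", "[18,24)", "[24,30)", "[30,36)",
               "[36,42)", "[42,48)", "[48,56)", "56+")
_SLOT_EDGES = (time(12), time(15), time(18))
_SLOT_LABELS = ("before 12:00", "[12:00,15:00)", "[15:00,18:00)",
                "18:00 and later")


def is_valid_id(id_no: str, verify_checksum: bool = False) -> bool:
    """Apply the range checks of step 1.3 to a resident ID card number."""
    body = id_no[:17]
    if len(id_no) != 18 or not (body.isascii() and body.isdigit()):
        return False
    if not 110000 <= int(id_no[:6]) <= 659900:        # region code range
        return False
    try:                                               # must be a real calendar date
        birth = date(int(id_no[6:10]), int(id_no[10:12]), int(id_no[12:14]))
    except ValueError:
        return False
    if not date(1900, 1, 1) <= birth <= date(2024, 12, 31):
        return False
    if not 1 <= int(id_no[14:17]) <= 999:              # sequence code range
        return False
    check = id_no[17]
    if check not in "0123456789X":                     # check character range
        return False
    if verify_checksum:                                # assumption: stricter than the source
        total = sum(int(d) * w for d, w in zip(body, _WEIGHTS))
        return _CHECK_CHARS[total % 11] == check
    return True


def age_bucket(birth: date, on: date) -> str:
    """Return the age bucket for a visitor born on `birth`, as of date `on`."""
    age = on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))
    return _AGE_LABELS[bisect_right(_AGE_EDGES, age)]


def time_slot(t: time) -> str:
    """Return the visit time slot for a time of day."""
    return _SLOT_LABELS[bisect_right(_SLOT_EDGES, t)]
```

Using `bisect_right` on the sorted boundary tuples keeps every interval closed on the left and open on the right, matching the `[12, 18)` notation in the description: a visitor aged exactly 12 falls in `[12,18)`, and a time of exactly 18:00 falls in the last slot.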
Provider:
天津城投集团资产管理有限公司
Created:
2024-12-16