
Health Insurance Cross Sell Prediction 🏠 🏥

www.kaggle.com | Updated 2020-09-11 | Indexed 2025-01-21
Download link:
https://www.kaggle.com/anmolkumar/health-insurance-cross-sell-prediction
Description:
### Context

Our client is an insurance company that has provided Health Insurance to its customers. They now need your help in building a model to predict whether the policyholders (customers) from the past year will also be interested in Vehicle Insurance provided by the company.

An insurance policy is an arrangement by which a company undertakes to provide a guarantee of compensation for specified loss, damage, illness, or death in return for the payment of a specified premium. A premium is a sum of money that the customer needs to pay regularly to an insurance company for this guarantee.

For example, you may pay a premium of Rs. 5,000 each year for a health insurance cover of Rs. 200,000 so that if, God forbid, you fall ill and need to be hospitalised in that year, the insurance provider will bear the cost of hospitalisation etc. for up to Rs. 200,000. If you are wondering how the company can bear such a high hospitalisation cost when it charges a premium of only Rs. 5,000, that is where the concept of probabilities comes into the picture. For example, like you, there may be 100 customers paying a premium of Rs. 5,000 every year, but only a few of them (say 2-3) would get hospitalised that year, not everyone. This way everyone shares the risk of everyone else.

Just like medical insurance, there is vehicle insurance, where every year the customer needs to pay a premium of a certain amount to the insurance provider so that in case of an unfortunate accident involving the vehicle, the insurance provider will pay a compensation (called the 'sum assured') to the customer.

Building a model to predict whether a customer would be interested in Vehicle Insurance is extremely helpful for the company because it can then plan its communication strategy accordingly to reach out to those customers and optimise its business model and revenue.

Now, in order to predict whether the customer would be interested in Vehicle Insurance, you have information about demographics (gender, age, region code type), vehicles (vehicle age, damage), policy (premium, sourcing channel), etc.

## Data Description

- Train Data

|Variable|Definition|
| --- | --- |
|id|Unique ID for the customer|
|Gender|Gender of the customer|
|Age|Age of the customer|
|Driving_License|0 : Customer does not have a DL, 1 : Customer already has a DL|
|Region_Code|Unique code for the region of the customer|
|Previously_Insured|1 : Customer already has Vehicle Insurance, 0 : Customer doesn't have Vehicle Insurance|
|Vehicle_Age|Age of the vehicle|
|Vehicle_Damage|1 : Customer got his/her vehicle damaged in the past, 0 : Customer didn't get his/her vehicle damaged in the past|
|Annual_Premium|The amount the customer needs to pay as premium in the year|
|Policy_Sales_Channel|Anonymised code for the channel of outreach to the customer, i.e. different agents, over mail, over phone, in person, etc.|
|Vintage|Number of days the customer has been associated with the company|
|Response|1 : Customer is interested, 0 : Customer is not interested|

- Test Data

|Variable|Definition|
| --- | --- |
|id|Unique ID for the customer|
|Gender|Gender of the customer|
|Age|Age of the customer|
|Driving_License|0 : Customer does not have a DL, 1 : Customer already has a DL|
|Region_Code|Unique code for the region of the customer|
|Previously_Insured|1 : Customer already has Vehicle Insurance, 0 : Customer doesn't have Vehicle Insurance|
|Vehicle_Age|Age of the vehicle|
|Vehicle_Damage|1 : Customer got his/her vehicle damaged in the past, 0 : Customer didn't get his/her vehicle damaged in the past|
|Annual_Premium|The amount the customer needs to pay as premium in the year|
|Policy_Sales_Channel|Anonymised code for the channel of outreach to the customer, i.e. different agents, over mail, over phone, in person, etc.|
|Vintage|Number of days the customer has been associated with the company|

- Submission

|Variable|Definition|
| --- | --- |
|id|Unique ID for the customer|
|Response|1 : Customer is interested, 0 : Customer is not interested|

## Evaluation Metric

The evaluation metric for this hackathon is the ROC_AUC score.

## Public and Private split

The public leaderboard is based on 40% of the test data, while the final rank will be decided on the remaining 60% of the test data (the private leaderboard).

## Guidelines for Final Submission

Please ensure that your final submission includes the following:

1. Solution file containing the predicted response of the customer (probability of response 1)
2. Code file for reproducing the submission. Note that it is mandatory to submit your code for a valid final submission.
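
To make the evaluation metric and the submission format concrete, here is a minimal sketch in plain Python. It is not the competition's scoring code: `roc_auc` is a hypothetical helper that computes ROC AUC through the equivalent rank-sum (Mann-Whitney U) formulation, `write_submission` is a hypothetical helper that writes the two columns listed in the Submission table, and the labels, scores, and ids are made-up toy values rather than rows from the dataset.

```python
import csv
import io


def roc_auc(y_true, y_score):
    """ROC AUC via the rank-sum formulation; tied scores share an average rank."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the run [i, j) of entries that share the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def write_submission(ids, probabilities, out):
    """Write id and predicted probability of Response == 1 as CSV rows."""
    writer = csv.writer(out)
    writer.writerow(["id", "Response"])
    for customer_id, p in zip(ids, probabilities):
        writer.writerow([customer_id, f"{p:.6f}"])


# Toy validation labels and predicted probabilities (illustrative only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # prints 0.75

# Toy submission written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_submission([1, 2], [0.25, 0.9], buffer)
print(buffer.getvalue())
```

Because ROC AUC depends only on how the predictions rank customers, the submission should carry probabilities rather than hard 0/1 labels, which matches the guideline above.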

Provider:
Kaggle