Data Science for Good: Kiva Crowdfunding
Source: www.kaggle.com | Updated: 2018-03-02 | Indexed: 2025-01-21
Download link:
https://www.kaggle.com/kiva/data-science-for-good-kiva-crowdfunding
Resource description:
[Kiva.org][1] is an online crowdfunding platform that extends financial services to poor and financially excluded people around the world. Kiva lenders have provided over $1 billion in loans to over 2 million people. To set investment priorities, help inform lenders, and understand their target communities, knowing the level of poverty of each borrower is critical. However, this requires inference based on a limited set of information for each borrower.
In Kaggle Datasets' inaugural [Data Science for Good][2] challenge, Kiva is inviting the Kaggle community to help them build more localized models to estimate the poverty levels of residents in the regions where Kiva has active loans. Unlike traditional machine learning competitions with rigid evaluation criteria, participants will develop their own creative approaches to addressing the objective. Instead of making a prediction file as in a supervised machine learning problem, submissions in this challenge will take the form of Python and/or R data analyses using Kernels, Kaggle's hosted Jupyter Notebooks-based workbench.
Kiva has provided a dataset of loans issued over the last two years, and participants are invited to use this data as well as source external public datasets to help Kiva build models for assessing borrower welfare levels. Participants will write kernels on this dataset to submit as solutions to this objective and five winners will be selected by Kiva judges at the close of the event. In addition, awards will be made to encourage public code and data sharing. With a stronger understanding of their borrowers and their poverty levels, Kiva will be able to better assess and maximize the impact of their work.
The sections that follow describe in more detail how to participate, win, and use available resources to make a contribution towards helping Kiva better understand and help entrepreneurs around the world.
---
## Problem Statement
For the locations in which Kiva has active loans, your objective is to pair Kiva's data with additional data sources to estimate the welfare level of borrowers in specific regions, based on shared economic and demographic characteristics.
A good solution would connect the features of each loan or product to one of several poverty mapping datasets, which indicate the average level of welfare in a region on as granular a level as possible. Many datasets indicate the poverty rate in a given area, with varying levels of granularity. Kiva would like to be able to disaggregate these regional averages by gender, sector, or borrowing behavior in order to estimate a Kiva borrower’s level of welfare using all of the relevant information about them. Strong submissions will attempt to map vaguely described locations to more accurate geocodes.
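As a minimal sketch of the pairing described above, the following joins loan records to a regional poverty table on a normalized location key, then summarizes the matched rates by sector and gender. All field names and numbers here are illustrative assumptions made up for the example; they are not the actual schema or contents of the Kiva data or of any poverty mapping dataset. A real kernel would likely use a dataframe library and a proper geocoder instead of simple string normalization.

```python
from collections import defaultdict
from statistics import mean

# Toy loan records. Field names and values are illustrative assumptions,
# not the actual schema or contents of the Kiva dataset.
loans = [
    {"country": "Kenya", "region": " Nairobi", "sector": "Retail", "gender": "female"},
    {"country": "Kenya", "region": "kisumu ", "sector": "Agriculture", "gender": "male"},
    {"country": "Peru", "region": "Cusco", "sector": "Agriculture", "gender": "female"},
    {"country": "Peru", "region": "Unknown Town", "sector": "Retail", "gender": "female"},
]

# Toy regional poverty rates (made-up numbers) standing in for an
# external poverty mapping dataset.
poverty_rates = {
    ("Kenya", "Nairobi"): 0.10,
    ("Kenya", "Kisumu"): 0.25,
    ("Peru", "Cusco"): 0.18,
}


def region_key(country, region):
    """Normalize whitespace and case so loosely written names can match."""
    return (country.strip().casefold(), " ".join(region.split()).casefold())


# Index the poverty table by the normalized key.
lookup = {region_key(c, r): rate for (c, r), rate in poverty_rates.items()}

# Left join: keep every loan, and track the ones with no regional match.
matched, unmatched = [], []
for loan in loans:
    rate = lookup.get(region_key(loan["country"], loan["region"]))
    if rate is None:
        unmatched.append(loan)
    else:
        matched.append({**loan, "poverty_rate": rate})

# Average the matched regional rate within each (sector, gender) group.
groups = defaultdict(list)
for loan in matched:
    groups[(loan["sector"], loan["gender"])].append(loan["poverty_rate"])

summary = {key: mean(rates) for key, rates in groups.items()}

print(summary)
print("unmatched regions:", [loan["region"] for loan in unmatched])
```

Note that the grouped averages only describe which regional rates each borrower group is exposed to. They do not by themselves disaggregate a regional average into group-level welfare estimates, which would need additional data or modeling. The unmatched list is the set of locations that would need better geocoding.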
Kernels submitted will be evaluated based on the following criteria:
**1. Localization** - How well does a submission account for highly localized borrower situations? Leveraging a variety of external datasets and successfully building them into a single submission will be crucial.
**2. Execution** - Submissions should be efficiently built and clearly explained so that Kiva’s team can readily employ them in their impact calculations.
**3. Ingenuity** - While there are many best practices to learn from in the field, there is no one way of using data to assess welfare levels. It’s a challenging, nuanced field and participants should experiment with new methods and diverse datasets.
---
## How to Participate and [Make a Submission »][3]
To be considered a participant in the Kiva Crowdfunding Data Science for Good Event, there are a few requirements:
1. **[Everyone must register and accept the rules by filling out this form][10]** (you'll need to be logged into your Kaggle account to view the form). This ensures you're a participant and also means you'll receive update emails from us about key deadlines and announcements throughout the event.
2. To submit a kernel for consideration in the main prize track, make sure it's public and **[submit it here][11]** (you'll need to be logged into your Kaggle account to view the form). [Read more details here][4].
3. To submit a kernel or dataset for consideration in the secondary prize track, all you need to do is make sure it's public and be a registered participant before the deadline.
---
## [Prizes and Eligibility »][5]
There is a total prize pool of $30,000 split into two tracks:
* Main prize track for the primary event objective: accurate and localized analyses or methods for assessing poverty levels. ($14,000; five winners total)
* Upvoted kernels and popular datasets to encourage public sharing of code and data ($16,000; 12 winners total)
**Main Prize Track**
Kiva will award $14,000 in total prizes to five winning authors who submit public kernels effectively tackling the objective by the deadline. These kernels must be submitted for consideration by May 15th, 2018.
**Upvoted Kernels and Popular Datasets**
There is also a separate prize track for public sharing of code and data to encourage ongoing collaboration. Awards of $1,000 each will be made to the authors of the eight most upvoted kernels, and four awards of $2,000 each will go to the datasets published with the most upvoted kernels used with the event data.
[For more details about the prizes and eligibility click here][6].
---
## Timeline
All dates are 11:59PM UTC:
* **3 April 2018**: Kernels Award Announcement (Top 8 upvoted kernels)
* **3 April 2018**: First Datasets Award Announcement (Top 2 most used data sources published on Kaggle)
* **15 May 2018**: Challenge Deadline (Kernels for main prize must be submitted and made publicly available to be evaluated for a prize)
* **22 May 2018**: Main prize track winners announced, together with the Second Datasets Award Announcement (second set of top 2 most used data sources published on Kaggle)
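
Because all dates are stated in UTC, a local evening on 15 May may already be past the deadline. The sketch below converts a timezone-aware local timestamp to UTC and compares it with the challenge deadline. Treating 11:59 PM itself as still on time is an assumption, and the function name and example timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Challenge deadline from the timeline: 15 May 2018, 11:59 PM UTC.
DEADLINE_UTC = datetime(2018, 5, 15, 23, 59, tzinfo=timezone.utc)


def is_before_deadline(timestamp: datetime) -> bool:
    """Return True if a timezone-aware timestamp is at or before the deadline.

    Counting the deadline minute itself as on time is an assumption.
    """
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return timestamp.astimezone(timezone.utc) <= DEADLINE_UTC


# Hypothetical submitter in a UTC-7 zone.
utc_minus_7 = timezone(timedelta(hours=-7))
on_time = datetime(2018, 5, 15, 16, 0, tzinfo=utc_minus_7)  # 23:00 UTC, 15 May
late = datetime(2018, 5, 15, 18, 0, tzinfo=utc_minus_7)     # 01:00 UTC, 16 May

print(is_before_deadline(on_time))
print(is_before_deadline(late))
```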
---
## Rules
To be eligible to win a prize in either of the above prize tracks, you must be:
* a registered account holder at Kaggle.com;
* at least 18 years old or the age of majority in your jurisdiction of residence, whichever is older;
* not a resident of Crimea, Cuba, Iran, Syria, North Korea, or Sudan; and
* not a person or representative of an entity under [U.S. export controls or sanctions][9].
Your kernels and datasets will only be eligible to win if they have been made public on kaggle.com by the above deadline. All prizes are awarded at the discretion of Kiva, and Kiva reserves the right to cancel or modify prize criteria.
Unfortunately, employees, interns, contractors, officers, and directors of Kaggle Inc. and its parent companies are not eligible to win any prizes.
---
Photo by [Aaron Burden][7] on [Unsplash][8].
[1]: https://www.kaggle.com/kiva
[2]: http://blog.kaggle.com/2017/11/16/introducing-data-science-for-good-events-on-kaggle/
[3]: https://www.kaggle.com/kiva/data-science-for-good-kiva-crowdfunding/discussion/49867
[4]: https://www.kaggle.com/kiva/data-science-for-good-kiva-crowdfunding/discussion/49867
[5]: https://www.kaggle.com/kiva/data-science-for-good-kiva-crowdfunding/discussion/49839
[6]: https://www.kaggle.com/kiva/data-science-for-good-kiva-crowdfunding/discussion/49839
[7]: https://unsplash.com/photos/blPTIZuBhD8?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText
[8]: https://unsplash.com/search/photos/charity?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText
[9]: https://www.treasury.gov/resource-center/sanctions/Programs/Pages/Programs.aspx
[10]: https://www.kaggle.com/data-science-for-good-kiva-crowdfunding-signup
[11]: https://www.kaggle.com/data-science-for-good-kiva-crowdfunding-submission
Provider: Kaggle
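Dataset introduction

Background overview
This dataset contains Kiva.org loan records from the past two years and is intended to support assessing borrowers' poverty levels with data science methods. It provides details such as loan amount, use, and region, and encourages participants to combine it with external data sources for novel analyses.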
The summary above was collected and generated by 遇见数据集.



