Data Science For Good: Kiva Crowdfunding
Use Kernels to assess welfare of Kiva borrowers for $30k in prizes
@kaggle.kiva_data_science_for_good_kiva_crowdfunding
Kiva.org is an online crowdfunding platform that extends financial services to poor and financially excluded people around the world. Kiva lenders have provided over $1 billion in loans to over 2 million people. To set investment priorities, inform lenders, and understand its target communities, Kiva needs to know the poverty level of each borrower. However, this has to be inferred from a limited set of information about each borrower.
In Kaggle Datasets' inaugural Data Science for Good challenge, Kiva is inviting the Kaggle community to help them build more localized models to estimate the poverty levels of residents in the regions where Kiva has active loans. Unlike traditional machine learning competitions with rigid evaluation criteria, participants will develop their own creative approaches to addressing the objective. Instead of making a prediction file as in a supervised machine learning problem, submissions in this challenge will take the form of Python and/or R data analyses using Kernels, Kaggle's hosted Jupyter Notebooks-based workbench.
Kiva has provided a dataset of loans issued over the last two years, and participants are invited to use this data as well as source external public datasets to help Kiva build models for assessing borrower welfare levels. Participants will write kernels on this dataset to submit as solutions to this objective and five winners will be selected by Kiva judges at the close of the event. In addition, awards will be made to encourage public code and data sharing. With a stronger understanding of their borrowers and their poverty levels, Kiva will be able to better assess and maximize the impact of their work.
The sections that follow describe in more detail how to participate, win, and use available resources to make a contribution towards helping Kiva better understand and help entrepreneurs around the world.
For the locations in which Kiva has active loans, your objective is to pair Kiva's data with additional data sources to estimate the welfare level of borrowers in specific regions, based on shared economic and demographic characteristics.
A good solution would connect the features of each loan or product to one of several poverty mapping datasets, which indicate the average level of welfare in a region on as granular a level as possible. Many datasets indicate the poverty rate in a given area, with varying levels of granularity. Kiva would like to be able to disaggregate these regional averages by gender, sector, or borrowing behavior in order to estimate a Kiva borrower’s level of welfare using all of the relevant information about them. Strong submissions will attempt to map vaguely described locations to more accurate geocodes.
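The pairing described above can be sketched in a few lines of Python. This is a minimal illustration, not Kiva's method: the field names (`country`, `region`, `sector`, `gender`), the `poverty_index` scale, and every value below are hypothetical placeholders rather than columns or figures from the actual dataset. It joins each loan to a regional poverty figure through a normalized location key, then averages that figure by sector and by gender. Note that this only summarizes the regional index across groups of borrowers; a real disaggregation would need borrower-level modeling on top of it.

```python
from collections import defaultdict

# Hypothetical loan records; field names and values are made up for illustration.
loans = [
    {"country": "Exampleland", "region": " North ", "sector": "Retail", "gender": "female"},
    {"country": "Exampleland", "region": "south", "sector": "Agriculture", "gender": "female"},
    {"country": "Exampleland", "region": "South", "sector": "Agriculture", "gender": "male"},
    {"country": "Exampleland", "region": "near the market", "sector": "Retail", "gender": "male"},
]

# Hypothetical regional poverty index (higher means poorer); not real figures.
regional_poverty = {
    ("exampleland", "north"): 0.10,
    ("exampleland", "south"): 0.30,
}


def location_key(country, region):
    """Normalize free-text location fields so that they can be matched."""
    return (country.strip().lower(), region.strip().lower())


def attach_poverty(loans, regional_poverty):
    """Split loans into those matched to a regional index and those left unmatched."""
    matched, unmatched = [], []
    for loan in loans:
        key = location_key(loan["country"], loan["region"])
        if key in regional_poverty:
            matched.append({**loan, "poverty_index": regional_poverty[key]})
        else:
            unmatched.append(loan)
    return matched, unmatched


def mean_poverty_by(matched, field):
    """Average the attached regional index over the loans in each group."""
    totals = defaultdict(lambda: [0.0, 0])
    for loan in matched:
        totals[loan[field]][0] += loan["poverty_index"]
        totals[loan[field]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


matched, unmatched = attach_poverty(loans, regional_poverty)
by_sector = mean_poverty_by(matched, "sector")
by_gender = mean_poverty_by(matched, "gender")
```

Loans that end up in `unmatched` are the vaguely described locations that a stronger submission would try to resolve to a more accurate geocode before joining.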
Kernels submitted will be evaluated based on the following criteria:
1. Localization - How well does a submission account for highly localized borrower situations? Leveraging a variety of external datasets and successfully building them into a single submission will be crucial.
2. Execution - Submissions should be efficiently built and clearly explained so that Kiva’s team can readily employ them in their impact calculations.
3. Ingenuity - While there are many best practices to learn from in the field, there is no one way of using data to assess welfare levels. It’s a challenging, nuanced field and participants should experiment with new methods and diverse datasets.
To be considered a participant in the Kiva Crowdfunding Data Science for Good Event, there are a few requirements:
There is a total prize pool of $30,000 split into two tracks:
Main Prize Track
Kiva will award $14,000 in total prizes to five winning authors who submit public kernels effectively tackling the objective by the deadline. These kernels must be submitted for consideration by May 15th, 2018.
Upvoted Kernels and Popular Datasets
There is also a separate prize track for public sharing of code and data to encourage ongoing collaboration. Awards of $1,000 each will be made to the authors of the eight most upvoted kernels, and four awards of $2,000 each will go to the published datasets that are used alongside the event data in the most upvoted kernels.
For more details about the prizes and eligibility click here.
All dates are 11:59PM UTC:
To be eligible to win a prize in either of the above prize tracks, you must be:
Your kernels and datasets will only be eligible to win if they have been made public on kaggle.com by the above deadline. All prizes are awarded at the discretion of Kiva, and Kiva reserves the right to cancel or modify prize criteria.
Unfortunately, employees, interns, contractors, officers, and directors of Kaggle Inc. and its parent companies are not eligible to win any prizes.
Photo by Aaron Burden on Unsplash.