A Credit Risk Model for Predicting Default on Educational Loans

The educational loan, or student loan, is a very popular financial instrument for funding higher education in developed countries such as the USA, Canada, the UK, Finland and Hungary; emerging markets such as Thailand and China also have fairly long experience in this field. In the Indian context, however, a policy on educational loan schemes was first announced only in the Union Budget 2001-02, after which a model educational loan scheme was devised and approved by the Government of India. The aim of the scheme was to provide financial support from the banking system to deserving and meritorious students pursuing higher education in India and abroad, through collateral-free educational loans priced at the prime lending rate (PLR). Even so, loan utilization by prospective students remained very low until as late as 2008. Steep increases in fees across colleges have compounded the interest burden, and the outstanding loan amount to be repaid has almost doubled, increasing the repayment burden on students. It is estimated that 21% of persons cited financial constraints as the reason for not pursuing education beyond the secondary level.

The basic characteristics of educational loans are that 75% of them are collateral-free and unsecured, and that they are long-term in nature with long repayment periods. In this respect they are quite similar to infrastructure/project loans: the student borrower generates no cash flows during the moratorium period (the study period plus an additional period of up to one year). These inherent characteristics make educational loans riskier than other personal or commercial loans. Further, because the loan is long-term and, unlike a business loan, involves no periodic submission of revised financial information by the borrower, the lending bank is often unable to maintain any meaningful relationship with student borrowers during the study period. The incidence of default among educational loans is increasing, and it is quite possible that banks will restrict further exposure to them, which would severely affect the supply of educational loans to deserving candidates.

The main objective of this research project is to quantify default risk and to develop credit scoring models; to this end, the educational loan data will be tested with various linear models and non-linear machine learning models. A further objective is to develop a proof-of-concept software solution for credit risk, comprising a predictive model based on both historical data and simulated macro-economic factors. The project proposes to use real-life, granular data on educational loans sanctioned by banks from 2009 to 2014, together with default data from 2014 onwards. Several variables are being used in the development of the model, such as the type of course, duration of the course, gender, social background, minority status, rural/urban and metro residency, type of college (whether accredited or not), and socio-economic classification of districts.
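As an illustration of what a linear credit scoring model of this kind might look like, the following is a minimal sketch of a logistic regression fitted by gradient descent. It is not the project's model. The data are synthetically generated, the three features (college accreditation, urban residency, course duration) are a hypothetical subset of the variables listed above, and the coefficients used to simulate defaults are assumptions chosen purely for demonstration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_loans(n, seed=0):
    """Generate hypothetical loan records (NOT real bank data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        accredited = rng.randint(0, 1)        # 1 = accredited college
        urban = rng.randint(0, 1)             # 1 = urban/metro residency
        duration = rng.choice([2, 3, 4, 5])   # course duration in years
        # Assumed relationship, used only to simulate default outcomes.
        z = -0.5 - 1.2 * accredited - 0.6 * urban + 0.3 * (duration - 3)
        labels.append(1 if rng.random() < sigmoid(z) else 0)
        # The leading 1.0 is the intercept term; duration is centred at 3 years.
        rows.append([1.0, accredited, urban, duration - 3])
    return rows, labels


def fit_logistic(rows, labels, lr=0.5, epochs=300):
    """Fit weights by batch gradient descent on the average log-loss."""
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def default_probability(w, x):
    """Predicted probability of default for one feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


rows, labels = make_synthetic_loans(1000)
weights = fit_logistic(rows, labels)

# Compare two otherwise identical borrowers (urban, 3-year course).
p_accredited = default_probability(weights, [1.0, 1, 1, 0])
p_unaccredited = default_probability(weights, [1.0, 0, 1, 0])
print(f"P(default | accredited college)   = {p_accredited:.3f}")
print(f"P(default | unaccredited college) = {p_unaccredited:.3f}")
```

The fitted probabilities can be mapped to a score or compared against a cut-off to classify loans. A non-linear machine learning model would replace `fit_logistic` while keeping the same inputs and outputs, which is one way the linear and non-linear approaches could be compared on the same data.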