A step-by-step overview of how banks and lenders can build a business case around machine learning, including direct savings and volume opportunities 💵

Loan underwriting is increasingly critical and challenging for consumer and small business lenders who face new competitors, increasing automation requirements and price-sensitive consumers. Manual underwriting is giving way to automated credit assessments that offer greater speed, consistency, compliance and accuracy, and risk models that quantify the creditworthiness of applicants are core to these assessments.

Powerful underwriting models have cascading benefits, including empowering lenders to reject the highest-risk customers, approve more low-risk customers and deploy risk-based pricing to ensure positive selection bias. However, most lenders continue to rely on generic risk scores (e.g. FICO® or VantageScore®), which tend to be “okay” for everyone and “excellent” for no one.

Lenders can address this problem by incorporating machine learning into their credit underwriting. Relative to traditional modeling techniques, machine learning models are more powerful and can be deployed much more quickly, relying on computational power instead of time-consuming manual analysis.

Not surprisingly, a wide range of lenders are excited about the potential of machine learning. However, they often struggle to build a business case around the opportunity: machine learning will clearly be useful, but the financial benefit must be weighed against the costs.

DigiFi works with lenders to deploy machine learning models and has developed a framework for measuring the direct financial impact that machine learning can have on an organization. This post explores our approach and the metrics we present to our customers.

Assessing the Financial Impact

The decision to use machine learning for credit underwriting primarily comes down to the expected impact on loan portfolio performance. We focus on accurately quantifying two key factors:

1. $ Annual Savings from identifying high-risk borrowers that are being missed by the current underwriting (i.e. reduce default rates)

2. % Volume Increase by finding applications that are currently rejected but are actually low risk (i.e. increase approval rates)

The Savings Opportunity (Reduced Default Rates)

Generic risk scores, such as FICO® or VantageScore®, are accurate on average but misjudge some individual borrowers. Finding high-risk borrowers that traditional approaches miss can lead to direct savings, and machine learning models make this possible.

Once a machine learning model is trained to predict default rates, it’s easy to find the loans that are above your risk threshold. Based on the maximum interest rate you charge and your risk appetite, you can back into the maximum default rate that you’re willing to accept. Declining the loans whose predicted default rate exceeds this threshold drives savings.
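
To make the idea of backing into a maximum default rate concrete, here is a minimal sketch. It assumes a simplified one-period loan in which the lender earns the interest rate on loans that repay and loses a fixed share of the balance (loss given default) on loans that do not. That profit formula, the cost rate, the sample numbers and both function names are assumptions made for illustration, not a description of any particular lender's method.

```python
def max_acceptable_default_rate(max_interest_rate, loss_given_default, cost_rate=0.0):
    """Break-even default probability p for a simplified one-period loan.

    Assumed expected profit per dollar lent:
        (1 - p) * r - p * LGD - c >= 0
    which rearranges to p <= (r - c) / (r + LGD).
    """
    return (max_interest_rate - cost_rate) / (max_interest_rate + loss_given_default)


def expected_loss_avoided(loans, threshold, loss_given_default):
    """Expected loss on loans whose predicted default rate exceeds the threshold.

    `loans` is a list of (balance, predicted_default_rate) pairs.
    """
    return sum(
        balance * predicted_default_rate * loss_given_default
        for balance, predicted_default_rate in loans
        if predicted_default_rate > threshold
    )


# Hypothetical inputs: 30% maximum rate, 90% loss given default, 3% cost rate.
threshold = max_acceptable_default_rate(0.30, 0.90, cost_rate=0.03)

# Hypothetical portfolio of (balance, model-predicted default rate).
loans = [(10_000, 0.02), (8_000, 0.10), (12_000, 0.35), (5_000, 0.50)]
avoided = expected_loss_avoided(loans, threshold, loss_given_default=0.90)

print(f"Maximum acceptable default rate: {threshold:.3f}")
print(f"Expected loss avoided: ${avoided:,.0f}")
```

A fuller analysis would also net out the interest income given up on the declined loans; this sketch only counts the expected loss.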

The Challenge: There’s a savings/volume trade-off here. Our goal is to identify the sweet spot where a small reduction in volume is traded for a significant reduction in losses. We target a 5-15% reduction in overall defaults.
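
One way to look for that sweet spot is to sweep several candidate cut-offs and compare how much volume and how much expected default each one removes. The sketch below does this on a hypothetical five-loan portfolio; the function name, the data and the cut-off values are illustrative assumptions only.

```python
def tradeoff_table(loans, thresholds):
    """For each cut-off, return (threshold, volume_cut, default_cut).

    `loans` is a list of (balance, predicted_default_rate) pairs. Both cuts
    are expressed as fractions of the portfolio totals.
    """
    total_volume = sum(balance for balance, _ in loans)
    total_expected_defaults = sum(balance * p for balance, p in loans)
    rows = []
    for t in thresholds:
        # Loans that would be declined at this cut-off.
        removed = [(balance, p) for balance, p in loans if p > t]
        volume_cut = sum(balance for balance, _ in removed) / total_volume
        default_cut = sum(balance * p for balance, p in removed) / total_expected_defaults
        rows.append((t, volume_cut, default_cut))
    return rows


# Hypothetical portfolio: (balance, model-predicted default rate).
portfolio = [
    (50_000, 0.02),
    (30_000, 0.05),
    (15_000, 0.10),
    (4_000, 0.20),
    (1_000, 0.40),
]

rows = tradeoff_table(portfolio, thresholds=[0.30, 0.15])
for t, volume_cut, default_cut in rows:
    print(f"cut-off {t:.2f}: volume -{volume_cut:.1%}, expected defaults -{default_cut:.1%}")
```

In this made-up portfolio, the 0.30 cut-off gives up about 1% of volume for roughly an 8% reduction in expected defaults, which is the kind of exchange described above.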

The Volume Opportunity (Increased Approval Rates)

The flipside of the coin is increased volume. By incorporating a much wider set of credit variables within the predictive model, machine learning excels at finding low-risk customers that traditional underwriting techniques miss.

To quantify this volume, we identify applications that are currently being rejected but that the machine learning model indicates are actually creditworthy. This typically results in at least a 10% volume opportunity and in some cases significantly more.
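
As a rough sketch of that calculation, the snippet below counts currently rejected applications that the model scores at or below a chosen default-rate threshold and expresses them as a share of current approvals. The dictionary keys, the threshold and the sample data are hypothetical. Because repayment outcomes are not observed for rejected applications, the result is an estimate rather than a measured figure.

```python
def volume_opportunity(applications, threshold):
    """Potential extra approvals as a fraction of current approvals.

    Each application is a dict with two hypothetical keys:
      "approved": bool, the current underwriting decision
      "predicted_default_rate": float, the model's prediction
    """
    approved = sum(1 for a in applications if a["approved"])
    swap_ins = sum(
        1
        for a in applications
        if not a["approved"] and a["predicted_default_rate"] <= threshold
    )
    return swap_ins / approved if approved else 0.0


# Hypothetical data: 10 approved applications and 5 rejected ones.
applications = [{"approved": True, "predicted_default_rate": 0.04} for _ in range(10)]
applications += [
    {"approved": False, "predicted_default_rate": p}
    for p in (0.03, 0.05, 0.25, 0.40, 0.60)
]

lift = volume_opportunity(applications, threshold=0.10)
print(f"Estimated volume opportunity: {lift:.0%}")
```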

The Challenge: To achieve this volume lift, other credit requirements may need to be adjusted. In addition to analyzing the overall volume potential, we work with our customers to implement “safe growth” credit policy modifications to help achieve higher origination volumes.

Evaluating Model Strength

The financial impacts outlined above are only possible if the machine learning model is more accurate than your current underwriting. Underwriting models are generally judged based on their risk-ranking strength, and the underwriting model that can most accurately discriminate between high-risk and low-risk applicants will be most effective in meeting the goals of minimizing losses and maximizing volume.

One metric we use to evaluate model strength is the Area Under the Receiver Operating Characteristic Curve, or AUC (tested on “out of sample” data that was not used to train the model). In the example below, the model shows clear outperformance over the comparison score, implying that it finds high- and low-risk applications better than the existing underwriting model.
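
For readers who want to see what this metric measures, here is a small dependency-free sketch. It computes the AUC as the probability that a randomly chosen defaulted loan receives a higher risk score than a randomly chosen repaid loan, with ties counted as half. The sample labels and scores are made up, and both scores are assumed to be oriented so that a higher value means higher risk (a score such as FICO®, where higher means safer, would need to be inverted first).

```python
def roc_auc(defaulted, risk_scores):
    """AUC via pairwise comparison of defaulted vs. repaid loans.

    `defaulted` holds 1 for a default and 0 for a repaid loan;
    `risk_scores` holds the matching scores, higher meaning riskier.
    """
    bad = [s for y, s in zip(defaulted, risk_scores) if y == 1]
    good = [s for y, s in zip(defaulted, risk_scores) if y == 0]
    if not bad or not good:
        raise ValueError("AUC needs at least one defaulted and one repaid loan")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5  # ties count as half
    return wins / (len(bad) * len(good))


# Hypothetical out-of-sample outcomes and two competing risk scores.
outcomes = [0, 0, 1, 0, 1, 0, 1, 0]
ml_risk = [0.05, 0.10, 0.60, 0.20, 0.35, 0.15, 0.80, 0.40]
generic_risk = [0.10, 0.30, 0.50, 0.20, 0.25, 0.40, 0.70, 0.15]

auc_ml = roc_auc(outcomes, ml_risk)
auc_generic = roc_auc(outcomes, generic_risk)
print(f"ML model AUC: {auc_ml:.3f}, comparison score AUC: {auc_generic:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the comparison is only meaningful when both scores are evaluated on the same out-of-sample loans.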

It’s also helpful to compare different types of machine learning algorithms to determine which is strongest on a given dataset. For example, when DigiFi trains a predictive model for our customers, we typically test many options and select the top-performing model. In the example below, the Neural Network emerged as the top model and was therefore selected.

Once the overall predictive power of the model has been established, it’s a good practice to examine the model in more detail to ensure it has generalized well from the training data to make sensible risk predictions. The following are effective ways to do so:

  • Univariate analysis: Examine the relationship between the model’s risk score and various credit variables. For variables that are known to be generally predictive of credit risk, such as income or revolver utilization, we should see a directional relationship between the variable and the risk score. This confirms that the risk score is correlating sensibly with known credit predictors and not identifying spurious relationships that may not generalize well.
  • Time series analysis: Test whether the default rate on applications is consistent with the risk score assigned and whether the change in default rates is relatively smooth across the testing set. This ensures that the model is accurately predicting default rates across the full spectrum of scores and time periods.
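
The second check can be sketched by sorting test loans by predicted default rate, splitting them into equal-sized bands and comparing the average prediction with the observed default rate in each band. The function names, the number of bands and the sample data below are illustrative assumptions; the same grouping could be applied per origination period to look at stability over time.

```python
def default_rate_by_band(predicted, defaulted, n_bands=5):
    """Return (mean_predicted, observed_rate) per band, lowest risk first.

    Loans are sorted by predicted default rate and split into `n_bands`
    groups of roughly equal size; the last band absorbs any remainder.
    """
    pairs = sorted(zip(predicted, defaulted))
    if len(pairs) < n_bands:
        raise ValueError("need at least one loan per band")
    size = len(pairs) // n_bands
    bands = []
    for i in range(n_bands):
        end = (i + 1) * size if i < n_bands - 1 else len(pairs)
        chunk = pairs[i * size:end]
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(d for _, d in chunk) / len(chunk)
        bands.append((mean_predicted, observed_rate))
    return bands


def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical test set: predicted default rates and actual outcomes (1 = default).
predicted = [0.02, 0.03, 0.05, 0.06, 0.10, 0.12, 0.20, 0.25, 0.40, 0.50]
defaulted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

bands = default_rate_by_band(predicted, defaulted, n_bands=5)
for mean_predicted, observed_rate in bands:
    print(f"predicted {mean_predicted:.3f} -> observed {observed_rate:.2f}")
print("Observed rates rise with score:", is_non_decreasing([o for _, o in bands]))
```

With a realistic sample size, large gaps between the predicted and observed columns, or observed rates that jump around between neighboring bands, would be a signal to investigate the model further.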

Tackling Compliance Concerns

Your machine learning plan should also address key compliance concerns. These typically include the following questions:

1) Is the machine learning model ECOA (Fair Lending) compliant?

We suggest using only credit bureau data, application information and data from trusted third parties within the machine learning model. Since this information is already being used for your underwriting, it is typically acceptable. Concerns often arise if you try to use non-traditional data within your machine learning model, which we do not suggest.

2) What are the data security risks?

We suggest excluding personally identifiable information (“PII”) from both the machine learning training process and the underwriting process. Further, we suggest using a machine learning platform that does not save data, which greatly limits data risks (i.e. the data passes through but isn’t saved).

3) Can we explain the decisions or is it a “black box”?

We suggest using a vendor with decision explainability. This is detailed in our recent blog post Explainable Machine Learning to Meet Bank-Level Compliance Requirements, and is provided by a few leading machine learning platforms, including DigiFi. Every decision we produce includes a Risk Score, Predicted Default Rate and Decision Explanation, making the decisions easy to understand and to explain to consumers on regulatory forms.

Final Thoughts

As lenders assess new underwriting techniques to meet the demands of an increasingly competitive marketplace, understanding the potential impact of machine learning is a critical step. Machine learning underwriting models are becoming more common at leading financial institutions, but many lenders continue to rely on traditional risk scores and basic credit rules. A large opportunity exists for these companies to use custom machine learning models trained on their proprietary data to significantly reduce defaults and increase approvals, driving bottom-line impact for the organization.


About DigiFi

DigiFi is a technology company that helps lenders make better underwriting decisions.

DigiFi provides underwriting automation solutions, including credit policy digitization and machine learning models that help lenders reduce default rates and approve more borrowers. DigiFi’s solutions are powered by our proprietary DecisionVision platform, which we use to train and deploy machine learning models that drive unparalleled underwriting decisions that go way beyond traditional risk scores.