Blog Post 11

Below is the same logistic regression model from the last blog post.

log(p/(1-p)) = -3.578 + 1.118*revol_util - 0.00834*dti

Here p is the probability that the person has a public record of bankruptcy.
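To make the equation concrete, here is a minimal Python sketch (not the R code behind this post) that turns the log-odds into a probability. The coefficients are the ones from the model above; the example borrower and the assumption that revol_util is on a 0 to 1 scale are mine, purely for illustration.

```python
import math

# Coefficients copied from the fitted model in this post.
INTERCEPT = -3.578
B_REVOL_UTIL = 1.118
B_DTI = -0.00834


def log_odds(revol_util, dti):
    """Linear predictor of the model: log(p / (1 - p))."""
    return INTERCEPT + B_REVOL_UTIL * revol_util + B_DTI * dti


def probability(revol_util, dti):
    """Invert the logit to recover p from the log-odds."""
    return 1.0 / (1.0 + math.exp(-log_odds(revol_util, dti)))


# Hypothetical borrower: made-up values on assumed scales.
p = probability(revol_util=0.5, dti=15.0)
print(f"estimated probability of a bankruptcy record: {p:.4f}")
```

Because the coefficient on revol_util is positive, raising it while holding dti fixed raises the estimated probability; the negative dti coefficient works the other way.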

Multicollinearity Test:

First, we have to check whether our model suffers from multicollinearity. As the screenshot from R below shows, the model barely does: the relationship is just over the threshold of 0.25. This means there is a slight correlation between the two predictors, the revolving credit utilization ratio (revol_util) and the ratio of monthly debt to monthly income (dti), but I am not worried that it will affect my model.
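For readers who want to try a check like this outside of R, here is a rough Python sketch. It computes the Pearson correlation between two predictors and the variance inflation factor (VIF) that follows from it. I am not claiming this is the exact statistic in my screenshot, and the numbers are made up for illustration rather than taken from the loan data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """VIF when the model has exactly two predictors.

    Regressing one predictor on the other gives R^2 = r^2,
    so VIF = 1 / (1 - r^2).
    """
    return 1.0 / (1.0 - r ** 2)


# Made-up illustrative values, not rows from the loan dataset.
revol_util = [0.10, 0.35, 0.50, 0.62, 0.80, 0.95]
dti = [12.0, 5.0, 20.0, 9.0, 25.0, 14.0]

r = pearson_r(revol_util, dti)
vif = vif_two_predictors(r)
print(f"correlation = {r:.3f}, VIF = {vif:.3f}")
```

A VIF of 1 means the predictors are uncorrelated, and it grows as the correlation moves toward 1 or -1.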

Independent Errors:

I am confident that the data points in my dataset are independent, because every loan is a separate observation and there are probably not many repeat customers within this dataset.

Complete Information:

As the histograms below show, there is a wide range of values for DTI and Revol_Util, but much less variety in public record bankruptcies. I did think this could be a problem with my dataset, because bankruptcy is a fairly uncommon event. Even so, I believe my dataset has enough variation in bankruptcies to complete the tasks for this assignment.
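One way to back up the histograms with numbers is to bin the two predictors and count how many observations land in each combination of bins and outcome. The Python sketch below does that on a handful of made-up rows; the two-bin split and the helper names are my own choices for illustration, not something from the assignment.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Map a value to an equal-width bin on [low, high], clamping the top edge."""
    width = (high - low) / n_bins
    return min(int((value - low) / width), n_bins - 1)


def cell_counts(rows, n_bins=2):
    """Count rows per (revol_util bin, dti bin, outcome) cell."""
    utils = [row[0] for row in rows]
    dtis = [row[1] for row in rows]
    counts = Counter()
    for revol_util, dti, bankrupt in rows:
        key = (
            bin_index(revol_util, min(utils), max(utils), n_bins),
            bin_index(dti, min(dtis), max(dtis), n_bins),
            bankrupt,
        )
        counts[key] += 1
    return counts


def empty_cells(counts, n_bins=2):
    """List every cell that has no observations at all."""
    return [
        (i, j, y)
        for i in range(n_bins)
        for j in range(n_bins)
        for y in (0, 1)
        if counts[(i, j, y)] == 0
    ]


# Made-up rows: (revol_util, dti, bankruptcy record 0/1).
rows = [
    (0.10, 5.0, 0),
    (0.20, 22.0, 0),
    (0.85, 8.0, 0),
    (0.90, 24.0, 0),
    (0.15, 6.0, 1),
    (0.80, 21.0, 1),
]

counts = cell_counts(rows)
print("cells with no data:", empty_cells(counts))
```

Empty or nearly empty cells for the bankruptcy outcome would be the warning sign here, since the rare outcome is where the data is thinnest.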

Complete Separation:

As the plots below show, there is nothing close to complete separation in our data. This is good for my model, because it means the bankruptcy and non-bankruptcy observations overlap across the range of the predictors instead of being split perfectly by them.
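The same idea can be checked with a few lines of code. This Python sketch tests whether a single predictor splits the two outcome classes perfectly, which is the simplest form of complete separation. It does not catch separation by a combination of predictors, and the values are invented for illustration.

```python
def completely_separates(values, outcomes):
    """Return True if one cutoff on this predictor splits the 0s from the 1s."""
    zeros = [v for v, y in zip(values, outcomes) if y == 0]
    ones = [v for v, y in zip(values, outcomes) if y == 1]
    if not zeros or not ones:
        raise ValueError("need at least one observation in each class")
    # Separation means the two classes occupy non-overlapping ranges.
    return max(zeros) < min(ones) or max(ones) < min(zeros)


# Made-up values: the classes overlap, as in the plots for this post.
overlapping = completely_separates(
    [0.2, 0.5, 0.9, 0.4, 0.7], [0, 0, 0, 1, 1]
)

# Made-up values: every 1 sits above every 0, so the split is perfect.
separated = completely_separates(
    [0.1, 0.2, 0.3, 0.8, 0.9], [0, 0, 0, 1, 1]
)

print("overlapping data separated?", overlapping)
print("split data separated?", separated)
```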

Large Sample Size:

Finally, we check for a large sample size. As the dimensions of my dataset show, this is not a concern at all: we have close to 40,000 observations, which is plenty of data to work with and draw relationships from.

In conclusion, I believe that my dataset does not fail any of these assumptions, because each potential issue has a logical explanation and none of them raises serious concerns about my data.
