Simulate data from a DAG X->Z->Y so that the correlation between X and Z is very large, then include both X and Z in a model predicting Y.

tanishkasingh9/Multi-collinearity


Multi-collinearity

This analysis compares three models, ordinary least squares (OLS), Ridge, and partial least squares (PLS), in predicting the dependent variable Y from the predictors X and Z. The $R^2$ scores, which measure the proportion of variance in Y explained by X and Z, are nearly identical across the three models: each explains approximately 93.4% of the variance in Y. The mean squared error (MSE) scores, which measure the average squared difference between observed and predicted values of Y, are also nearly identical, so the three models make prediction errors of comparable size.

Because the $R^2$ and MSE scores agree across the models, multicollinearity does not appear to have a meaningful impact on predictive performance in this scenario. Multicollinearity mainly inflates the variance of the individual coefficient estimates, which makes them unstable and hard to interpret; it does not by itself degrade predictions, because the fitted combination of the correlated predictors remains well determined. Therefore, although X and Z are highly correlated by construction, this does not substantially affect how accurately the models predict Y from X and Z.
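The distinction between unstable coefficients and stable predictions can be checked with a small repeated simulation. The sketch below is an assumption-laden illustration rather than part of the original analysis: it reuses a hypothetical data-generating process (Z = X + small noise, Y = 2Z + noise), refits OLS on many fresh samples using only NumPy, and compares the spread of the individual coefficients with the spread of their sum and of the in-sample MSE. It also reports the variance inflation factor, $1 / (1 - r^2)$ for two predictors with correlation $r$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_reps = 200, 500  # assumed sample size and number of repetitions

coefs = np.empty((n_reps, 2))  # OLS coefficients on X and Z per repetition
mses = np.empty(n_reps)        # in-sample MSE per repetition

for i in range(n_reps):
    # Same assumed DAG X -> Z -> Y as a fresh sample each time.
    x = rng.normal(size=n)
    z = x + rng.normal(scale=0.1, size=n)
    y = 2.0 * z + rng.normal(scale=0.5, size=n)

    # OLS with an intercept, solved by least squares.
    design = np.column_stack([np.ones(n), x, z])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    coefs[i] = beta[1:]
    mses[i] = np.mean((y - design @ beta) ** 2)

# Variance inflation factor for two predictors, from the last sample.
r = np.corrcoef(x, z)[0, 1]
vif = 1.0 / (1.0 - r**2)

print(f"VIF                     = {vif:.1f}")
print(f"std of coef on X, Z     = {coefs.std(axis=0)}")
print(f"std of coef_X + coef_Z  = {coefs.sum(axis=1).std():.3f}")
print(f"std of in-sample MSE    = {mses.std():.3f}")
```

With these settings the individual coefficients should vary far more from sample to sample than their sum does, while the MSE stays nearly constant: the data pin down the combined effect of X and Z well, but not how to split it between them.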
