9 problems with Real World Regression

This list comes from the Coursera Data Analysis Course.

Linear and Logistic Regression are some of the most common techniques applied in data analysis. Here is a list of possible problems with regression in the real world.

  1. Confounders – a variable that is correlated with both the outcome and other variables in the model
  2. Complicated Interactions – the effect of one covariate may depend on the value of another
  3. Skewness – the data are not evenly distributed but lean heavily to one side or the other
  4. Outliers – data points that don’t fit the pattern
  5. Non-linear Patterns – not all datasets can be fit with a straight line
  6. Variance Changes – the spread of the outcome is not constant across the range of the data (heteroscedasticity)
  7. Units/Scale issues – make sure the units are standard across the model
  8. Overloading Regression – too much complexity
  9. Correlation does not imply Causation
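
To make the first item concrete, here is a minimal sketch in Python of how a confounder can mislead a regression. It is an illustration only, not taken from the course: the simulated data, the coefficients (1 and 2), the sample size, and the helper names `naive_slope` and `adjusted_slope` are all assumptions chosen for the example. A hidden variable `z` drives both `x` and `y`, while `x` has no direct effect on `y` at all.

```python
import random


def centered(values):
    """Subtract the mean so the regressions below can ignore the intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def naive_slope(x, y):
    """Least-squares slope of y on x alone."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)


def adjusted_slope(x, z, y):
    """Least-squares coefficient on x when y is regressed on both x and z.

    Solves the 2x2 normal equations by hand to avoid extra dependencies.
    """
    xc, zc, yc = centered(x), centered(z), centered(y)
    sxx, szz, sxz = dot(xc, xc), dot(zc, zc), dot(xc, zc)
    sxy, szy = dot(xc, yc), dot(zc, yc)
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)


# Simulated data (assumed for illustration): z is the confounder.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]      # x depends on z
y = [2 * zi + rng.gauss(0, 1) for zi in z]  # y depends on z only, not on x

naive = naive_slope(x, y)
adjusted = adjusted_slope(x, z, y)
print(f"slope of y on x, ignoring z:  {naive:.2f}")
print(f"slope of y on x, adjusting z: {adjusted:.2f}")
```

Under these assumptions the first slope should come out well away from zero even though `x` does nothing to `y`, and the second should shrink toward zero once `z` is included in the model. In real data the confounder is often unmeasured, so this adjustment is not always available.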

What other problems do you find when using regression on real-world data? Are any missing from this list?






4 responses to “9 problems with Real World Regression”

  1. Scott (@ScottOrz)

    Small Sample – absence of sufficient data to fit a regression model.

    1. Ryan Swanstrom

      That is a common problem. Professor Jeff Leek did not add that to his list. I wonder whether that is because the problem is not specific to regression: all statistical and machine learning models suffer when not enough data is present. I agree with you, though; a small sample size can be a problem in any data analysis.

      Thanks for commenting,
