[🔥Hot Takes] Deep learning outperforms logistic regression for causal inference and tabular data??

It's a deep debate.

Mar 02, 2023

The paper that I just read suggests that DL can outperform logistic regression for causal inference. Of course, I don’t get to that paper until the end of this article, because I’m like a grocery store that puts the milk a mile away from the cash register. While you’re trudging through the aisles of this email, please enjoy some tweets:

Hot Takes

First, a word from Jeremy Howard, multiple Kaggle Grandmaster and FastAI creator (June 2021):

Jeremy Howard @jeremyphoward

pytorch-widedeep, deep learning for tabular data IV: Deep Learning vs LightGBM by Javier Rodriguez Zaurin in @TDataScience

towardsdatascience.com: pytorch-widedeep, deep learning for tabular data IV: Deep Learning vs LightGBM. Here we go with yet another post in the series. This is the fourth of the series. The previous three posts, and the original version of this post are hosted in my own blog, just in case. I started…

This Elvis person seems conflicted, because at first Elvis declares the promise of DL (Oct 2021):

elvis @omarsar0

Deep learning seems unstoppable! I'm particularly impressed by the recent progress of deep learning on tabular data. This new survey paper provides an overview of the SOTA deep learning methods on tabular data. A great read for students and practitioners. arxiv.org/abs/2110.01889

Then this dropped in Apr 2022:

elvis @omarsar0

As an ML practitioner, you shouldn't be surprised by how far a simple model can take you. The current ML research provides evidence for this. Take a look at this recent paper where XGBoost outperforms several deep learning approaches on tabular data. arxiv.org/abs/2110.01889

How should I interpret this paper in light of the first tweet?

In Nov 2022, some fancy XGBoost influencers get in the mix:

Mark Tenenholtz @marktenenholtz

99.9% of recent research on deep learning for tabular data doesn’t work in practice. But! I seriously appreciate that it exists and the work is being done. Tabular data has otherwise been left behind.

with this reply from Bojan:

Bojan Tunguz @tunguz

@marktenenholtz If only 0.01% of that research were dedicated to improving GBTs for tabular data, we'd have even more advanced algos than what we have now.

I mean, I look to these people to just tell me what to think. Why can’t they agree?

Finally, a paper

Here’s a paper, “Evaluating Uses of Deep Learning Methods for Causal Inference”, which concludes that DL can outperform logistic regression (LR) in simulations:

Logistic regression (LR) is a popular method that is used for estimating causal effects in observational studies using propensity scores. We examine the use of deep learning models such as the deep neural network (DNN), PropensityNet (PN), convolutional neural network (CNN), and convolutional neural network-long short-term memory network (CNN-LSTM) to estimate propensity scores and evaluate causal inference. We conducted studies using simulated data with different sample sizes (N = 500, N = 1000, N = 2000), 15 covariates, a continuous outcome and a binary exposure. These data were used in seven scenarios that were different in the degree of nonlinearity and nonadditivity associations between the exposure and covariates. Estimation of propensity scores was considered a classification task and performance metrics that included classification accuracy, receiver operating characteristic curve area under the curve (AUCROC), covariate balance, standard error, absolute bias, and the 95% confidence interval coverage were evaluated for each model. Our simulation results show that deep learning models (CNN, DNN, and CNN-LSTM) outperformed LR in the estimation of the propensity score. CNN and CNN-LSTM achieved good results for covariate balance, classification accuracy, AUCROC, and Cohen’s Kappa. Although LR provided substantially better bias reduction, it produced subpar performance based on classification accuracy, AUCROC, Cohen’s Kappa, and 95% confidence interval coverage compared to the deep learning models. The results suggest that deep learning methods, especially CNN, may be useful for estimating propensity scores that are used to estimate causal effects.
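If it's been a while since you touched propensity scores, here's a rough sketch of what the LR baseline in that abstract is doing. To be clear about what's mine and not the paper's: the data (one confounder instead of 15 covariates), the effect size of 2.0, the hand-rolled gradient descent, and the inverse-probability-weighting step are all assumptions I picked to keep it short. The abstract doesn't say exactly how the scores were turned into effect estimates.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.5, steps=200):
    """Batch gradient descent on the logistic loss. w[0] is the intercept."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for x, t in zip(rows, labels):
            err = predict(w, x) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def simulate(n=2000, true_effect=2.0, seed=0):
    """Hypothetical data: confounder x drives both exposure t and outcome y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        t = 1 if rng.random() < sigmoid(x) else 0
        y = true_effect * t + 3.0 * x + rng.gauss(0, 1)
        data.append((x, t, y))
    return data


def estimate_effects(data):
    """Return (naive difference in means, normalized IPW estimate, propensity scores)."""
    rows = [[x] for x, _, _ in data]
    labels = [t for _, t, _ in data]
    w = fit_logistic(rows, labels)
    ps = [predict(w, x) for x in rows]

    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    naive = sum(treated) / len(treated) - sum(control) / len(control)

    # Weight treated units by 1/p and control units by 1/(1-p), then normalize.
    num1 = sum(y / p for (_, t, y), p in zip(data, ps) if t == 1)
    den1 = sum(1 / p for (_, t, _), p in zip(data, ps) if t == 1)
    num0 = sum(y / (1 - p) for (_, t, y), p in zip(data, ps) if t == 0)
    den0 = sum(1 / (1 - p) for (_, t, _), p in zip(data, ps) if t == 0)
    ipw = num1 / den1 - num0 / den0
    return naive, ipw, ps


naive, ipw, ps = estimate_effects(simulate())
print(f"true effect 2.0 | naive {naive:.2f} | IPW {ipw:.2f}")
```

The naive difference in means soaks up the confounder, while the weighted estimate should land much closer to the true effect. That gap is the whole reason anyone cares how good the propensity model is.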

Note: I’m not sure why they didn’t try XGBoost. That’s like assessing your stock portfolio’s returns without referencing the S&P. Sure, you got a 15% return, but the S&P returned 20%…sooooo…

Also, I have a gripe when people don’t do good feature engineering. It’s not hard to do good feature engineering. And of course XGBoost/DNN may outperform a basic LR when the authors don’t give the LR the love it requires.
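To make the feature engineering gripe concrete, here's a toy sketch (not the paper's setup). I'm assuming a made-up exposure that depends on x squared, fitting LR with my own bare-bones gradient descent, and scoring in-sample AUC with a brute-force pairwise count. The only difference between the two runs is whether the LR gets handed an x*x column.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_score(w, x):
    # w[0] is the intercept; the rest line up with the features in x.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(rows, labels, lr=0.3, steps=300):
    """Batch gradient descent on the logistic loss."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for x, t in zip(rows, labels):
            err = sigmoid(linear_score(w, x)) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def compare(n=1000, seed=0):
    """Hypothetical nonlinear exposure: P(t=1) depends on x**2, not on x."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ts = [1 if rng.random() < sigmoid(1.5 * x * x - 1.5) else 0 for x in xs]
    feature_sets = {
        "plain": [[x] for x in xs],               # LR sees only x
        "engineered": [[x, x * x] for x in xs],   # LR also sees x squared
    }
    results = {}
    for name, rows in feature_sets.items():
        w = fit_logistic(rows, ts)
        results[name] = auc([linear_score(w, r) for r in rows], ts)
    return results


results = compare()
print({name: round(value, 3) for name, value in results.items()})
```

The plain LR should hover near coin-flip AUC because the signal is symmetric in x, and the engineered one should do clearly better with the same model class. Same algorithm, one extra column.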

Lastly, this is a simulation. Why don’t they just use real data?? I haven’t simulated data since college, because nobody should care about theoretical approximations when the real data is right in front of you. Either it outperforms or it doesn’t. No theory needed.

Till the next paper/tweet…

Data Science Daily
