causal inference
Last modified on May 09, 2022
observation
Links to “observation”
machine learning
Supervised learning is an example of observational inference – we’re just looking for associations between variables \(X\) and \(Y\). In other words, we’re just learning \(P(Y \mid X)\).
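A minimal sketch of this view on a toy dataset (everything here is illustrative, not from the original note): fitting a classifier on observational samples estimates \(P(Y \mid X)\) under whatever distribution generated the data, and nothing more.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Purely observational data: we only ever see (X, Y) pairs as they occur.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
Y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

# A supervised model is just an estimator of P(Y|X) under that
# distribution; it says nothing about what happens if we *set* X.
clf = LogisticRegression().fit(X, Y)
print(clf.predict_proba(X[:5]))  # estimated [P(Y=0|X), P(Y=1|X)] per row
```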
I feel like this thread captures a really interesting divide / contrast of philosophies in machine learning research:
Researchers in speech recognition, computer vision, and natural language processing in the 2000s were obsessed with accurate representations of uncertainty. 1/N
— Yann LeCun (@ylecun) May 14, 2022
My goal now is to deeply understand the issues at hand in this thread. I found his mention of factor graphs in the shift toward reasoning-and-planning AI thought-provoking. I feel that causality, factor graphs, Bayesian methods, and all that are very important – I just don’t know quite enough to put the pieces together yet.
intervention
Links to “intervention”
ablation studies
Ablation studies are effectively using interventions (removing parts of your system) to reveal the underlying causal structure of your system. Francois Chollet (creator of Keras) writes about this being useful in a machine learning context:
Ablation studies are crucial for deep learning research -- can't stress this enough.
— François Chollet (@fchollet) June 29, 2018
Understanding causality in your system is the most straightforward way to generate reliable knowledge (the goal of any research). And ablation is a very low-effort way to look into causality.
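A sketch of the pattern on a toy problem (the feature groups below are hypothetical stand-ins for system components): remove one piece at a time, retrain, and read the performance drop as evidence of that piece’s causal contribution.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy "system": a classifier whose components are feature groups.
X, y = make_classification(n_samples=500, n_features=12, n_informative=4,
                           random_state=0)
groups = {"features 0-3": slice(0, 4),
          "features 4-7": slice(4, 8),
          "features 8-11": slice(8, 12)}

baseline = cross_val_score(RandomForestClassifier(random_state=0), X, y).mean()
print(f"full system: {baseline:.3f}")

for name, cols in groups.items():
    # The intervention: delete one component, retrain, re-evaluate.
    X_ablated = np.delete(X, np.r_[cols], axis=1)
    score = cross_val_score(RandomForestClassifier(random_state=0),
                            X_ablated, y).mean()
    print(f"without {name}: {score:.3f}  (drop: {baseline - score:+.3f})")
```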
Nancy Kanwisher (A roadmap for research > Causal role?)
fMRI is nice… but what is the causal role of these regions? We don’t just want correlations between brain activations and activities.
We need experiments where we “poke” part of the system – intervention (Schalk et al.).
If the patient is looking at a face, the face changes. If he’s looking at something else, it adds a face to that object.
“Poking the face area” results in weird, weird face stuff happening to the brain patient.
Stimulating color regions made him see a rainbow (wtfff).
Towards Causal Representation Learning (Independent mechanisms)
Hypothesis: We can explain the world as a composition of informationally independent pieces/modules/mechanisms. (Note: not statistically independent, but independent in the sense that any causal intervention affects just one such mechanism.)
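A toy illustration of the hypothesis (the altitude/temperature example is a standard one from the causality literature, not from this paper’s text): each mechanism is a separate function, and intervening on one leaves the other’s “code” untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two informationally independent mechanisms:
#   P(altitude)              -- where weather stations happen to be
#   P(temperature|altitude)  -- physics (the lapse rate), independent of the above
def sample_altitude(n):
    return rng.uniform(0, 3000, n)

def temperature_given(altitude):
    return 20 - 0.0065 * altitude + rng.normal(0, 1, len(altitude))

a = sample_altitude(1000)
t = temperature_given(a)

# Intervening on the altitude mechanism (e.g., only high-altitude stations)
# shifts the marginal of temperature, but the temperature mechanism itself
# is reused completely unchanged.
a_int = rng.uniform(2000, 3000, 1000)
t_int = temperature_given(a_int)
print(t.mean(), t_int.mean())
```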
Towards Causal Representation Learning (Causal induction from interventional data)
How do we handle an unknown intervention? Infer it.
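A minimal sketch of that idea (a toy linear SCM of my own construction, not the paper’s method): given data where *something* was intervened on but we don’t know what, check which observational mechanism no longer fits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Observational SCM: X ~ N(0,1), Y = 2X + N(0,1).
X_obs = rng.normal(0, 1, 2000)
Y_obs = 2 * X_obs + rng.normal(0, 1, 2000)

# Interventional data with an unknown target (ground truth: do(Y ~= 5)).
X_int = rng.normal(0, 1, 2000)
Y_int = 5.0 + rng.normal(0, 0.1, 2000)

# Inference: test each mechanism against its observational form.
# X's marginal still matches; Y's residual no longer looks like N(0,1),
# so we infer the intervention replaced Y's mechanism.
print("X mean shift:", abs(X_int.mean() - X_obs.mean()))   # ~0: X untouched
resid = Y_int - 2 * X_int
print("Y residual mean/std:", resid.mean(), resid.std())   # far from 0/1: do(Y)
```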
counterfactual
Links to “counterfactual”
Counterfactual Generative Networks
Neural networks like to “cheat” by exploiting simple correlations that fail to generalize. E.g., image classifiers can learn spurious correlations with background texture rather than the actual object’s shape; a classifier might learn that “green grass background” => “cow classification.”
This work decomposes the image generation process into three independent causal mechanisms – shape, texture, and background. Thus, one can generate “counterfactual images” to improve OOD robustness, e.g. by placing a cow on a swimming pool background. Related: generative models counterfactuals
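A schematic of the compositional idea in toy numpy (placeholder mechanisms, not the actual CGN generators): because shape, texture, and background are separate functions, a counterfactual image is just a recombination of their outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three independent mechanisms, composited into an image.
def shape_mechanism(h=64, w=64):
    """Binary object mask (a centered disc standing in for 'cow')."""
    yy, xx = np.mgrid[:h, :w]
    return ((yy - h/2)**2 + (xx - w/2)**2 < (h/4)**2).astype(float)[..., None]

def texture_mechanism(h=64, w=64):
    """Object texture (noise standing in for a learned texture)."""
    return rng.uniform(0.4, 0.6, (h, w, 3))

def background_mechanism(color):
    """Flat background; swapping the color generates counterfactuals."""
    return np.broadcast_to(np.array(color, float), (64, 64, 3))

mask = shape_mechanism()
tex = texture_mechanism()
grass = background_mechanism([0.1, 0.8, 0.1])
pool = background_mechanism([0.1, 0.4, 0.9])

factual = mask * tex + (1 - mask) * grass        # cow on grass
counterfactual = mask * tex + (1 - mask) * pool  # same cow, pool background
print(factual.shape, counterfactual.shape)
```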