Individual Causal Structure Learning from Population Data

Abstract

Learning the causal structure of each individual plays a crucial role in neuroscience, biology, and so on. Existing methods consider data from each individual separately, which may yield inaccurate causal structure estimations in limited samples. To leverage more samples, we consider incorporating data from all individuals as population data. We observe that the variables of all individuals are influenced by the common environment variables they share. These shared environment variables can be modeled as latent variables and serve as a bridge connecting data from different individuals. In particular, we propose an Individual Linear Acyclic Model (ILAM) for each individual from population data, which models the individual’s variables as being linearly influenced by their parents, in addition to environment variables and noise terms. Theoretical analysis shows that the model is identifiable when all environment variables are non-Gaussian, or even if some are Gaussian with an adequate diversity in the variance of noises for each individual. We then develop an individual causal structures learning method based on the Share Independence Component Analysis technique. Experimental results on synthetic and real-world data demonstrate the correctness of the method even when the sample size of each individual’s data is small.

Type
Publication
In Proceedings of the International Joint Conference on Artificial Intelligence