Phenotypic and genetic correlations--the two serpents of the caduceus
A preprint that I read recently had an example demonstrating that even if you statistically force phenotypes to be orthogonal, the genetic correlations between them persist. In fact, I’ve seen this in my own work: I constructed orthogonal principal components from school grades in various subjects and found significant genetic correlations across them. This reminds me of a question that I, and perhaps many others, have been pondering for a while: do the genetic correlations between phenotypes represent true biological correlations, or simply the phenotypic correlations that were unaccounted for in the GWASs from which the summary statistics originate, or perhaps a mixture of both? The last seems most reasonable. Although the preprint demonstrates that genetic correlations persist beyond phenotypic correlations, it cannot be taken as proof that the genetic correlations being captured represent true genetic overlap, meaning that the underlying biological mechanisms are shared across the phenotypes. This is because genetic correlations capture phenotypic correlations beyond what is measurable in the sample at hand. This is both good and bad.
Good because it enables researchers to deduce phenotypic correlations between phenotypes that cannot easily be measured in the same sample due to time or ethical constraints. Bad because it doesn’t enable researchers to differentiate simple phenotypic correlations from true genetic correlations. A nice example of this is the genetic correlation that reflects the comorbidity between two disorders. A person can be comorbid for two disorders either because the person truly suffers from both or because the person was misdiagnosed with one of them.
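To make the opening point about orthogonal components concrete, here is a minimal simulation sketch in Python. It is not the preprint's analysis or my own pipeline. Every number in it (the sample size and the genetic and environmental covariance matrices `sigma_g` and `sigma_e`) is made up for illustration. It also cheats by using the true genetic values directly, which real data never give us. The idea is simply that principal components are built to be uncorrelated at the phenotypic level, and nothing forces their genetic parts to be uncorrelated as well.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000  # hypothetical number of individuals

# Assumed (made-up) genetic and environmental covariance of two traits.
# They deliberately have different structures.
sigma_g = np.array([[0.5, 0.4], [0.4, 0.5]])
sigma_e = np.array([[0.8, -0.1], [-0.1, 0.2]])

g = rng.multivariate_normal(np.zeros(2), sigma_g, size=n)  # genetic values
e = rng.multivariate_normal(np.zeros(2), sigma_e, size=n)  # environmental values
p = g + e                                                  # observed phenotypes

# Principal components of the phenotypes (eigenvectors of their covariance).
p_centered = p - p.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(p_centered, rowvar=False))

pc_scores = p_centered @ eigvecs            # phenotypic PC scores
genetic_pc = (g - g.mean(axis=0)) @ eigvecs  # genetic part of each PC score

r_pheno = np.corrcoef(pc_scores, rowvar=False)[0, 1]
r_genetic = np.corrcoef(genetic_pc, rowvar=False)[0, 1]

print(f"phenotypic correlation between PCs: {r_pheno:.3f}")
print(f"genetic correlation between PCs:    {r_genetic:.3f}")
```

In this toy setup the phenotypic correlation between the two components is zero by construction, while the correlation between their genetic parts is clearly nonzero (its sign depends on the arbitrary orientation of the eigenvectors). The genetic correlation would vanish only if the genetic and environmental covariance matrices happened to share the same principal axes.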
Thankfully, Mendelian Randomisation (MR) has circumvented some of the cons of genetic correlation analysis (but remember, MR has its own flaws). MR works really well when the biological mechanisms underlying the phenotypes are clear. A very good example is the association between HDL cholesterol and myocardial infarction, which MR showed to be non-causal. However, when it comes to behavioural phenotypes, MR is, I think, as blindfolded as genetic correlation analysis.
Sometimes phenotypes have discordant genetic and phenotypic correlations. For example, there is a strong negative phenotypic correlation between schizophrenia and educational attainment, but the genetic correlation between the two is close to zero. This hints that the relationship between schizophrenia risk variants and cognition is complex. In fact, in my PhD project, I found that schizophrenia risk variants correlate positively with verbal skills but negatively with numerical skills. Hence, it is likely that the near-zero genetic correlation between schizophrenia and educational attainment arises because some schizophrenia risk variants correlate positively, and others negatively, with educational attainment, so that the effects cancel out. When more sophisticated statistical methods arrive in the future, complex correlations like the one between schizophrenia and educational attainment may be disentangled.
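The cancellation argument can be sketched with a toy simulation in Python. This is only an illustration under strong assumptions, not a reanalysis of any real GWAS. The variants are treated as independent, the effect sizes are invented, and the plain correlation of effect sizes across variants is used as a crude stand-in for a proper genetic correlation estimate. Half of the hypothetical variants are given concordant effects on the two traits and the other half discordant effects.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 10_000  # hypothetical number of independent risk variants

# Made-up per-variant effects on schizophrenia risk.
beta_scz = rng.normal(size=m)

# Assumption: the first half of the variants affect educational attainment
# in the same direction as schizophrenia risk, the second half in the
# opposite direction, plus trait-specific noise.
direction = np.where(np.arange(m) < m // 2, 1.0, -1.0)
beta_ea = direction * 0.8 * beta_scz + rng.normal(scale=0.6, size=m)

concordant = direction > 0
r_all = np.corrcoef(beta_scz, beta_ea)[0, 1]
r_concordant = np.corrcoef(beta_scz[concordant], beta_ea[concordant])[0, 1]
r_discordant = np.corrcoef(beta_scz[~concordant], beta_ea[~concordant])[0, 1]

print(f"all variants:        {r_all:.2f}")
print(f"concordant variants: {r_concordant:.2f}")
print(f"discordant variants: {r_discordant:.2f}")
```

Under these assumptions the genome-wide correlation of effects is close to zero even though each subset of variants shows a strong correlation, one positive and one negative. A single genome-wide number would therefore hide two real but opposing relationships.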











