In Chapter 5 of A Manual for Cultural Analysis we discuss how culture can result in emergent treelike patterns at the level of comparisons between groups. These nested patterns arise because many of our most enduring socially learned behaviors change very slowly (that’s a tautology, of course: things that change quickly can’t be enduring). Anyway, when things change slowly while groups form and dissolve, the result tends to be fairly nested, treelike branching patterns for a lot of culture at a very high group-y level.
This can result, however, in spurious findings for correlations between traits that in fact have no functional relationship but instead are both inherited through the same cultural pathways. In the cross-cultural literature this is known as Galton’s problem. See the manual if you want a discussion of this and why it is important.
For an even fuller discussion of Galton’s problem, you can check out my new book chapter. Mine is chapter 8 entitled “Dealing with Culture as Inherited Information.” Hopefully people’s university libraries are picking up copies of this, because as a whole it really is an excellent book. Shoot me an email if you are having trouble finding it.
The data supplement for that book is my code, and it is at the bottom of the book’s Wiley page; just scroll all the way down. You can download the supplement for free without buying the book. Go ahead and download it and you have a full R implementation of a variety of methods proposed to solve Galton’s problem. In the code I first show how to fit a simple linear regression, which makes no correction at all for Galton’s problem. Then I show how to add principal components that attempt to fix Galton’s problem, then network autoregression, then mixed hierarchical models that use random effects for clumps in the network, and finally phylogenetic regression.
The book chapter shows simulation results demonstrating that several of these methods work when diffusion is the cultural process, but only phylogenetic regression corrects for inheritance as a process. Note: in this case phylogenetic regression also works for diffusion because the network is highly treelike (i.e. nested). This is crucial! Only phylogenetic regression works irrespective of whether the main cultural process is diffusion or inheritance. Since we almost never know the main cultural process a priori, I recommend that for treelike networks we use phylogenetic regression and simply not interpret a significant role for the phylogeny as necessarily indicative of inheritance. It could indicate diffusion on a treelike network.
FYI, I have a new preferred implementation for the phylogenetic model compared to when I made the supplement for that book. My new preferred method is the phylolm function in the package of the same name. It is much easier to control whether phylogenetic parameters like lambda are bounded or unbounded in phylolm than when fitting the same model with gls. The gls way is what my data supplement code shows. To fit phylolm, you still use the ape package to load in the phylogeny. Then give the phylogeny object itself straight to phylolm as an argument (see the phylolm help file).
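As a rough illustration of that workflow, here is a minimal sketch in R. It is not the code from the book supplement. The tree and both traits are simulated only so the example runs on its own, and the names tree, dat, x, and y are placeholders. With real data you would load your phylogeny with an ape reader such as read.tree instead of simulating one. The bounds of 0 and 1 on lambda are just one possible choice, shown to illustrate where the bounds are set.

```r
library(ape)      # tree handling and trait simulation
library(phylolm)  # phylogenetic regression

set.seed(42)

# Simulated stand-in for a real phylogeny (assumption: 40 groups).
tree <- rcoal(40)

# Simulated continuous traits evolving along the tree.
# rTraitCont returns a vector named by tip label; we index by tip label
# so the order matches the tree explicitly.
x <- rTraitCont(tree)[tree$tip.label]
y <- 0.5 * x + rTraitCont(tree)[tree$tip.label]

# The row names of the data must match the tip labels of the tree.
dat <- data.frame(y = y, x = x, row.names = tree$tip.label)

# Pass the phylogeny object straight to phylolm via the phy argument.
# lower.bound and upper.bound constrain the phylogenetic parameter (lambda here).
fit <- phylolm(y ~ x, data = dat, phy = tree, model = "lambda",
               lower.bound = 0, upper.bound = 1)

summary(fit)
fit$optpar  # estimated lambda
```

Changing or dropping lower.bound and upper.bound is how you would loosen or tighten the constraint on lambda; the phylolm help file describes the defaults for each model.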
One quirk of phylolm is that it does not print BIC in the summary. I’ve advocated for BIC as a way to pick models, so here is how to get it: use the AIC function. Suppose my.tree is a fitted phylolm model. You type AIC(my.tree,k=log(N)) where N is your sample size. This converts the AIC into the BIC. The principal difference between the two is that AIC uses a penalty of 2 per estimated parameter all the time, while BIC uses a penalty of log(N) per parameter. You can learn about this yourself from the AIC function help page.
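Here is a small, self-contained sketch of that BIC trick in R. Again the tree and traits are simulated placeholders (tree, dat, x, and y are made-up names, not from the supplement), and the Brownian motion model is used only to keep the fit simple. The sample size N is taken to be the number of groups, which is the number of tips in the tree.

```r
library(ape)
library(phylolm)

set.seed(7)

# Simulated placeholder data: a 40-tip tree and two traits on it.
tree <- rcoal(40)
x <- rTraitCont(tree)[tree$tip.label]
y <- 0.5 * x + rTraitCont(tree)[tree$tip.label]
dat <- data.frame(y = y, x = x, row.names = tree$tip.label)

my.tree <- phylolm(y ~ x, data = dat, phy = tree, model = "BM")

N <- nrow(dat)  # sample size = number of tips

aic <- AIC(my.tree)              # default penalty: k = 2 per parameter
bic <- AIC(my.tree, k = log(N))  # penalty of log(N) per parameter gives BIC

c(AIC = aic, BIC = bic)
```

Because log(N) is larger than 2 once N is 8 or more, BIC penalizes each extra parameter more heavily than AIC for any realistically sized sample, which is why it tends to favor simpler models.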
OK, so between this blog post and my prior one I have provided implementations for 1) determining which network is important for your cultural trait and 2) correcting for Galton’s problem on your network if it is highly treelike. That still leaves a hole in the analytic pipeline if your network is not treelike. What to do then? Don’t worry, I’m on it! I have a set of NIH-funded projects about physician networks, which are highly non-treelike. I have a paper in preparation right now showing that the phylogenetic method predictably fails to correct Galton’s problem on a messy non-tree network. In fact, all the previous methods fail! So, I’m inventing a couple of new methods and hopefully will have that paper submitted soon.