Check out my new article, written together with RAND colleague Doug Ligor, about applying Rawlsian political philosophy to the regulation of outer space. We are rapidly approaching the point when we will cross over the actual veil of ignorance, which we argue will necessarily result in less fair rules overall. Of course, if the parties who might make rules from behind the veil now all continue to believe they will be the ones to win big in the near future, then they are unlikely to make rules from behind the veil at all. Everyone thinking they will be the winner, however, is analogous to how 65% of Americans believe they are smarter than average.
While on vacation I had some time to read. I recently discovered the life and writings of H.P. Lovecraft, and I read his stories The Call of Cthulhu and The Dunwich Horror.
A few things struck me about these stories. First, The Call of Cthulhu is among the best remixes of anthropological tropes into a fantastical story that I've ever read. In that regard I would put it on par with the work of Ursula K. Le Guin (who I expect might not appreciate the comparison, given Lovecraft's notorious racism, but anyway). Among Le Guin's stories, of which I've read about six, I really like The Lathe of Heaven. It's a well-told story with some ideas worth pondering.
Back to Lovecraft, one thing that struck me is that both of the stories I read seem interpretable on one level as slightly disguised critiques of Christianity. At least, my casual Google searching did not find this aspect of his writing mentioned in the literary commentary on his work. Both stories use a technique that blends ideas about extra-terrestrials with Christian theology and ideas about the demonic. Those three are sort of mixed together, but to the ultimate effect of inverting core Christian theological or scriptural references from something positive (according to Christians) into something demonic. As the mystery unravels in The Call of Cthulhu, it finally culminates in the scene where Cthulhu is witnessed but his appearance is incomprehensible (it drives men mad). His island has impossible angles. He exists outside of our time and space. Really, Cthulhu reads almost like a straightforwardly transposed vision of how many negative theologians, from Neoplatonists through Christians to Hindus and Daoists, talk about God. Negative theology, also called apophatic theology, is theology that reasons about the divine by understanding what it is not, rather than by affirming what it is. Last, Cthulhu will emerge from his tomb to liberate his cult followers, which would seem to parody ideas from Christianity.
The Dunwich Horror is in some ways a simpler story, and perhaps less elegant than The Call of Cthulhu. It reads a bit more like pulpy horror. It also contains a straightforward Christianity reference in its climax: the demonic extraterrestrial thing, in the process of being banished by the heroes of the story, calls out in suffering from a hilltop, "Father," in a manner very reminiscent of the crucifixion scene. Here, though, the thing calling out is an extraterrestrial demon-spawn bent on the destruction of all life as we know it in order to repopulate the earth with (to us) incomprehensibly alien life forms.
One thought I had reading this material: in my casual searching I didn't immediately find literary criticism that makes much mention of these overt Christianity connections in Lovecraft's work. Lovecraft was an atheist, and at least in the two works I read, as I've described, these references are used as parody, or one might argue blasphemy, of Christianity - which is clever, because it makes them stories about blasphemy that are themselves a different blasphemy. The religion connection, perhaps underappreciated, reminded me of the way the strong religious dimension of Stoker's Dracula was not always appreciated. Dracula can be analyzed from all kinds of angles, but there's a very strong case to be made that, at least for Stoker himself, it was essentially a veiled pro-Catholic manifesto (Stoker may have been a closeted Catholic). Another curious parallel between the two authors is that Stoker was also deeply racist - The Lair of the White Worm is one of the most racist novels I've ever read.
Thus, at one level these stories by Lovecraft struck me as rather straightforward, even if somewhat veiled, anti-religious manifestos, in much the way Stoker's Dracula is a pro-Catholic manifesto. The simple moral of Lovecraft's stories (at least the ones I read) is that to engage in any thought that seeks to transcend ordinary experience is to risk madness. The prominent role of madness in Lovecraft's stories may well be related to both of his parents having been committed to a mental institution.
A last thought I had about this material is that Lovecraft displays a tendency I have noticed in other people primarily trained in physics (rather than in biology or social science): to take seriously the idea that extra-terrestrial lifeforms would transcend all human social and moral norms. Lovecraft stated this expressly:
Now all my tales are based on the fundamental premise that common human laws and interests and emotions have no validity or significance in the vast cosmos-at-large. To me there is nothing but puerility in a tale in which the human form--and the local human passions and conditions and standards--are depicted as native to other worlds or other universes. To achieve the essence of real externality, whether of time or space or dimension, one must forget that such things as organic life, good and evil, love and hate, and all such local attributes of a negligible and temporary race called mankind, have any existence at all. Only the human scenes and characters must have human qualities.
This quotation is from Lovecraft's 1927 resubmission of The Call of Cthulhu to Weird Tales (as reported by S.T. Joshi).
Lovecraft appears to have been more well read in physics and astronomy than in the evolutionary sciences. What I believe is the error in his logic is that the evolutionary sciences all point to cooperation dynamics as fundamental to any kind of social life. To have cooperation you have to at least sometimes solve collective action problems (problems in which there are incentives to defect), and both the extensive mathematical theory on this subject and the extensive studies of cooperation across the diversity of species on our planet tell us that only a narrow range of solutions to these problems actually work. Basically, you can count the fundamental collective action solutions that work on one hand. Solutions that don't work are legion. Perhaps this is why stories about "human laws and interests and emotions" so frequently center on collective action: because it is hard to do, or rather, because there are so many ways to try but fail. On this topic, in an interview I read probably 15 years ago, Steven Pinker said it would be accurate to say that moral rules 'exist' in the universe in a sense. That is, the universe is so structured that the only intelligent things that exist are things that evolved to be intelligent, and they all therefore must face fundamentally similar collective action problems, and there are only a few solutions to these problems that work. In a talk I saw by space ethicist Kelly Smith (Clemson University), he stated that if he could send one book to extra-terrestrials he would send Kant's work on the categorical imperative, because he thought it encapsulated the moral rules that any social species inevitably would recognize - because they're the only moral rules that actually work.
The vast diversity of the cosmos, as best we currently know, does not mean encounters with extra-terrestrials actually would be transcendent. Perhaps it's disappointing to some, but all our evolutionary evidence suggests that social transcendence, even the evil kind envisioned by Lovecraft, will be found nowhere in our universe. Everything intelligent in our universe has to evolve its intelligence, which means it has to follow the same rules for how replicating intelligent things resolve their collective action problems.
In Chapter 5 of A Manual for Cultural Analysis we discuss how culture can result in emergent treelike patterns at the level of comparisons between groups. These nested patterns arise because many of our most enduring socially learned behaviors change very slowly (that’s a tautology, of course: things that change quickly can’t be enduring). Anyway, when things change slowly, and as groups form and dissolve, the result tends to be fairly nested, treelike branching patterns for a lot of culture at a very high, group-y level.
This can result, however, in spurious findings for correlations between traits that in fact have no functional relationship but instead are both inherited through the same cultural pathways. In the cross-cultural literature this is known as Galton’s problem. See the manual if you want a discussion of this and why it is important.
For an even fuller discussion of Galton’s problem, you can check out my new book chapter. Mine is chapter 8 entitled “Dealing with Culture as Inherited Information.” Hopefully people’s university libraries are picking up copies of this, because as a whole it really is an excellent book. Shoot me an email if you are having trouble finding it.
The data supplement for that book is my code, and it is at the bottom of the book’s Wiley page - just scroll all the way down. You can download the supplement for free without buying the book. Go ahead and download it, and you will have a full R implementation of a variety of methods proposed to solve Galton’s problem. In the code I first show how to fit a simple linear regression, which makes no correction at all for Galton’s problem. Then I show how to add principal components that attempt to fix Galton’s problem, then network autoregression, then mixed hierarchical models that use random effects for clumps in the network, and finally phylogenetic regression.
The book chapter shows simulation results demonstrating that several of these methods work when diffusion is the cultural process, but only phylogenetic regression corrects for inheritance as a process. Note: in this case phylogenetic regression also works for diffusion because the network is highly treelike (i.e., nested). This is crucial! Only phylogenetic regression works irrespective of whether the main cultural process is diffusion or inheritance. Since we almost never know the main cultural process a priori, I recommend that for treelike networks we use phylogenetic regression and simply not interpret a significant role for the phylogeny as necessarily indicative of inheritance. It could indicate diffusion on a treelike network.
FYI, I now have a preferred implementation for the phylogenetic model that differs from the one in the supplement I made for that book. My new preferred method is the phylolm function in the R package of the same name. It is much easier to control whether phylogenetic parameters like lambda are bounded or unbounded in phylolm than when fitting the same model with gls (the gls approach is what my data supplement code shows). To fit with phylolm, you still use the ape package to load the phylogeny. Then give the phylogeny object itself straight to phylolm as a parameter (see the phylolm help file).
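Here is a minimal sketch of that workflow (the file names, variable names, and the choice of the lambda model are my placeholders for illustration, not from the supplement):

```r
library(ape)      # read.tree loads the phylogeny
library(phylolm)  # phylogenetic linear models

# placeholder file names - substitute your own data
tree <- read.tree("my_tree.nwk")
dat  <- read.csv("my_traits.csv", row.names = 1)  # row names must match the tip labels

# Pagel's lambda model; if you want lambda constrained or unconstrained,
# the lower.bound and upper.bound arguments control that directly
fit <- phylolm(trait_y ~ trait_x, data = dat, phy = tree, model = "lambda")
summary(fit)
```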
One quirk of phylolm is that it does not print BIC in the summary. I’ve advocated for BIC as a way to pick models. You can get BIC, though, if you use the AIC function. Suppose my.tree is a fitted phylolm model. You type AIC(my.tree, k=log(N)), where N is your sample size. This converts the AIC into the BIC. The principal difference between the two is that AIC always uses a penalty of 2, while BIC uses a penalty of log(N). You can learn about this yourself on the AIC function help page.
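In code, using the hypothetical fitted model my.tree just described, that looks like:

```r
# my.tree is a fitted phylolm model; N is your sample size
# (number of tips in the phylogeny), which you supply yourself
N <- 50                   # placeholder - use your actual sample size
AIC(my.tree)              # penalty k = 2: the AIC
AIC(my.tree, k = log(N))  # penalty k = log(N): the BIC
```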
OK, so between this blog post and my prior one I have provided implementations for 1) determining which network is important for your cultural trait and 2) correcting for Galton’s problem on your network if it is highly treelike. That still leaves a hole in the analytic pipeline if your network is not treelike. What to do then? Don’t worry, I’m on it! I have a set of NIH-funded projects about physician networks, which are highly non-treelike. I have a paper in preparation right now showing that the phylogenetic method predictably fails to correct Galton’s problem on a messy non-tree network. In fact, all the previous methods fail! So, I’m inventing a couple of new methods and hopefully will have that paper submitted soon.
I’m continuing to post implementation notes to accompany A Manual for Cultural Analysis, which I published last year together with two of my anthropologist colleagues at RAND. In my last installment, I provided links and advice for implementing CCA/PCA.
In this blog post I will address how to implement the network modeling method discussed in some detail in Chapter 4 of the manual. The central question is this: how do we tell which set of social connections is most important to the transmission of a cultural trait? Note: if a trait doesn’t transmit over some kind of social connection, then it can’t be socially learned, and so by definition it isn’t culture!
OK, but people are connected by ties of friendship, marriage, coworkership, Twitter, etc. So how do we decide which of the various ties are most relevant to a diffusing cultural trait? We cover this question in a lot of detail in the manual. I covered it with even more detailed simulations in my paper with my student Rouslan Karimov. We never published a supplement for that paper, so I’m posting here the code you need to run the most important analysis from the paper - the method that works. First, make sure R is installed. Download the files that are part of this blog post. Extract (unzip) the July12.7z file - I had to zip it to post it here. After it is extracted you should have a file July12.RData. If you double-click the July12.RData file, it should start R with the simulated data objects you need already in the workspace. Type ls() in the R command line to see what is in the workspace. If double-clicking doesn’t work, then start R, use the change-directory feature in the dropdown menu to move your active directory to wherever you put July12.RData, and type load("July12.RData").
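If you end up loading manually, the whole sequence is just the following (the path is a placeholder for wherever you extracted the file):

```r
# point R at the folder containing the extracted file, then load it
setwd("path/to/extracted/files")  # or use the change-directory menu item
load("July12.RData")
ls()  # lists the simulated data objects now in the workspace
```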
Think of this like a cooking show where some intermediate step is already baked. To create the things in the R workspace you just loaded you would need to simulate networks, simulate trees, then simulate characters diffusing/evolving on them, etc. I’m happy to provide the simulation code to anyone interested. Just email me. The focus of this blog post, however, is not about building simulations but learning to apply dyadic regression with random effects to network/tree datasets.
OK, so then start running my code file, CodeRandEffectsNetworkSims.txt. Copying and pasting one line at a time into the R command line is a good way to learn how a piece of R code works. When you get to the loop you would have to paste in the whole loop for it to run; however, I recommend you set i equal to something, like i=1, and then walk through the loop one line at a time as well. That will enable you to inspect what is happening in the loop. One important part is the bit that defines what you need to run the random effects regression controlling for the repeated identities of the individuals (the individuals are repeated across each of their network relationships):
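A minimal sketch of that random effects call looks like the following (the data frame name dyads and the predictor names trait.dist, net1, and net2 are my placeholders for illustration; rows and cols are the identities of the two individuals in each dyad):

```r
library(lme4)  # lmer fits mixed (random effects) models

# dyads: one row per dyad, with
#   trait.dist - cultural trait distance between the two individuals
#   net1, net2 - tie indicators for the candidate networks/trees
#   rows, cols - IDs of the two individuals forming the dyad
fit <- lmer(trait.dist ~ net1 + net2 + (1 | rows) + (1 | cols), data = dyads)
summary(fit)  # shows which ties predict trait distance, net of individual identity
```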
Within the lmer function call, the terms (1|rows) and (1|cols) are what specify the random effects - which are just the identity of each row and column for each dyadic datapoint. I like lmer in the lme4 package for random effects models (aka mixed hierarchical models) in R, but another option is gls in the nlme package. There are more options besides these, including in other statistical packages like SAS, which has some very good random effects modeling routines. I’m not going to fully discuss here why this is the best approach to determining which tree or network most governs the cultural diffusion process for a trait - read the manual or Karimov and Matthews 2017 if you want the answer.
In terms of getting to know how lmer works, be sure to run some of the example code provided in the lmer help file. From R you can get to the help for any function by typing ?function.name in the R command line. For example, typing ?lmer will get you the lmer help file.
I will say that I think the simulations in Karimov and Matthews 2017 are more comprehensive than anything anyone else has done on this issue. We show that dyadic regression with random effects is a definitive solution. It works for multiple networks, or for networks combined with trees. I’m sure one could create evil combinations of unmeasured confounding and measurement error where the method would fail, but in principle it works across all relevant conditions, while I show that the other commonly used methods, like lnam (in the sna R package) and MRQAP (aka the Mantel test), do not work across all relevant conditions. If you can fit a random effects regression model, then you can fit the method I’m recommending based on the simulations I’ve done. You don’t need any particular software package, you don’t need my code: just regress the trait distances on the network ties, include random effects for node IDs, and you’re done. I shouldn’t hear any more at conferences about how we can’t distinguish treelike inheritance from network diffusion, or determine which networks are important. Measure whatever networks or trees you think might matter, put them in the dyadic regression with random effects, and you’re done.
Since my colleagues and I published A Manual for Cultural Analysis, some people have asked for R code examples of all the things we describe. That’s a fair critique of the manual as we originally published it, although I’ll note that most of the papers and books we referenced already provide implementations or point to them. Regardless, here on my blog I will write a set of posts that will point to everything you need to implement what’s in the manual.
The first part of the manual focuses on using Cultural Consensus Analysis (CCA) and Principal Component Analysis (PCA) as a first pass at understanding cultural data. Read the manual if you want to understand why this is such an appropriate first pass.
PCA has a large associated literature that I can’t review here. For implementation, I think the best option is the prcomp function in the ‘stats’ R package, which comes with any basic R install. The other option is princomp. I prefer prcomp because it uses SVD rather than eigenvalue decomposition, which is supposed to be slightly more accurate numerically. This implementation also allows for data structures with more variables than datapoints, a common occurrence in cultural data.
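As a quick sketch of basic usage (your.data is a placeholder for an individuals-by-variables matrix):

```r
# standard PCA: individuals in rows, variables in columns
pc <- prcomp(your.data, center = TRUE, scale. = TRUE)
summary(pc)   # proportion of variance explained by each component
head(pc$x)    # individuals' scores on the components
pc$rotation   # variable loadings on the components
```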
CCA is a technique from cognitive anthropology, a subfield of cultural anthropology. Basically it works by performing PCA on the transpose of the usual individual-by-variable matrix; thus you are performing PCA on a variable-by-individual matrix. This procedure results in loadings for the individuals on the components, and scores for the variables - again, the reverse of the usual PCA procedure. Exactly why you might do this theoretically, and when you might use CCA vs. PCA, is answered in the manual.
Skipping to implementation, the simplest way to do CCA is to use prcomp on the transpose of your data, like this: prcomp(t(your.data)). The t() is the transpose function in R.
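Putting that together with where the loadings and scores end up (your.data again a placeholder, individuals in rows):

```r
# CCA as PCA on the transpose: variables in rows, individuals in columns
cca <- prcomp(t(your.data))
cca$rotation[, 1]  # first-factor loadings, one per individual
cca$x[, 1]         # first-factor scores, one per variable
```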
There also are some packages specifically for CCA that let you fit more subtle forms of it, and let you ensure the mathematics are done in precisely the same way as in prior important articles by folks like Batchelder, Romney, and Handwerker, among others (check the manual for refs). One R option is AnthroTools. AnthroTools implements the classic version of CCA, and it provides some neat data manipulation tools specific to common types of cultural anthropology data, such as free-lists. Another option with more advanced features is CCTpack, which implements not only classic CCA but also more recently developed modifications, such as for contexts where there is more than one underlying cultural stance.
That should cover the options, at least in R, for implementing PCA and CCA as we described in A Manual for Cultural Analysis. Note that all R functions have working example code at the bottom of their help pages. I’ve learned a lot just by running those little examples and comparing the input data they use to the outputs generated by the functions.
Stay tuned to my personal blog over the next days and weeks, because I am going to publish similar posts for the network analysis and phylogenetic analysis chapters. Feel free to leave comments here with questions, or email me.
I recently published an article in an applied journal that might be unlikely for anthropology folks to find, so I thought I would add some commentary on it here. The article uses agent-based simulations to assess the statistical performance of several network regression methods. Network regression is a primary way to assess whether cultural diffusion is occurring, and over which social ties the diffusion happens. The social ties are represented by different networks. Although the article focuses on applications to networks of malicious actors, the methods are applicable generally to most human social networks.
The methods I investigated are essentially some of the current best approaches for implementing tests to determine which of several transmission pathways most govern the spread of a trait. This type of question is core to cultural evolutionary analysis and, in my view, core to the original mandate of anthropology. A good example of a running anthropological discussion on this matter is the analysis of Welsch et al. 1992 and the critiques, counter-critiques, and reanalyses that followed. In my view, this entire exchange is some of the best of anthropology.
Of course, post-Boas cultural anthropology tends not to view articles like Welsch et al. 1992 as of foundational interest. That’s because much of American cultural anthropology left behind methods that systematically compare across different sites or cultures in favor of particularist accounts conducted through increasingly humanistic modes of research. Nothing wrong with the latter, by the way - I regard it as a completely valid intellectual activity. It’s just not the entirety of what anthropology was founded to do, and the result has left a significant portion of scientific research on culture without a disciplinary home.
Most biologically inclined anthropologists, meanwhile, have in recent memory vigorously adopted the behavioral ecology paradigm (Grafen’s phenotypic gambit) - which is essentially utility maximization theory from microeconomics, with reproductive success or its proxies in place of money. Studying cultural evolution is difficult to fit into this framework because classical behavioral ecology is intentionally heuristically teleological rather than explicit and mechanistic. Behavioral ecologists don’t necessarily think people are intentionally trying to maximize fitness outcomes (although some might), but they think at least that the mechanical operations of the mind result in a being that behaves as if it is maximizing fitness. In the classic approach no attempt is made to understand the causal mechanisms. That’s all fine; it just only gets you so far. Wouldn’t you rather actually know about the mechanisms themselves?
The perhaps unexpected irony is that folks interested in cultural evolutionary models, by articulating proximate mechanisms, arguably are being more reductionist than those sticking within the classical behavioral ecology paradigm. It’s ironic because cultural research has so often been assumed to have something to do with non-reductionist notions. Thornhill and Palmer, in their controversial book, at one point say people prefer cultural explanations because culture is consistent with the idea of free will - but cultural explanations offered by cultural evolutionists are just as causal as explanations offered by behavioral ecologists. Both explain individuals’ behavior as a mechanistic result of prior events. In fact, I’d argue cultural evolutionists are being more causal in that they usually seek to specify causal mechanisms start to finish, ultimately and proximately, while behavioral ecology in its classical approach leaves proximate mechanisms unspecified. Fincher and Thornhill’s well-known articles on infectious disease stress and cultural traits are good examples: we are left at the end of the papers with only a functional relationship (heuristic teleology), and we don’t know whether it is mediated by cultural inheritance, individual learning, an innate reaction norm, or genetic variation.
The second irony is that “behavioral economics,” which moves beyond utility maximization and into proximate causes, has become mainstream or nearly so in microeconomics - and microeconomics is essentially where behavioral ecology first co-opted its theory.
Which brings me back to social network analysis applied to cultural diffusion. Cultural diffusion as a theorized process really sits between proximate and ultimate causation in the traditional sense, because it 1) is a proximate factor in individuals’ personal development, 2) has an epidemiological and phylogenetic character of its own as cultural history, and 3) ultimately can’t fully escape biological selection because, at least among longstanding cultural variants, natural selection will directly favor cultural variants that promote fitness, or genes that bias people to prefer such variants. The other thing I like about studying cultural diffusion, inheritance, and related processes is that the hypotheses are eminently refutable. I think culture often is important, but whether it is important to a particular case is an empirical matter. Folks like me doing this work, I think, often find cases in which culture is not the most important process involved. In some of my own research, I’ve proposed that genetic variation accounts for some cross-society differences in personality traits that are usually interpreted as entirely socially learned (Matthews and Butler 2011). My recent paper tries to advance the empiricist and refutable tradition of cultural research by determining which of the currently available methods perform best at discerning cultural diffusion processes.
For this blog post I want to point to a great perspective piece by some colleagues that also illustrates a peculiar tendency of evolutionary biologists. Evan MacLean and Brian Hare recently wrote a piece for Science about work by Miho Nagasawa and others showing that when we gaze at dogs it actually taps into the neuro-chemical (oxytocin) pathways used in human bonding. The same does not appear to happen when humans gaze at wolves, which suggests the effect is a specific product of some kind of evolutionary selection between humans and dogs, rather than an incidental byproduct. Here and here are the links to the articles.
The research is all very cool. What I want to comment on, though, from a cultural analysis perspective, is the way Evan and Brian phrase their title: ‘Dogs Hijack the Human Bonding Pathway’. What’s with the hijacking? Are dogs up to some nefarious takeover plot at the expense of us humans? Certainly that is what is implied by the natural English reading of the word ‘hijack.’ This reading is emphasized, in my view, because the title is paired with an image of a human looking at a Yorkshire terrier. That pairing seems to imply that dogs are using us, at our expense, for their benefit - more so than if the image had paired the human with a more apparently useful dog, like maybe a coonhound or a husky. Would the title even have made intuitive sense with a black and tan coonhound in the picture?
Of course it can’t be that dogs hijack human bonding in an immediate proximate sense, because hanging out with dogs just feels good. In evolutionary biology we often distinguish between using a word in a proximate or an ultimate sense. The difference is timescale. The proximate sense means the word applies to the timescale we experience things in: we live, die, mate, and have all sorts of other ecological interactions. The ultimate sense means the word applies to the evolutionary timescale, where all the aggregate outcomes of those proximate things add up to result in genetic drift, natural selection, etc. A classic example: you could say a person’s skin tanning in response to sun exposure is an adaptation in a physiological, proximate sense, or that the ability to tan is an adaptation in an evolutionary sense (it is an aggregate product of generations of selection for tanning - I understand some other humans have this ability).
So, it can’t be that dogs hijack us in a proximate sense, because that would be like saying US Airways hijacks me each time I fly to DC. A professional pilot took me there quickly, with a cold drink in my hand, and I was actually reasonably comfortable.
So I think the only sense in which hijack could have been meant is the ultimate sense. And that’s where, to me, the title reflects a tendency in evolutionary biology to always see the glass as morally half empty. Most evolutionary biologists will admit there is a lot of cooperation in the natural world, but they tend to emphasize selfishness at all turns. Some people won’t like me saying that, and certainly some particular evolutionists aren’t like this, but I do think it’s the tendency. And yes, having spent a year or so watching monkeys steal food from each other, get eaten by predators, etc., I’ll agree there is a lot of rank selfishness in the natural world.
However, we shouldn’t let all that become a knee-jerk reaction of negativity, and especially about dogs! Now, no one has the data needed to test whether living with dogs has been a net fitness benefit or cost to humans, but this is a blog, so I’m going to make my anecdotal qualitative case that dogs are clearly a net fitness positive. And if it’s a net fitness positive, then dogs haven’t hijacked us; rather, dog-human coevolution is a case of evolutionary mutualism.
Dogs have done a lot for us over the past 20 thousand or so years since we domesticated them. First, they have helped us hunt. We still use them that way today, but I reckon it was even more fitness-beneficial to have a good dog at your side when hunting without firearms. Dogs have served as draft animals and protected our children from predators. Terriers also had the important function of exterminating disease-bearing rodents. It seems to me that one less person getting the plague in your family probably more than made up, in Darwinian fitness, for whatever table scraps and occasional grooming it cost you to raise the terrier.
I’m willing to concede that perhaps in today’s modern environment dogs may well be a net fitness loss, but that’s not relevant to an evolutionary statement about hijacking. And by modern I mean really modern, because until just recently dogs still performed the critical evolutionary functions I just described. And now for the anecdote. It so happens that my mother-in-law grew up in rural India. As a teenage girl she had a German Shepherd Dog as a pet. One day her father (my wife’s grandfather) was taking a nap in the heat of the day, and a cobra came over the threshold into his bedroom. I don’t recall from the telling of the story who saw the cobra, but someone did, and shouted at Dad I guess, and he woke up and stayed on the bed while the cobra was still on the floor between the bed and the door. So, as all the humans were standing about, basically trying to figure out how to get Dad out of the room with the cobra on the floor, the German Shepherd rushed in and, before anyone else had moved, snapped the snake in two with its jaws. Amazingly, the dog didn’t get bitten and survived the encounter. Needless to say, this dog was subsequently of legendary status in their village.
Look folks, that’s awesome. I can’t properly quantify a selection coefficient from that, but clearly saving a Dad with teenage children from a cobra earns Rover a whole lot of kibble in the Darwinian end-game. So, maybe next time the magazine Science will spare the knee-jerk negativity and go with a title that reflects what has likely been a long evolutionary history of mutually beneficial cooperation. I’d suggest something like, ‘awesome super-canines save their owners through the human bonding pathway’, and no I don’t think super-hero capes in the accompanying picture would be out of line.
I’ve worked with medium, large, and occasionally big data. What are big data? One line to demarcate them would be datasets so large that they cannot be processed in reasonable time by conventional methods on standard hardware. Datasets like that force the analyst to take non-traditional approaches to process them.
But most of the hype about big data isn’t about such analytic distinctions. The excitement over big data is that we now have the ability to gather truly large datasets on phenomena that previously were measurable only in small datasets. It used to be that you had to survey people to get their opinions about X, Y, or Z, but now we can gather vast amounts of data on people’s opinions by monitoring the twitterverse or blogosphere.
Gathering large data where previously we only had small data was what excited people about Google’s flu prediction system, which was based on the search terms people were using. A recent paper, however, shows convincingly that Google’s flu predictions seriously overestimated the actual incidence of flu as tracked by records of physician office visits for influenza-like illness, which is the exact measure Google was trying to predict from search terms.
This commentary by Lazer et al. in the journal Science includes many important reminders that using big data for statistics doesn’t much change the rules of good statistical practice and inference. I often tell my colleagues the same thing. I consider myself one of the relatively few people in a position to judge this personally, since I have conducted studies on datasets as small as six capuchin monkeys and as large as millions of lines of healthcare claims.
Sometimes big data are a real advantage. In particular:
1) Large datasets make testing model fit easier,
2) You have more data to burn through corrections for multiple testing if you are searching for a model with little theory to guide you,
3) Big data are useful when you are predicting something very, very rare, like most cancers or terrorism.
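Point 2 can be made concrete with a quick sketch. Below is a minimal, hypothetical illustration (pure Python, made-up p-values) of the simplest multiple-testing correction, Bonferroni: with m tests the per-test threshold shrinks to alpha/m, so only strong results, the kind a large dataset has the power to produce, survive the correction.

```python
def bonferroni(p_values, alpha=0.05):
    """Return which tests survive a Bonferroni correction.

    With m tests, each p-value is judged against the stricter
    per-test threshold alpha / m.
    """
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# Four hypothetical exploratory tests; the corrected threshold is
# 0.05 / 4 = 0.0125, so only the two strongest results survive.
print(bonferroni([0.0001, 0.003, 0.04, 0.20]))  # [True, True, False, False]
```

With hundreds of exploratory tests the per-test cutoff becomes tiny, which is exactly where having data to burn pays off.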
Any particular usefulness to big data pretty much stops there though. Big data don’t much help you to discover important and general features of social behavior, for example. Folks, if you can’t find an effect of one variable on another in several hundred or a thousand data points – then probably it isn’t an important or general effect.
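That claim about several hundred data points can be checked by simulation. Here is a rough sketch (pure Python, with assumed numbers I picked for illustration: a modest standardized effect of 0.3 and 500 observations per group) that estimates by Monte Carlo the power of a simple two-sample z-test. Even this middling effect is detected almost every time at this sample size.

```python
import random

random.seed(1)

def detect_power(effect=0.3, n=500, sims=500, z_crit=1.96):
    """Monte Carlo estimate of the power to detect a mean difference.

    Draw two groups of n observations from unit-variance normals whose
    means differ by `effect`, and count how often a two-sample z-test
    (variance known) rejects the null of no difference.
    """
    hits = 0
    for _ in range(sims):
        a = [random.gauss(0.0, 1.0) for _ in range(n)]
        b = [random.gauss(effect, 1.0) for _ in range(n)]
        diff = sum(b) / n - sum(a) / n
        se = (2.0 / n) ** 0.5  # standard error of the difference in means
        if abs(diff / se) > z_crit:
            hits += 1
    return hits / sims

print(detect_power())  # typically very close to 1.0
```

Conversely, setting `effect` near zero drives the detection rate down toward the 5% false-positive floor, which is the point: an effect you can’t see at n in the hundreds is a small effect.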
There is something more going on with the Google flu flub, however, than just missteps in statistical best practice. What the Googlers, Twitterers, and other such conspecifics are used to measuring--and they are great at it--are systems where the outcome of interest is more or less the thing being measured itself.
If you want to know what is trending about your company on Twitter, then that is by definition what people are tweeting about. If you want to direct people to the most popular pages from a given search term (what Google excels at) then you by definition want to measure where people click after using the same search term.
But doing those things is doing math, not doing statistics. Statistics is an offshoot of mathematics specific to the task of making predictions about things you haven’t measured from observations of other things you did measure. This applies even to basic statistical inference problems such as finding the average height of a population, which doesn’t proceed by measuring everyone in the population. If you just measure everyone, then you have counted; you have used some math by calculating the average, but you haven’t done any statistics. Statistically inferring the mean height of a population would proceed by measuring only a smaller sample of that population, and from that deriving an inference of what the average of the whole population likely is, along with some measure of your confidence in that prediction. If you have measured everyone, then you have an observation of the population average, not a prediction of it, and you don’t need a measure of confidence.
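The contrast between counting and inferring can be sketched in a few lines. This is a hypothetical toy (pure Python, made-up numbers): a simulated population of 100,000 heights stands in for the thing we normally never get to measure in full.

```python
import math
import random

random.seed(0)

# A hypothetical population of 100,000 heights in cm. In a real
# inference problem we never get to see this whole list.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Counting (math, not statistics): measure everyone, compute the average.
true_mean = sum(population) / len(population)

# Statistics: measure only a sample of 400, infer the population mean,
# and attach a measure of confidence to that inference.
sample = random.sample(population, 400)
n = len(sample)
est = sum(sample) / n
sd = math.sqrt(sum((x - est) ** 2 for x in sample) / (n - 1))
se = sd / math.sqrt(n)                   # standard error of the mean
ci = (est - 1.96 * se, est + 1.96 * se)  # approximate 95% interval

print(round(true_mean, 1), round(est, 1), [round(x, 1) for x in ci])
```

The census average is an observation and needs no interval; the sample estimate is a prediction, and the interval is the honest admission of what measuring 400 instead of 100,000 costs you.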
That’s why estimating the incidence of flu from search terms is not like Google’s bread-and-butter work. The incidence of flu is an out-of-sample prediction from search terms because the flu’s incidence is not itself determined just by what people think, and therefore search, about it. Surely flu incidence is partly a social construct. For example, a group struck by apprehension over the flu will avoid social contact and thereby slow the flu’s spread. But the flu also has causes outside our thinking and searching the internet about it. Ambient temperature and humidity affect flu transmission, and mutation of the genetic material of the flu virus is a function of properties mostly not constructed by our own sociality.
My contention is that the day-to-day work of Google and many tech companies that use big data is not out-of-sample prediction in this way. Instead, they are able to directly measure the thing of interest to their advertisers because the latter are intrinsically interested in the behavior of people within the self-creating system that is the internet.
I think when Google set out to predict the incidence of flu from search terms, they may not have realized they were stepping outside the realm of measuring a self-creating system (like internet searches) and into the realm of predicting unobserved phenomena from measurements of other phenomena.
This realm is that of statistics, and it is well trod by practicing scientists from many fields. Yes, these travelers of the statistical realm usually have used small data. Some of them travel accompanied by a cartload of models and information-theoretic Bayesian livestock. Others are more modest practitioners but highly adept with a particular tried and true beast of burden, such as linear regression or K-means clustering.
Regardless, Google and others may do well to consult some of these conventional statistical vagabonds the next time they venture into analytics that are truly about predicting things not measured. Knowing the paths through a landscape can be even more important if you are carrying big data with you, which can make the effects of navigational errors all the larger.
There is a new paper out by Jamie Tehrani on the evolution of the Little Red Riding Hood fairy tale that is getting some much-deserved attention.
There are two points about this paper that are just fantastic.
First, Dr. Tehrani applies phylogenetic methods to identify both inheritance and diffusion processes. To accomplish this, he supplements his phylogenetic analysis with some network algorithms (neighbornet) and with detailed ethnographic knowledge of these fairy tale variants. The results are a wonderful illustration of how applying phylogenetic methods does not lock the researcher into the assumption that culture evolves by inheritance rather than by diffusion. Instead, the phylogenetic results actively support a reasonable ethnographic argument that cultural diffusion of the story elements was extensive in China, while the story elements were conserved and thus inherited in Europe. Papers like Dr. Tehrani’s move us well beyond the now sterile debate about whether culture is inherited or diffused (folks, sometimes it’s one, and sometimes it’s the other, and sometimes it’s a mix, so deal with it). I think studies like this one are a clear model for the future growth of quantitative cross-cultural analysis.
I would also point out a prior paper by Dr. Tehrani, myself, and others, that showed how phylogenetic methods could also detect other types of cultural diffusion – specifically when different functionally or socially linked blocks of cultural traits move together from one population to another.
To understand the basic point of this paper, think of the way new languages or religions can be adopted wholesale by a population, whether by choice or conquest. Such events amount to cultural diffusion from the viewpoint of the population of people (they adopted a new set of traits), but from the viewpoint of the cultural elements the process is one of inheritance of a ‘cultural core’, because conversion can occur without blending of the elements within the core, i.e. the story elements of a single fairy tale or the ritual elements of a religious denomination. This is why whole populations can be converted and languages or religions moved about across the globe, and yet the characteristics of these same languages or religions can still, at least sometimes, evolve through tree-like inheritance. Our paper provides a method to detect and model such circumstances.
The second fantastic point about Dr. Tehrani’s paper is that it shows the power of phylogenetic and network methods to construct empirically rigorous, quantitative models for global-scale cultural phenomena. In the past I have often thought of language and religion as the two systems of human culture that are ubiquitous, variable, and ancient, but Dr. Tehrani’s paper makes a strong case that traditions of folklore and fairy tale may also fit these criteria. With the analytical methods we now have available, it is just a matter of motivation and funding for all of us to have quantitative global maps of the lineages (inheritance pathways) and linkages (diffusion pathways) among languages, religions, and folktales. Indeed, it is just such a quantified and global model of the landscape of human culture that I believe was the original, pre-Boasian, mandate of anthropology.
Research out last August by Anders Eriksson and Andrea Manica made news in the press for contradicting claims of neanderthal ancestry in contemporary humans. This research is completely driven by mathematical modeling rather than relying on empirical data, and I think ultimately Neanderthal admixture will be supported. John Hawks has very capably laid out the scientific issues and the reasons why Neanderthal admixture is likely to be supported when all is said and done. What I want to address in this blog are the moral lessons that some have tried to extract from the “out of Africa” explanation of human origins. Considering these past arguments in light of the currently debated genetic evidence for Neanderthal admixture indicates why arguments for the equal treatment of people based on their common origin are dangerous rhetoric passed off as reasoned philosophy.
The history of the argument that we should all treat each other equally because we are all so genetically similar or because we share a common ancestry in Africa dates back at least a dozen years. In an article that appeared in the New York Times, writer Nicholas Wade quoted Harvard biologist Edward O. Wilson as saying “We need to create a new epic based on the origins of humanity” (Wade 2000). Dr. Wilson’s comments came from another article in the Wall Street Journal, in which he indicated that the evolutionary history of Homo sapiens could be a new basis for spiritual values that could replace traditional religion. Mr. Wade’s own commentary from his article was that: “Many of the biologists who are reconstructing the human past certainly believe their work has a value that transcends genetics. Although their lineage trees are based on genetic differences, most of these differences lie in the regions of DNA that do not code for genes and have no effect on the body.” He then quoted Dr. Peter Underhill, a geneticist who studies human origins as saying, "We are all Africans at the Y chromosome level and we are really all brothers."
Isn’t it convenient when scientific knowledge of the way the world is seems to justify how we think the world ought to be? In this case people were arguing from evidence of the way biological variation originated in our species (world is) as a reason for why human behavior should be equitable across racial distinctions (world ought to be). Trouble eventually follows, though, when people start saying the reason we ought to behave a certain way is because the world is a certain way. As the out of Africa model gained more empirical support, even more scientists wanted to jump on the bandwagon because they thought they had found a home-run secular reason to justify the equal treatment across race lines that had always been argued on theistic grounds, from the time of Abraham Lincoln and the abolitionists to Martin Luther King Jr. and the civil rights movement. A quick online search turns up plenty of comments from anthropologists about how human biological variation is only ‘skin deep’ and we are all very recently diverged – as if racism would be more OK if biological differences went deeper than skin level or we had diverged more anciently? By 2010 Richard Dawkins was giving talks to forums for the black community about how “we are all African,” and even selling T-shirts!
It was Christopher diCarlo, however, who in a 2010 Free Inquiry article laid out most explicitly the case that we all should treat each other well because of the facts of our origins. Dr. diCarlo does an admirable job of laying out the known science of human evolution. Intriguingly, one of the scientists he covers prominently is Andrea Manica. He summarizes the state of the science with: “We are all African. With these four words, we see a genetic coalescence of the entire human population. We now know that we descended from inhabitants of Africa who began migrating out of Africa around 60,000 years ago. In this way, it is impossible for us to not all be, in some ways, related.” He then continues to draw philosophical lessons from this: “With these four words [we are all African], we see that racism is a human invention. It is a social construct with lingering natural biases—leftover baggage from our mammalian xenophobic tendencies.”
I suppose the proverbial other shoe dropped in May 2010, when scientists apparently confirmed that at least all living non-African humans have some Neanderthal ancestry that is not shared by African humans (here I use African in the idiomatic English-language sense rather than the sense of Dr. Dawkins’s linguistic contortion). Yes, the percentage is small. The original Neanderthal genome article put the value at 1-4% Neanderthal genes for non-Africans, but more recent studies indicate that number might rise to 8% summed admixture from Neanderthals and Homo erectus for some of us. So, 8% non-recent-African origin is small, but it certainly seems nontrivial. Does that mean Dr. diCarlo now should conclude that racism is less of a ‘human invention’ or that some racism is more functional than ‘leftover baggage’? Should we now start making T-shirts for Africans that say things like “Racially pure, no Neanderthal in here,” or the Caucasian version, “1-4% Neanderthal and loving it”? If all Dr. Dawkins was doing with his T-shirt was educating the public about science, then I suppose these post-Neanderthal-genome T-shirts are equally valid? I hope he sends me a note when he starts selling them at his online store.
Of course Dr. Dawkins wasn’t just talking about science. He and Dr. diCarlo were trying, poorly, to justify their deeply held ethical belief that equal treatment of people from different human subpopulations is a moral imperative. For hardline atheists like these thinkers, the traditional theistic and metaphysical justifications on which abolition and civil rights were based are off the table. They can’t believe as theists do that we should all treat each other equally because we emulate the God who knows and loves everyone regardless of the particulars of their traits or origins. They don’t buy into the metaphysical claims of many Enlightenment thinkers that people are endowed with inherent rights that do not arise from natural origins. Thus Drs. Dawkins, diCarlo and others predicated moral truth on empirical truth of our natural origins. If they sincerely meant any of what they said, then they have to conclude racial prejudice is now a little more permissible (on the order of at least 1-4% more permissible).
Alternatively, they could admit what I suspect is the case: that they never actually thought these arguments from peoples’ origins being equal were good justifications for people treating each other equally. Admitting that, however, would be tantamount to admitting that they don’t have a justification for their moral claims. It would also mean admitting that instead of searching for good justifications for their moral claims, they would rather pass off glib rhetoric as reasons to their audience, apparently confident that their audience wouldn’t see that these are terribly illogical, and therefore dangerous, arguments for equal treatment of human persons.
I suspect the current debate about human origins will land on the conclusion that some living humans exhibit some degree of genetic admixture from Neanderthals. The result of this debate has many important scientific implications, but for those of us who hold to the reasons our culture has always held for equal treatment there are no ethical implications of this research. From the many founding fathers of the United States who objected to slavery at our country’s infancy, to Abraham Lincoln and Martin Luther King Jr., our culture has always used some form of metaphysical argument, and usually a theistic one, to justify that people from different ‘races’ should be treated equally. The theistic justification is a strong one precisely because it does not depend on any of the facts of what our origins, similarities, or differences may be.
Wade, N. 2000. The human family tree: 10 Adams and 18 Eves. The New York Times. May 2, 2000, Tuesday, Late Edition – Final. Section F; Page 1; Column 1.
This is my personal blog. The views expressed on this page are my own. My views should not be taken to represent the views of my mentors, employer, or any person or group other than myself.