First, we have to be clear what this debate is about. The only type of cooperation that matters in this debate is classically called 'evolutionary altruism.' This means a behavior that has a real fitness cost to the actor who performs the behavior. Doing the behavior itself is costly, such that it would be eliminated by natural selection if not for direct fitness benefits acquired by the recipient. However, for a behavior that benefits only the recipient and not the actor to evolve, there has to be a mechanism by which the recipients of the behavior also tend to be those who perform it. Otherwise there will be 'freeriders' who game the system by collecting benefits as recipients while never paying the costs, because they never perform the behavior. Essentially all models for how evolutionary altruism can evolve are just mechanisms that effectively cut these freeriders out of being recipients, such that recipients are more likely to also be performers of the behavior. The equation that formalizes how rigorously the mechanism must exclude freeriders is W.D. Hamilton's famous formulation for inclusive fitness, which states that relatedness must exceed the ratio of the cost of performing the behavior to the benefit of receiving it: r > c/b (see Nowak 2006). Here r measures recent genealogical relatedness, which is one mechanism to ensure that the recipient has at least probability r of having the genes that cause the actor to perform the altruistic behavior.
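To make the inequality concrete, here is a minimal Python sketch of the r > c/b check. The function name and the numbers (a cost of 0.4 against a benefit of 1.0) are my own illustrative assumptions, not values from Hamilton or Nowak.

```python
def hamilton_rule_satisfied(r: float, b: float, c: float) -> bool:
    """Return True when relatedness r exceeds the cost/benefit ratio c/b."""
    if b <= 0:
        raise ValueError("benefit b must be positive for the ratio c/b to make sense")
    return r > c / b


# Hypothetical numbers: the behavior costs the actor 0.4 and gives the recipient 1.0.
BENEFIT = 1.0
COST = 0.4

for label, r in [("full sibling", 0.5), ("half sibling", 0.25), ("stranger", 0.0)]:
    verdict = "favored" if hamilton_rule_satisfied(r, BENEFIT, COST) else "not favored"
    print(f"{label:12s} r={r:.2f}: altruism {verdict}")
```

With these assumed values the threshold is c/b = 0.4, so only recipients related by more than 0.4 are worth helping.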
That's the mathematical kernel - now enter decades of semantic nutshells around this. Does altruism evolve because of group selection, or is it because individuals are gaining 'indirect' fitness benefits for themselves, or is it really genes promoting copies of themselves residing in the bodies of other individuals? People like Dr. Dawkins and Dr. Pinker have been consistent and vociferous in their denial of any group selection going on in the evolution of altruism. Here are some thought experiments that are troubling for this assertion:
1. Consider an evolutionarily altruistic behavior that has evolved by kin selection. For this behavior, individuals recognize siblings and then perform a fitness-costly behavior that benefits the sibling. Siblings have an r=0.5, which satisfies Hamilton's equation for this particular behavior, but assume relatedness lower than 0.4 does not. So far so good. Now, our individual selectionists will point to the benefits being given to relatives and say 'see - indirect benefits to the actors - ergo, no group selection effects.' But what happens when a freerider mutant who never performs the behavior is born into an altruist family? There is no reciprocity in this system, meaning the mutant's siblings just detect that he is their sibling and thus donate the benefits to him. What happens? The freerider mutant will always have a higher fitness than his siblings, because he never pays the cost of the behavior and he receives just as many benefits. If the only thing we need to think about is the relative fitness of individuals, then kin selection cannot evolve altruistic behaviors because freerider mutants always have higher fitness than their altruistic siblings. How can the gene for the altruistic behavior ever prosper? Clearly within their own families, altruists always lose to freeriders.
If you are a scientist whose thinking has been dogmatized against group selection, don't worry, there's hope. Just each morning keep repeating the freerider (saying selfish is more fun) mutant scenario, and it will help to deprogram you. This is what I had to go through to really understand this after years of programming. Just keep saying, but what about the fact that freerider (selfish) mutants always have to have higher fitnesses within their own families?!
The reason Hamilton's equation actually works is that freerider mutants go on to have families that are dominated by sets of freeriders. That's the only way it can work. Once this happens, the payoff to selfishness drops below that of altruism, because now the selfish individuals get none of the benefits of the behavior (none of their siblings perform the behavior). Hamilton himself identified this very clearly in his 1975 book chapter in the volume "Biosocial Anthropology". Try reading it in addition to Hamilton's 1964 papers (here is the first one of the pair).
David S. Wilson has argued, and I agree, that the best way to think about and phrase this insight is that evolutionary altruism evolves because of across-group fitness differentials. The reason Hamilton's equation works is that a sufficient level of recent genealogical inheritance will create enough across-group fitness differential to overcome the invasion of freerider mutants. Should we call this group selection? Seems reasonable to me, but we could call it other things. It differs from traits that we could properly say are selected only at the individual level because the fitness differential that causes the gene to increase in frequency only occurs across some sets of individuals in the population (in my scenario these sets are individuals across different kin groups). Indeed, this is why Hamilton himself suggested in 1975 that it would be clearer to call some inclusive fitness effects kin-group selection. I like this term a lot actually. We could also make the distinction by talking about selection that is within-group and within-population as different from selection that is within-population only. That might be most pleasing to the likes of Drs. Dawkins and Pinker, as now we would only by implication be saying something happens at a group level. It's completely correct - kin selected evolutionary altruism is not selectively favored within groups of kin. Any claim that it is so selected is demonstrably, devastatingly, false. Hamilton shows this very clearly in his 1975 chapter. Within-group within-population is a bit wordy though, so we would have to make acronyms out of them and distinguish between WGWP and WP forms of selection. Seems cumbersome, but again, we humans make up these semantics so we can call it whatever we like.
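The within-family versus across-family contrast in thought experiment 1 can be sketched with a toy payoff model in Python. Everything specific here is my own assumption for illustration: a family of n siblings, a baseline fitness w0, and altruists who each pay cost c to deliver a benefit b split evenly among their n - 1 siblings regardless of type (no reciprocity). It is not a model taken from Hamilton's papers.

```python
def altruist_fitness(k: int, n: int, b: float, c: float, w0: float = 1.0) -> float:
    """Fitness of one altruist in a family of n siblings containing k altruists.

    It pays cost c and receives a 1/(n-1) share of b from each of the
    other k - 1 altruists.
    """
    return w0 - c + b * (k - 1) / (n - 1)


def freerider_fitness(k: int, n: int, b: float, c: float, w0: float = 1.0) -> float:
    """Fitness of one freerider in the same family: no cost, a share from all k altruists."""
    return w0 + b * k / (n - 1)


def family_mean_fitness(k: int, n: int, b: float, c: float, w0: float = 1.0) -> float:
    """Mean fitness across all n siblings when k of them are altruists."""
    total = (k * altruist_fitness(k, n, b, c, w0)
             + (n - k) * freerider_fitness(k, n, b, c, w0))
    return total / n


# Hypothetical parameters: 4 siblings, benefit 1.0, cost 0.4.
N, B, C = 4, 1.0, 0.4

for k in range(N + 1):
    line = f"altruists={k}  family mean={family_mean_fitness(k, N, B, C):.3f}"
    if 0 < k < N:  # mixed family: compare the two types directly
        line += (f"  altruist={altruist_fitness(k, N, B, C):.3f}"
                 f"  freerider={freerider_fitness(k, N, B, C):.3f}")
    print(line)
```

Under these assumptions the freerider beats the altruist in every mixed family, yet mean family fitness rises with the number of altruists whenever b > c. The fitness differential that favors altruism shows up only in the comparison across families.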
2. OK, so after thought experiment 1 a now wavering anti-group-selectionist might be thinking 'well fine, but that scenario didn't change any of the actual predictions of my incorrect individual selection verbiage. So, even if I was right for the wrong reasons, I still got the right predictions, which is what matters in science.' That's quite fair, and I agree wholeheartedly that getting the right predictions is what matters in science. Once we give up on this we become philosophers, and we had those for thousands of years and never figured out anything.
But individual selection thinking does get predictions wrong. One prediction that gets missed every time without thinking about groups is the distinction between an altruistic behavior that by its nature is a public good (perhaps an alarm call to my whole group) and one that relies on kin recognition mechanisms. In the former, there is no point in evolving kin recognition and the r that matters is the average r of the group. The fact that the alarm call warns siblings, for example, has to be discounted by the fact that it warns nonrelated individuals, and the discounting comes out to exactly the mean r across the group that hears the alarm call. This is a serious prediction that in my experience with students and colleagues gets missed all the time and, again, is very well laid out in Hamilton's 1975 paper (and, by the way, it is also in the 1964 paper). Sure, you can rephrase how kin selection works on public goods in inclusive fitness terms if you want, but the correct predictions do not flow naturally from this logic and I think are frequently missed by empirical researchers.
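The discounting to the group's mean r can be checked with a few lines of Python. This is a sketch under my own assumptions: every listener gets the same benefit from the alarm call, the rule for several recipients is written as the sum of r_i * b_i compared against c, and the group composition and payoffs are made-up numbers.

```python
import math


def relatedness_weighted_benefit(relatedness: list[float], benefit_each: float) -> float:
    """Sum of r_i * b over every individual that receives the benefit."""
    return sum(r * benefit_each for r in relatedness)


def mean_relatedness(relatedness: list[float]) -> float:
    """Average relatedness of the actor to everyone who receives the benefit."""
    return sum(relatedness) / len(relatedness)


# Hypothetical group hearing the alarm call: 2 full siblings and 8 unrelated individuals.
listeners = [0.5, 0.5] + [0.0] * 8
benefit_each = 0.1                      # assumed benefit per listener
total_benefit = benefit_each * len(listeners)
cost = 0.4                              # assumed cost to the caller

# Public good: the call reaches everyone, so the sum collapses to mean r times total benefit.
public_good_value = relatedness_weighted_benefit(listeners, benefit_each)
assert math.isclose(public_good_value, mean_relatedness(listeners) * total_benefit)
print("alarm call favored:", public_good_value > cost)

# Kin recognition: the same total benefit delivered to siblings only (r = 0.5).
directed_value = 0.5 * total_benefit
print("sibling-directed help favored:", directed_value > cost)
```

With these numbers the same total benefit passes the threshold when directed at recognized siblings but fails as a public good, because the r that applies to the alarm call is the group mean of 0.1 rather than 0.5.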
3. Things get even worse for anti-group-selection views when we think more carefully about gene's-eye views of the world.
Are genes the unit of selection? I have no doubt they are a unit of inheritance, but selection? The only way they can be a unit of selection is if you adopt a very contorted notion of what a gene is. Let me illustrate. Imagine two villages of humans, each engaged in reciprocal altruism within the village - you scratch my back and I'll scratch yours. So, now the individuals stop giving benefits to freeriders after they discover through their interactions who is a freerider. Let's suppose the members of one village all have the same gene variants (alleles) that cause them to behave with altruistic reciprocity, but the other village is like a UN training camp with people from all over the world. Just by chance, the people of the UN village have different alleles with different base pair sequences. They all equally produce the same altruistic reciprocity behavior, but they are different sequences, and they have different evolutionary origins. Maybe some of them even differ in functional parts of the allele that contribute to reciprocal altruism, and maybe in fact some people's reciprocity behavior is influenced by different sections of DNA that are not even orthologous to the sections that affect another's reciprocity. Even though they create the same phenotype, would anyone really call these the same 'genes'? With their different origins and different sequences can they be said to be the same 'entities' selfishly advancing 'their' own replication? It seems to me they are not, and remember the village with all the same homologous reciprocity-inducing alleles with the exact same base pair sequences! Any natural interpretation of the English language would conclude that there is more successfully selfish genic selection for reciprocity happening in the homogeneous village. I mean, the UN village is benefiting copies of other genes that aren't even remotely related to each other.
The point is none of this 'gene's eye view' and 'selfishly replicating genes' stuff matters. Evolution, remember, is just a set of mundane mechanistic interactions that stack up to produce algorithmic effects over time. Think of it like a set of billiard balls being hit ever so deterministically on a table, but instead of the balls moving across space the chains of causal collisions are moving through time. It doesn't matter whether these alleles had the same origin, different origin, or even if they are in the same places on chromosomes or code for the same protein products. All that matters is they cause the individuals to do this reciprocal behavior with other reciprocal individuals, so all these diverse genes involved in such a system can rise with the other's tides. I think somewhere Dr. Dawkins tried to redefine individual selfish genes as just this highly abstract entity, such that we would call different nonorthologous stretches of DNA and different nonhomologous alleles all one 'gene' if they were all related to a single phenotype. But no one actually defines 'a gene' this way because it would make doing human genomes, and biochemistry, and most of biology impossible.
After these thought experiments hopefully you have loosened up to see the key to evolutionary altruism is just to find any mechanism where the benefits of altruistic action keep getting to other altruists and freeriders are excluded. Any mechanism that reliably establishes a correlation between being an altruist and receiving benefits from other altruists will do - it doesn't have to achieve this at every grouping level of society, and it doesn't have to do it through homologous genes that selfishly replicate themselves. In fact, it doesn't need genes at all, just inherited stuff. Now the predictions really start to diverge from typical gene and individual selectionist theory. Because if you just need inherited stuff, then evolution can use cultural variants as well to create evolutionarily altruistic adaptations. That's a topic for another blog (here is my recent paper on it), and I hate to say this one more time, but yes, W.D. Hamilton already identified how cultural evolution would work equally well for all this in his 1975 chapter.