# Simple Equations are No Good When the Variables are Meaningless

A few weeks back, I reflected on why mathematical biology can be so hard to learn—much harder, indeed, than the mathematics itself would warrant.

The application of mathematics to biological evolution is rooted, historically, in statistics rather than in dynamics. Consequently, a lot of model-building starts with tools that belong, essentially, to descriptive statistics (e.g., linear regression). This is fine, but then people turn around and discuss those models in language that implies they have constructed a dynamical system. This makes life quite difficult for the student trying to learn the subject by reading papers! The problem is not the algebra, but the assumptions; not the derivations, but the discourse.

Recently, a colleague of mine, Ben Allen, coauthored a paper that clears up one of the more confusing points.

Hamilton’s rule asserts that a trait is favored by natural selection if the benefit to others, $B$, multiplied by relatedness, $R$, exceeds the cost to self, $C$. Specifically, Hamilton’s rule states that the change in average trait value in a population is proportional to $BR - C$. This rule is commonly believed to be a natural law making important predictions in biology, and its influence has spread from evolutionary biology to other fields including the social sciences. Whereas many feel that Hamilton’s rule provides valuable intuition, there is disagreement even among experts as to how the quantities $B$, $R$, and $C$ should be defined for a given system. Here, we investigate a widely endorsed formulation of Hamilton’s rule, which is said to be as general as natural selection itself. We show that, in this formulation, Hamilton’s rule does not make predictions and cannot be tested empirically. It turns out that the parameters $B$ and $C$ depend on the change in average trait value and therefore cannot predict that change. In this formulation, which has been called “exact and general” by its proponents, Hamilton’s rule can “predict” only the data that have already been given.

(PDF)
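To see why a rule of this kind can only "predict" data it has already been given, here is a minimal sketch. It assumes a regression-style reading of the formulation: $-C$ and $B$ are taken to be the least-squares slopes of fitness on an individual's own trait and on its partner's trait, and $R$ is the regression of partner trait on own trait. The function names, the sample size, and the random "fitness" values are all illustrative assumptions, not anything taken from the paper.

```python
import random

def centered_sum(xs, ys):
    """Sum of products of deviations from the means of xs and ys."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def regression_hamilton(g, h, w):
    """Fit w ~ a - C*g + B*h by least squares; also return R = Cov(h, g) / Var(g).

    g: own trait values, h: partners' trait values, w: fitness values.
    (Hypothetical helper for illustration only.)
    """
    s_gg = centered_sum(g, g)
    s_hh = centered_sum(h, h)
    s_gh = centered_sum(g, h)
    s_gw = centered_sum(g, w)
    s_hw = centered_sum(h, w)
    det = s_gg * s_hh - s_gh ** 2           # normal-equations determinant
    slope_g = (s_hh * s_gw - s_gh * s_hw) / det
    slope_h = (s_gg * s_hw - s_gh * s_gw) / det
    return slope_h, -slope_g, s_gh / s_gg    # B, C, R

rng = random.Random(0)
n = 200
g = [rng.random() for _ in range(n)]   # own trait values
h = [rng.random() for _ in range(n)]   # partners' trait values
w = [rng.random() for _ in range(n)]   # "fitness" that is pure noise

B, C, R = regression_hamilton(g, h, w)
cov_wg = centered_sum(w, g) / n
var_g = centered_sum(g, g) / n

# Cov(w, g) and (B*R - C) * Var(g) should agree up to rounding error.
print(cov_wg, (B * R - C) * var_g)
```

The fitness values here have nothing to do with the traits, yet the two printed numbers coincide, because least-squares residuals are uncorrelated with the regressors. Under these definitions the sign of $BR - C$ always matches the sign of $\mathrm{Cov}(w, g)$, which by the Price equation (without transmission bias) fixes the direction of change in the average trait. So $B$ and $C$ can only be computed once that change is already known.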

# Reflecting on Confusion

While I was writing Multiscale Structure in Eco-Evolutionary Dynamics, I found myself having a frustrating time reading through big chunks of the relevant literature. The mathematics in the mathematical biology was easier than a lot of what I’d had to deal with in physics, but the arguments were hard to follow. At times, it was even difficult to tell what was being argued about. A blog post by John Baez, on “biology as information dynamics,” called this frustration back to mind—not because it was unclear itself, but rather because it touched on the source of the fog.

I think the basic cause of the trouble is the following:

The application of mathematics to biological evolution is rooted, historically, in statistics rather than in dynamics. Consequently, a lot of model-building starts with tools that belong, essentially, to descriptive statistics (e.g., linear regression). This is fine, but then people turn around and discuss those models in language that implies they have constructed a dynamical system. This makes life quite difficult for the student trying to learn the subject by reading papers! The problem is not the algebra, but the assumptions. And that always makes for a thorny situation.

# Good News if You’re an Evil Prof, Though

Let’s say you tell your students that arm folding is a genetic trait, with the allele for right forearm on top (R) being dominant to left forearm on top (L). Results from a large number of studies show that about 11 percent of your students will be R children of two L parents; if they understand the genetics lesson correctly, they will think that either they were secretly adopted, or Mom was fooling around and Dad isn’t their biological father. More of your students will reach this conclusion with each bogus genetic trait that you add to the lesson. I don’t think this is a good way to teach genetics.

Via PZ Myers, who is teaching genetics this semester and has an interest in getting it right.
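The reasoning a student would follow can be sketched in a few lines. This assumes the oversimplified single-gene model from the quoted lesson, with hypothetical allele labels `"R"` (dominant) and `"L"` (recessive); it illustrates the faulty model, not the real genetics of arm folding.

```python
from itertools import product

def phenotype(genotype):
    """Return 'R' if at least one dominant allele is present, else 'L'."""
    return "R" if "R" in genotype else "L"

def offspring_phenotypes(parent_a, parent_b):
    """Set of phenotypes possible when each parent passes on one allele."""
    return {phenotype(a + b) for a, b in product(parent_a, parent_b)}

# In the single-gene model, a parent who folds left-on-top must be "LL".
print(offspring_phenotypes("LL", "LL"))
print(offspring_phenotypes("RL", "RL"))
```

Under the model, two L parents can only ever have L children, so every student who is an R child of two L parents contradicts the lesson as taught.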

# On “Invention”

When I was a little younger than Ahmed Mohamed is now, I invented the distance formula for Cartesian coordinates. I wanted to make a simulation of bugs that ran around and ate each other. To implement a rule like “when the predator is near the prey, it will chase the prey,” I needed to compute distances between points given their $x$- and $y$-coordinates. I knew BASIC, and I knew the Pythagorean Theorem. However many people had solved that before me, it wasn’t written down in any book that I had, so I took what I knew and figured it out.
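For the curious, here is roughly what that amounts to, as a sketch in Python rather than the original BASIC. The function names and the `sight_radius` parameter are hypothetical stand-ins for whatever the old program actually used.

```python
import math

def distance(x1, y1, x2, y2):
    """Straight-line distance between (x1, y1) and (x2, y2), via Pythagoras."""
    return math.hypot(x2 - x1, y2 - y1)

def should_chase(predator, prey, sight_radius=5.0):
    """True when the prey is within the predator's (assumed) sight radius."""
    return distance(*predator, *prey) <= sight_radius

print(distance(0, 0, 3, 4))            # 5.0
print(should_chase((0, 0), (3, 4)))    # True
print(should_chase((0, 0), (6, 8)))    # False
```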

Those few pages of PowerBASIC on MS-DOS never amounted to much by themselves, but simulating ecosystems remained an interest of mine. I returned to the general idea now and then as I learned more.

And then, hey, what’s this? It looks like a PhD thesis.

“I bet every great mathematician started by
rediscovering a bunch of ‘well known’ results.”
—Donald Knuth, Surreal Numbers

# Multiscale Structure in Eco-Evolutionary Dynamics

I finally have my thesis in a shape that I feel like sharing. Yes, this took over three months after my committee gave their approval. Blame my desire to explain the background material, and the background to the background….

In a complex system, the individual components are neither so tightly coupled or correlated that they can all be treated as a single unit, nor so uncorrelated that they can be approximated as independent entities. Instead, patterns of interdependency lead to structure at multiple scales of organization. Evolution excels at producing such complex structures. In turn, the existence of these complex interrelationships within a biological system affects the evolutionary dynamics of that system. I present a mathematical formalism for multiscale structure, grounded in information theory, which makes these intuitions quantitative, and I show how dynamics defined in terms of population genetics or evolutionary game theory can lead to multiscale organization. For complex systems, “more is different,” and I address this from several perspectives. Spatial host–consumer models demonstrate the importance of the structures which can arise due to dynamical pattern formation. Evolutionary game theory reveals the novel effects which can result from multiplayer games, nonlinear payoffs and ecological stochasticity. Replicator dynamics in an environment with mesoscale structure relates to generalized conditionalization rules in probability theory.

The idea of natural selection “acting at multiple levels” has been mathematized in a variety of ways, not all of which are equivalent. We will face down the confusion, using the experience developed over the course of this thesis to clarify the situation.

# My Year in Publications

This is, apparently, a time for reflection. What have I been up to?

And so this is Korrasmas
Things have been Done
Kuvira is fallen
A new ‘ship just begun

Kor-ra-sa-mi
We all knew it
Kor-ra-sa-mi
now-ow-ow-owwwwwww

Well, other than watching cartoons?

At the very beginning of 2014, I posted a substantial revision of “Eco-Evolutionary Feedback in Host–Pathogen Spatial Dynamics,” which we first put online in 2011 (late in the lonesome October of my most immemorial year, etc.).

In January, Chris Fuchs and I finished up an edited lecture transcript, “Some Negative Remarks on Operational Approaches to Quantum Theory.” My next posting was a solo effort, “SIC-POVMs and Compatibility among Quantum States,” which made for a pretty good follow-on, and picked up a pleasantly decent number of scites.

Then, we stress-tested the arXiv.

By mid-September, Ben Allen, Yaneer Bar-Yam and I had completed “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems,” a work very long in the cooking.

Finally, I rang in December with “Von Neumann was Not a Quantum Bayesian,” which demonstrates conclusively that I can write 24 pages with 107 references in response to one sentence on Wikipedia.

# It’s Good to Laugh

Alleged intellectual Christina Hoff Sommers (I know, I know, it’s bad form to give away the punchline of a joke so early) recently had this to say:

Dear liberals, When you side with today’s 3rd wave intersectional feminism, you are siding with the intellectual equivalent of creationism.

As a liberal feminist whose day job actually is studying evolutionary dynamics, I can only say this:

# Lacking Tonka

Dawkins claims that Hölldobler has “no truck with group selection”. Wilson and Hölldobler (2005) proposes, in the first sentence of its abstract, that “group selection is the strong binding force in eusocial evolution”. Later, Hölldobler (with Reeve) voiced support for the “trait-group selection and individual selection/inclusive fitness models are interconvertible” attitude. Hölldobler’s book with Wilson, The Superorganism: The Beauty, Elegance, and Strangeness of Insect Societies (2008), maintains this tone. Quoting from page 35:

It is important to keep in mind that mathematical gene-selectionist (inclusive fitness) models can be translated into multilevel selection models and vice versa. As Lee Dugatkin, Kern Reeve, and several others have demonstrated, the underlying mathematics is exactly the same; it merely takes the same cake and cuts it at different angles. Personal and kin components are distinguished in inclusive fitness theory; within-group and between-group components are distinguished in group selection theory. One can travel back and forth between these theories with the point of entry chosen according to the problem being addressed.

This is itself a curtailed perspective, whose validity is restricted to a narrow class of implementations of the “multilevel selection” idea. (Yeah, the terminology in this corner of science is rather confused, which doesn’t make talking about it easier.) Regardless, I cannot think of a way in which this can be construed as having “no truck with group selection”. The statement “method A is no better or worse than method B” is a far cry from “method A is worthless and only method B is genuinely scientific”.

If Dawkins has some personal information to which the published record is not privy, that’s fine, but even if that were the case, his statements could not be taken as a fair telling of the story.

EDIT TO ADD (21 November 2014): I forgot this 2010 solo-author piece by Hölldobler, in a perspective printed in Social Behaviour: Genes, Ecology and Evolution (T. Székely et al., eds). Quoting from page 127:

I was, and continue to be, intrigued by the universal observation that wherever social life in groups evolved on this planet, we encounter (with only a few exceptions) a striking correlation: the more tightly organized within-group cooperation and cohesion, the stronger the between-group discrimination and hostility. Ants, again, are excellent model systems for studying the transition from primitive eusocial systems, characterized by considerable within-group reproductive competition and conflict, and poorly developed reciprocal communication and cooperation, and little or no between-group competition, on one side, to the ultimate superorganisms (such as the gigantic colonies of the Atta leafcutter ants) with little or no within-group conflict, pronounced caste systems, elaborate division of labour, complex reciprocal communication, and intense between-group competition, on the other side (Hölldobler & Wilson 2008 [the book quoted above]).

And, a little while later, on p. 130:

In such advanced eusocial organisations the colony effectively becomes a main target of selection […] Selection therefore optimises caste demography, patterns of division of labour and communication systems at the colony level. For example, colonies that employ the most effective recruitment system to retrieve food, or that exhibit the most powerful colony defence against enemies and predators, will be able to raise the largest number of reproductive females and males each year and thus will have the greatest fitness within the population of colonies.

# Google Scholar Irregularities

Google Scholar is definitely missing citations to my papers.

The cited-by results for “Some Negative Remarks on Operational Approaches to Quantum Theory” [arXiv:1401.7254] on Google Scholar and on INSPIRE are completely nonoverlapping. Google Scholar can tell that “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems” [arXiv:1409.4708] cites “Eco-Evolutionary Feedback in Host–Pathogen Spatial Dynamics” [arXiv:1110.3845] but not that it cites “My Struggles with the Block Universe” [arXiv:1405.2390]. Meanwhile, the SAO/NASA Astrophysics Data System catches both.

This would be a really petty thing to complain about, if people didn’t seemingly rely on such metrics.

EDIT TO ADD (17 November 2014): Google Scholar also misses that David Mermin cites MSwtBU in his “Why QBism is not the Copenhagen interpretation and what John Bell might have thought of it” [arXiv:1409.2454]. This may have something to do with Google Scholar being worse at detecting citations in footnotes than in endnotes.

# Multiscale Structure via Information Theory

We have scienced:

B. Allen, B. C. Stacey and Y. Bar-Yam, “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems” [arXiv:1409.4708].

We develop a general formalism for representing and understanding structure in complex systems. In our view, structure is the totality of relationships among a system’s components, and these relationships can be quantified using information theory. In the interest of flexibility we allow information to be quantified using any function, including Shannon entropy and Kolmogorov complexity, that satisfies certain fundamental axioms. Using these axioms, we formalize the notion of a dependency among components, and show how a system’s structure is revealed in the amount of information assigned to each dependency. We explore quantitative indices that summarize system structure, providing a new formal basis for the complexity profile and introducing a new index, the “marginal utility of information”. Using simple examples, we show how these indices capture intuitive ideas about structure in a quantitative way. Our formalism also sheds light on a longstanding mystery: that the mutual information of three or more variables can be negative. We discuss applications to complex networks, gene regulation, the kinetic theory of fluids and multiscale cybernetic thermodynamics.
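The "longstanding mystery" mentioned in the abstract is easy to see in a toy case. The sketch below assumes Shannon entropy and the standard inclusion-exclusion definition of three-variable mutual information (co-information). It is not the formalism of the paper itself, and the helper names are made up for illustration. The example takes two independent fair bits and their XOR.

```python
from itertools import product
from math import log2

def entropy(joint, keep):
    """Shannon entropy in bits of the marginal over the positions in `keep`."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def co_information(joint):
    """I(X;Y;Z) = sum of single entropies - pair entropies + triple entropy."""
    def H(*keep):
        return entropy(joint, keep)
    return (H(0) + H(1) + H(2)
            - H(0, 1) - H(0, 2) - H(1, 2)
            + H(0, 1, 2))

# X and Y are independent fair bits; Z = X XOR Y.
xor_joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}
print(co_information(xor_joint))  # -1.0
```

Any two of the three bits are independent, but any two together determine the third, and that purely three-way dependency is what shows up as the negative value.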

There’s much more to do, but for the moment, let this indicate my mood:

# 10 LINKS 20 GOTO 10

My “Worked Physics Homework Problems” book now stands at 372 pages. If you ever wonder what I do instead of meeting people.