Category Archives: Evolution

Two Recent Items Concerning Wikipedia

A few years ago, I found a sentence in a Wikipedia page that irritated me so much, I wrote a 25-page article about it. Eventually, I got that article published in the Philosophical Transactions of the Royal Society. On account of all this, friends and colleagues sometimes send me news about Wikipedia, or point me to strange things they’ve found there. A couple such items have recently led me to Have Thoughts, which I share below.

This op-ed on the incomprehensibility of Wikipedia science articles puts a finger on a real problem, but its attempt at explanation assumes malice rather than incompetence. Yes, Virginia, the science and mathematics articles are often baffling and opaque. The Vice essay argues that the writers of Wikipedia’s science articles use the incomprehensibility of their prose as a shield to keep out the riffraff and maintain the “elite” status of their subject. I don’t buy it. In my opinion, this hypothesis does not account for the intrinsic difficulty of explaining science, nor for the incentive structures at work. Wikipedia pages grow by bricolage, small pieces of cruft accumulating over time. “Oh, this thing says [citation needed]. I’ll go find a citation to fill it in, while my coffee is brewing.” This is not conducive to clean pedagogy, or to a smooth transition from general-audience to specialist interest.

Have no doubt that a great many scientists are terrible at communication, but we can also imagine a world in which Wikipedia would attract the scientists that actually are good at communication.

There’s communication, and then there’s communication. (We scientists usually get formal training in neither.) I know quite a few scientists who are good at outreach. They work hard at it, because they believe it matters and they know that’s what it takes. Almost none of them have ever mentioned editing Wikipedia (even the one who used his science blog in his tenure portfolio). Thanks to the pressures of academia, the calculation always favors a mode of outreach where it’s easier to point to what you did, so you can get appropriate credit for it.

Thus, there might be a momentary impulse to make small-scale improvements, but there’s almost no incentive to effect changes that are structured on a larger scale — paragraphs, sections, organization among articles. This is a good incentive system for filling articles with technical minutiae, like jelly babies into a bag, but it’s not a way to plan a curriculum.

The piece in Vice says of a certain physics article,

I have no idea who the article exists for because I’m not sure that person actually exists: someone with enough knowledge to comprehend dense physics formulations that doesn’t also already understand the electroweak interaction or that doesn’t already have, like, access to a textbook about it.

You’d be surprised. It’s fairly common to remember the broad strokes of a subject but need a reference for the fiddly little details.

Writers don’t just dip in, produce some Wikipedia copy, and bounce.

I’m pretty sure this is … actually not borne out by the data? Like, many contributors just add little bits when they are strongly motivated, while the smaller active core of persistent editors cleans up the content, gets involved in article-improvement drives, wrangles behind the scenes, etc.

[EDIT TO ADD (24 November): To say it another way, the distributions of edits per article and of edits per editor are both “fat tailed, which implies that even editors and articles with small numbers of edits should not be neglected.” Furthermore, most edits do not change an article’s length, or change it by only a small amount. The seeming tendency for “fewer editors gaining an ever more dominant role” is a real concern, but I doubt the opacity of technical articles is itself a tool of oligarchy. Indeed, I suspect that other factors contribute to the “core editor” group becoming more insular, one being the ease with which policies originally devised for good reasons can be weaponized.]

If you want “elitism,” you shouldn’t look in the technical prose on the project’s front end. Instead, you should go into the backroom. From what I’ve seen and heard, it’s very easy to run afoul of an editor who wants to lord over their tiny domain, and who will sling around policies and abbreviations and local jargon to get their way. Any transgression, or perceived transgression, is an excuse to revert.

Just take a look at “WP:PROF” — the “notability guideline” for evaluating whether a scholar merits a Wikipedia page. It’s almost 3500 words, laying out criteria and then expounding upon their curlicues. And if you create an article and someone else decides it should be deleted, you had better be familiar with the Guide to deletion (roughly 6700 words), which overlaps with the Deletion process documentation (another 4700 words). More than enough regulations for anyone to petulantly sling around until they get their way!

And on the subject of deletion, over on Mastodon the other day I got into a chat about the story of Günter Bechly, a paleontologist who went creationist and whose Wikipedia page was recently toasted. The incident was described by Haaretz thusly:

If Bechly’s article was originally introduced due to his scientific work, it was deleted due to his having become a poster child for the creationist movement.

I strongly suspect that it would have been deleted if it had been brought to anyone’s attention for any other reason, even if Bechly hadn’t gone creationist. His scientific work just doesn’t add up to what Wikipedia considers “notability,” the standard codified by the WP:PROF rulebook mentioned above. Nor were there adequate sources to write about his career in Wikipedia’s regulation flat, footnoted way. The project is clearly willing to have articles on creationists, if the claims in them can be sourced to their standards of propriety: Just look at their category of creationists! Bechly’s problem was that he was only mentioned in passing or written up in niche sources that were deemed unreliable.

If you poke around that deletion discussion for Bechly’s page, you’ll find it links to a rolling list of such discussions for “Academics and educators,” many of whom seem to be using Wikipedia as a LinkedIn substitute. It’s a mundane occurrence for the project.

And another thing about the Haaretz article. It mentions sockpuppets arriving to speak up in support of keeping Bechly’s page:

These one-time editors’ lack of experience became clear when they began voting in favor of keeping the article on Wikipedia – a practice not employed in the English version of Wikipedia since 2016, when editors voted to exchange the way articles are deleted for a process of consensus-based decision through discussion.

Uh, that’s been the rule since 2005 at least. Not the most impressive example of Journalisming.

Simple Equations are No Good When the Variables are Meaningless

A few weeks back, I reflected on why mathematical biology can be so hard to learn—much harder, indeed, than the mathematics itself would warrant.

The application of mathematics to biological evolution is rooted, historically, in statistics rather than in dynamics. Consequently, a lot of model-building starts with tools that belong, essentially, to descriptive statistics (e.g., linear regression). This is fine, but then people turn around and discuss those models in language that implies they have constructed a dynamical system. This makes life quite difficult for the student trying to learn the subject by reading papers! The problem is not the algebra, but the assumptions; not the derivations, but the discourse.

Recently, a colleague of mine, Ben Allen, coauthored a paper that clears up one of the more confusing points.

Hamilton’s rule asserts that a trait is favored by natural selection if the benefit to others, $B$, multiplied by relatedness, $R$, exceeds the cost to self, $C$. Specifically, Hamilton’s rule states that the change in average trait value in a population is proportional to $BR - C$. This rule is commonly believed to be a natural law making important predictions in biology, and its influence has spread from evolutionary biology to other fields including the social sciences. Whereas many feel that Hamilton’s rule provides valuable intuition, there is disagreement even among experts as to how the quantities $B$, $R$, and $C$ should be defined for a given system. Here, we investigate a widely endorsed formulation of Hamilton’s rule, which is said to be as general as natural selection itself. We show that, in this formulation, Hamilton’s rule does not make predictions and cannot be tested empirically. It turns out that the parameters $B$ and $C$ depend on the change in average trait value and therefore cannot predict that change. In this formulation, which has been called “exact and general” by its proponents, Hamilton’s rule can “predict” only the data that have already been given.

(PDF)
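To see why a regression-based rule can't fail, here is a minimal sketch in Python. It is my own illustration, not code from the paper. I am assuming, as I understand that formulation, that $-C$ and $B$ are the least-squares coefficients of fitness on one's own trait and on a partner's trait, and that $R$ is the regression of partner trait on own trait. The names `regression_hamilton`, `g`, `h` and `w` are hypothetical, and the "fitnesses" are random numbers with no connection to the traits at all.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def regression_hamilton(g, h, w):
    """Fit w ~ a - C*g + B*h by least squares; return (B, C, R).

    Assumed definitions (see lead-in): g = own trait, h = partner's
    trait, w = fitness, R = cov(g, h) / var(g).
    """
    vg, vh, cgh = cov(g, g), cov(h, h), cov(g, h)
    cwg, cwh = cov(w, g), cov(w, h)
    det = vg * vh - cgh ** 2
    # Solve the 2x2 normal equations for the two slopes.
    beta_g = (cwg * vh - cwh * cgh) / det
    beta_h = (cwh * vg - cwg * cgh) / det
    return beta_h, -beta_g, cgh / vg

random.seed(0)
n = 50
g = [random.random() for _ in range(n)]  # own trait values
h = [random.random() for _ in range(n)]  # partner trait values
w = [random.random() for _ in range(n)]  # fitnesses: pure noise

B, C, R = regression_hamilton(g, h, w)
lhs = (B * R - C) * cov(g, g)
# By the Price equation, cov(w, g) is mean fitness times the change in
# mean trait, provided offspring inherit the parent's trait exactly.
rhs = cov(w, g)
print(abs(lhs - rhs) < 1e-9)
```

The two sides agree for any data set, noise included, because the coefficients are fitted to the same fitness values that determine the change. That is the sense in which the rule "predicts" only what it has already been given.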

Reflecting on Confusion

While I was writing Multiscale Structure in Eco-Evolutionary Dynamics, I found myself having a frustrating time reading through big chunks of the relevant literature. The mathematics in the mathematical biology was easier than a lot of what I’d had to deal with in physics, but the arguments were hard to follow. At times, it was even difficult to tell what was being argued about. A blog post by John Baez, on “biology as information dynamics,” called this frustration back to mind—not because it was unclear itself, but rather because it touched on the source of the fog.

I think the basic cause of the trouble is the following:

The application of mathematics to biological evolution is rooted, historically, in statistics rather than in dynamics. Consequently, a lot of model-building starts with tools that belong, essentially, to descriptive statistics (e.g., linear regression). This is fine, but then people turn around and discuss those models in language that implies they have constructed a dynamical system. This makes life quite difficult for the student trying to learn the subject by reading papers! The problem is not the algebra, but the assumptions. And that always makes for a thorny situation.

Good News if You’re an Evil Prof, Though

This is entertaining:

Let’s say you tell your students that arm folding is a genetic trait, with the allele for right forearm on top (R) being dominant to left forearm on top (L). Results from a large number of studies show that about 11 percent of your students will be R children of two L parents; if they understand the genetics lesson correctly, they will think that either they were secretly adopted, or Mom was fooling around and Dad isn’t their biological father. More of your students will reach this conclusion with each bogus genetic trait that you add to the lesson. I don’t think this is a good way to teach genetics.

Via PZ Myers, who is teaching genetics this semester and has an interest in getting it right.
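The logic behind the mischief can be sketched in a few lines of Python. This is only an illustration of the textbook single-gene, simple-dominance model that the quoted passage assumes for the sake of argument; the function names are my own.

```python
from itertools import product

def phenotype(genotype):
    """R is assumed dominant: one R allele is enough to show R."""
    return "R" if "R" in genotype else "L"

def offspring_phenotypes(parent1, parent2):
    """Phenotypes possible for children of two parental genotypes."""
    return {phenotype(a + b) for a, b in product(parent1, parent2)}

# Under simple dominance, an L-on-top parent must have genotype "LL".
print(offspring_phenotypes("LL", "LL"))  # {'L'}
```

Two L parents can only have L children in this model, so the roughly 11 percent of R children with two L parents contradict the model itself, not anybody's parentage.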

On “Invention”

When I was a little younger than Ahmed Mohamed is now, I invented the distance formula for Cartesian coordinates. I wanted to make a simulation of bugs that ran around and ate each other. To implement a rule like “when the predator is near the prey, it will chase the prey,” I needed to compute distances between points given their $x$- and $y$-coordinates. I knew BASIC, and I knew the Pythagorean Theorem. However many people had solved that before me, it wasn’t written down in any book that I had, so I took what I knew and figured it out.
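For the record, the rediscovery amounts to something like the following. This is a sketch in Python rather than the BASIC I actually wrote, and `should_chase` and its `sight_radius` parameter are made up here for illustration.

```python
import math

def distance(x1, y1, x2, y2):
    """Straight-line distance between (x1, y1) and (x2, y2),
    from the Pythagorean Theorem."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

def should_chase(predator, prey, sight_radius=5.0):
    """Hypothetical rule: chase when the prey is within sight_radius."""
    return distance(*predator, *prey) <= sight_radius

print(distance(0, 0, 3, 4))          # 5.0
print(should_chase((0, 0), (3, 4)))  # True
```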

Those few pages of PowerBASIC on MS-DOS never amounted to much by themselves, but simulating ecosystems remained an interest of mine. I returned to the general idea now and then as I learned more.

And then, hey, what’s this? It looks like a PhD thesis.

“I bet every great mathematician started by
rediscovering a bunch of ‘well known’ results.”
—Donald Knuth, Surreal Numbers

Multiscale Structure in Eco-Evolutionary Dynamics

I finally have my thesis in a shape that I feel like sharing. Yes, this took over three months after my committee gave their approval. Blame my desire to explain the background material, and the background to the background….

In a complex system, the individual components are neither so tightly coupled or correlated that they can all be treated as a single unit, nor so uncorrelated that they can be approximated as independent entities. Instead, patterns of interdependency lead to structure at multiple scales of organization. Evolution excels at producing such complex structures. In turn, the existence of these complex interrelationships within a biological system affects the evolutionary dynamics of that system. I present a mathematical formalism for multiscale structure, grounded in information theory, which makes these intuitions quantitative, and I show how dynamics defined in terms of population genetics or evolutionary game theory can lead to multiscale organization. For complex systems, “more is different,” and I address this from several perspectives. Spatial host–consumer models demonstrate the importance of the structures which can arise due to dynamical pattern formation. Evolutionary game theory reveals the novel effects which can result from multiplayer games, nonlinear payoffs and ecological stochasticity. Replicator dynamics in an environment with mesoscale structure relates to generalized conditionalization rules in probability theory.

The idea of natural selection “acting at multiple levels” has been mathematized in a variety of ways, not all of which are equivalent. We will face down the confusion, using the experience developed over the course of this thesis to clarify the situation.

(PDF, arXiv:1509.02958)

My Year in Publications

This is, apparently, a time for reflection. What have I been up to?

And so this is Korrasmas
Things have been Done
Kuvira is fallen
A new ‘ship just begun

Kor-ra-sa-mi
We all knew it
Kor-ra-sa-mi
now-ow-ow-owwwwwww

Well, other than watching cartoons?

At the very beginning of 2014, I posted a substantial revision of “Eco-Evolutionary Feedback in Host–Pathogen Spatial Dynamics,” which we first put online in 2011 (late in the lonesome October of my most immemorial year, etc.).

In January, Chris Fuchs and I finished up an edited lecture transcript, “Some Negative Remarks on Operational Approaches to Quantum Theory.” My next posting was a solo effort, “SIC-POVMs and Compatibility among Quantum States,” which made for a pretty good follow-on, and picked up a pleasantly decent number of scites.

Then, we stress-tested the arXiv.

By mid-September, Ben Allen, Yaneer Bar-Yam and I had completed “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems,” a work very long in the cooking.

Finally, I rang in December with “Von Neumann was Not a Quantum Bayesian,” which demonstrates conclusively that I can write 24 pages with 107 references in response to one sentence on Wikipedia.

Lacking Tonka

Dawkins claims that Hölldobler has “no truck with group selection”. Wilson and Hölldobler (2005) proposes, in the first sentence of its abstract, that “group selection is the strong binding force in eusocial evolution”. Later, Hölldobler (with Reeve) voiced support for the “trait-group selection and individual selection/inclusive fitness models are interconvertible” attitude. Hölldobler’s book with Wilson, The Superorganism: The Beauty, Elegance, and Strangeness of Insect Societies (2008), maintains this tone. Quoting from page 35:

It is important to keep in mind that mathematical gene-selectionist (inclusive fitness) models can be translated into multilevel selection models and vice versa. As Lee Dugatkin, Kern Reeve, and several others have demonstrated, the underlying mathematics is exactly the same; it merely takes the same cake and cuts it at different angles. Personal and kin components are distinguished in inclusive fitness theory; within-group and between-group components are distinguished in group selection theory. One can travel back and forth between these theories with the point of entry chosen according to the problem being addressed.

This is itself a curtailed perspective, whose validity is restricted to a narrow class of implementations of the “multilevel selection” idea. (Yeah, the terminology in this corner of science is rather confused, which doesn’t make talking about it easier.) Regardless, I cannot think of a way in which this can be construed as having “no truck with group selection”. The statement “method A is no better or worse than method B” is a far cry from “method A is worthless and only method B is genuinely scientific”.

If Dawkins has some personal information to which the published record is not privy, that’s fine, but even if that were the case, his statements could not be taken as a fair telling of the story.

EDIT TO ADD (21 November 2014): I forgot this 2010 solo-author piece by Hölldobler, in a perspective printed in Social Behaviour: Genes, Ecology and Evolution (T. Székely et al., eds). Quoting from page 127:

I was, and continue to be, intrigued by the universal observation that wherever social life in groups evolved on this planet, we encounter (with only a few exceptions) a striking correlation: the more tightly organized within-group cooperation and cohesion, the stronger the between-group discrimination and hostility. Ants, again, are excellent model systems for studying the transition from primitive eusocial systems, characterized by considerable within-group reproductive competition and conflict, and poorly developed reciprocal communication and cooperation, and little or no between-group competition, on one side, to the ultimate superorganisms (such as the gigantic colonies of the Atta leafcutter ants) with little or no within-group conflict, pronounced caste systems, elaborate division of labour, complex reciprocal communication, and intense between-group competition, on the other side (Hölldobler & Wilson 2008 [the book quoted above]).

And, a little while later, on p. 130:

In such advanced eusocial organisations the colony effectively becomes a main target of selection […] Selection therefore optimises caste demography, patterns of division of labour and communication systems at the colony level. For example, colonies that employ the most effective recruitment system to retrieve food, or that exhibit the most powerful colony defence against enemies and predators, will be able to raise the largest number of reproductive females and males each year and thus will have the greatest fitness within the population of colonies.

Google Scholar Irregularities

Google Scholar is definitely missing citations to my papers.

The cited-by results for “Some Negative Remarks on Operational Approaches to Quantum Theory” [arXiv:1401.7254] on Google Scholar and on INSPIRE are completely nonoverlapping. Google Scholar can tell that “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems” [arXiv:1409.4708] cites “Eco-Evolutionary Feedback in Host–Pathogen Spatial Dynamics” [arXiv:1110.3845] but not that it cites My Struggles with the Block Universe [arXiv:1405.2390]. Meanwhile, the SAO/NASA Astrophysics Data System catches both.

This would be a really petty thing to complain about, if people didn’t seemingly rely on such metrics.

EDIT TO ADD (17 November 2014): Google Scholar also misses that David Mermin cites MSwtBU in his “Why QBism is not the Copenhagen interpretation and what John Bell might have thought of it” [arXiv:1409.2454]. This maybe has something to do with being worse at detecting citations in footnotes than in endnotes.

Multiscale Structure via Information Theory

We have scienced:

B. Allen, B. C. Stacey and Y. Bar-Yam, “An Information-Theoretic Formalism for Multiscale Structure in Complex Systems” [arXiv:1409.4708].

We develop a general formalism for representing and understanding structure in complex systems. In our view, structure is the totality of relationships among a system’s components, and these relationships can be quantified using information theory. In the interest of flexibility we allow information to be quantified using any function, including Shannon entropy and Kolmogorov complexity, that satisfies certain fundamental axioms. Using these axioms, we formalize the notion of a dependency among components, and show how a system’s structure is revealed in the amount of information assigned to each dependency. We explore quantitative indices that summarize system structure, providing a new formal basis for the complexity profile and introducing a new index, the “marginal utility of information”. Using simple examples, we show how these indices capture intuitive ideas about structure in a quantitative way. Our formalism also sheds light on a longstanding mystery: that the mutual information of three or more variables can be negative. We discuss applications to complex networks, gene regulation, the kinetic theory of fluids and multiscale cybernetic thermodynamics.
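The negative mutual information mentioned there is easy to see in the standard parity example, sketched below in Python. This is an illustration and not code from the paper. I use Shannon entropy and the inclusion-exclusion sign convention for the three-variable quantity; other conventions flip the sign. The helper names `entropy` and `H` are my own.

```python
from collections import Counter
from itertools import product
from math import log2

def entropy(samples):
    """Shannon entropy (in bits) of equally weighted samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(c / total * log2(c / total) for c in counts.values())

# Four equally likely states: X and Y are fair coins, Z = X XOR Y.
states = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def H(*idx):
    """Joint entropy of the variables at the given positions."""
    return entropy([tuple(s[i] for i in idx) for s in states])

# Three-variable mutual information by inclusion-exclusion.
mutual_info_3 = (H(0) + H(1) + H(2)
                 - H(0, 1) - H(0, 2) - H(1, 2)
                 + H(0, 1, 2))
print(mutual_info_3)  # -1.0
```

Any two of the three bits are independent, yet any two determine the third, and that purely three-way dependency is what shows up as minus one bit.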

There’s much more to do, but for the moment, let this indicate my mood: