Of Two Time Indices

In the appendix to a paper I am currently co-authoring, I recently wrote the following within a parenthetical excursus:

When talking of dynamical systems, our probability assignments really carry two time indices: one for the time our betting odds are chosen, and the other for the time the bet concerns.

A parenthesis in an appendix is already a pretty superfluous thing. Treating this as the jumping-off point for further discussion merits the degree of obscurity which only a lengthy post on a low-traffic blog can afford.


Quantum mechanics provides statistical predictions for the results of measurements performed on physical systems that have been prepared in specified ways … I hope that everyone agrees at least with that statement. The only question here is whether there is more than that to say about quantum mechanics.
Asher Peres

In this note, I shall take a rather strictly ascetic view of quantum physics. I’ll make the lifestyle choice that “quantum states” are encodings of probability assignments for the possible outcomes of as-yet unperformed experiments. Nothing more, but certainly nothing less. The image which floats hazily to mind is of the great unanalyzed diversity of the world teeming on, and when pieces of it come together, novelty happens: there takes place an act of creative generation which belongs to neither participant alone. Quantum theory is, following this lifestyle, a means of coping with and possibly even living well within a world having such a character. It studies the special case of novelty-generating interactions in which one participant is a scientific agent, a complex system capable of sustaining beliefs and entertaining them with varied degrees of fervency.

This leads me to another touchy question of lifestyle choice: personalist probability theory. The lifestyle starts with the idea of an agent which can believe things, and to these beliefs—which are arrayed in well-defined sets—we say the agent can attach a quantitative expression of its fervency in that direction. We impose the normative rule that the agent’s measures of credence be consistent with one another, and we make the dull matter of consistency more entertaining with stories about Dutch bookies or Ferengi bartenders. We say, and I think this is essentially a historical convention, that higher numbers should mean a stronger belief, even though we could just as well say that an agent writing a larger number means that the agent will be more surprised if that event turns out to happen. (Thanks to Claude Shannon, we know that the one convention is just the logarithm of the other.) Even the use of real numbers for degrees of fervency is, to my eye, a convention: if somebody wants to record credences using Conway’s surreal numbers, or with elements from some Lie group, all I can say is “peace be with you in your quest”. The normative standard of consistency will still yield constraints among credence assignments, the difference being that those credences won’t live in the same set as relative frequencies do. And, of course, there’s no guarantee that the exercise will lead to any useful novel structures.
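That normative standard of consistency can be made concrete in a few lines of arithmetic. Here is a minimal sketch of the Dutch-book argument, with invented prices: if an agent's prices for a ticket on an event and a ticket on its negation don't sum to one, a bookie can lock in a profit no matter which way the world turns out.

```python
# Minimal Dutch-book sketch (the prices are made up for illustration):
# an agent prices a $1 ticket on event A at p_A and a $1 ticket on
# not-A at p_notA. Exactly one of the two tickets pays out.

def sure_profit(p_A, p_notA):
    """Bookie's guaranteed profit per pair of $1 tickets."""
    total = p_A + p_notA
    if total > 1:
        # Sell both tickets: collect `total`, pay out exactly $1,
        # since one of A, not-A must happen.
        return total - 1
    elif total < 1:
        # Buy both tickets: pay `total`, collect exactly $1.
        return 1 - total
    return 0.0  # coherent prices: no sure profit either way

# Incoherent prices: the agent is too confident of both A and not-A.
profit = sure_profit(0.7, 0.6)
```

The profit is independent of whether the event happens, which is the whole point: incoherence is exploitable before a single outcome is observed.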

The theory of subjective event weights built up in this way just is, in the same way that Euclidean geometry just is. It provides an imitation of Platonism in the way that all abstract constructions built up from axioms do. But we want to deal with the natural world our species evolved within! Ferengi-book coherence tells us that subjective event weights obey the axioms of “probability”. The natural question is, in those places where we scientists make use of “probability” talk, can we handle those tasks with SEWs? The claim of the Bayesian-statistics practitioner is that the answer is “yes”—and in those cases where it isn’t, the invocation of “probability” is what’s illegitimate, rather than the theory of SEWs.

In many circumstances of interest, quantum theory can be re-expressed solely in terms of SEWs. Though quantum theory is often (and validly) thought of as a generalization of ordinary probability theory to encompass a wider bestiary of mathematical structures, it can also be treated as a specialization of probability theory.

In quantum physics, we take what we think we know about a system, roll it into a density operator $\rho$, and use that density operator to make statistical predictions about what the system might do in particular experiments. But presenting that information as an operator is not always the most illuminating choice. We can actually rewrite any finite-dimensional density matrix as a probability distribution, using the idea of informationally complete measurements. These are generalized measurement procedures (positive operator valued measures, or POVMs) with an appealing property: given a probability distribution over the possible outcomes of an informationally complete POVM, we can compute all the statistics which we could have gotten using the density matrix. Such POVMs can be constructed in any finite-dimensional Hilbert space. The nicest variety are the symmetric informationally complete POVMs, familiarly known as SICs, which have been shown to exist for many values of the Hilbert-space dimension and are suspected to exist for the rest. With these tools in hand, quantum theory becomes probability plus extra conditions: the bare bones of “SEW theory” are dressed with sinews whose anatomy depends on our doing quantum physics instead of some other theory.
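For concreteness, here is a numerical sketch of that rewriting for a single qubit, using the standard tetrahedral SIC; the state and all the numbers are illustrative, chosen for this post rather than taken from anywhere in particular.

```python
import numpy as np

# A qubit SIC: four rank-1 projectors whose Bloch vectors form a
# regular tetrahedron. The POVM elements are the projectors over d = 2.

I2 = np.eye(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

bloch = np.array([[1, 1, 1], [1, -1, -1],
                  [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)

Pi = [0.5 * (I2 + sum(r[k] * sigma[k] for k in range(3))) for r in bloch]
povm = [P / 2 for P in Pi]

# An illustrative density matrix: partly mixed, polarized along z.
rho = 0.5 * (I2 + 0.6 * sigma[2])

# The probabilities for the four SIC outcomes...
p = np.array([np.trace(rho @ E).real for E in povm])

# ...carry the same information as rho itself:
# rho = sum_i [ (d+1) p_i - 1/d ] Pi_i, with d = 2 here.
rho_back = sum((3 * p[i] - 0.5) * Pi[i] for i in range(4))
```

The four numbers $p_i$ are an ordinary probability distribution, yet they determine $\rho$ completely: that is "informational completeness" in action.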

With all that as prologue, then:

I have occasionally bumped into people who seemingly want to interpret all scientific discovery as Bayesian conditioning. I think it was Howard Barnum who said something about seeing science in “broadly Bayesian” terms, but judging from the “cocktail talk” and how people act in some corners of the Internet, not everyone would grant that “broadly”. New experiences always being weighed against our preconceived mesh of ideas? Yes. Holding to different ideas with varying degrees of tenacity? Yes. The clockwork ratcheting up and down of numerical fervencies defined over a “distinct number of consequences”? I doubt it. Even if such a story could be cooked up, involving a stupendously baroque and artificial sample space, what use would it have?

(A better term than “broadly Bayesian” might be “Darwinian”, or better yet just “evolutionary”. It’s been observed a few times that Bayesian updating is formally analogous to a formula in evolutionary theory called the discrete-time replicator equation. Prior probabilities map onto abundances of alleles in a gene pool, and the weight of new evidence maps onto biological fitness. The probability distribution after updating corresponds to the new gene pool composition after natural selection has operated. (Marc Harper describes it more fully in arXiv:0911.1763.) But the scenario modeled by the discrete-time replicator equation is just a tiny part of evolutionary phenomena. It’s even a tiny part of the mathematics developed to date for dealing with biological evolution. Other evolutionary processes could define other modes of inference whose mapping to Bayesian updating is contrived and awkward at best. For a relatively mundane example, I think Jeffrey conditioning is analogous to a discrete-time replicator equation with a mutation effect added.)
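The formal analogy Harper describes is a one-liner to check numerically. In this sketch (with made-up numbers), Bayesian conditioning and one generation of the discrete-time replicator dynamics are literally the same arithmetic:

```python
import numpy as np

# Illustrative numbers only: three hypotheses / three alleles.
prior = np.array([0.5, 0.3, 0.2])        # prior, or allele abundances
likelihood = np.array([0.1, 0.4, 0.9])   # P(data | hypothesis), or fitnesses

# Bayes' rule: posterior proportional to prior times likelihood.
posterior = prior * likelihood / (prior * likelihood).sum()

# Discrete-time replicator: x_i' = x_i f_i / (mean fitness).
mean_fitness = (prior * likelihood).sum()
next_gen = prior * likelihood / mean_fitness
```

The normalizing constant in Bayes' theorem plays the role of the population's mean fitness; the two update rules differ only in interpretation.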

What this might mean for quantum theory is the following: were one enamored of a different physics-neutral mode of inference, the Born rule would be an empirical addition to that mathematics, phrased in its terms. (And, I’d guess, probably harder to translate into the vernacular of physics.) There is your realism for you: the extra addition to coherence due to the quantum character of the world must still be there. We’d just be writing down and trying to motivate a different equation for it.


The Schrödinger equation in QM and the Liouville equation in classical mechanics are, I think, fundamentally synchronic statements about probability assignments. If I’m willing to gamble that a pendulum has position within some range $\Delta q$ and momentum within some interval $\Delta p$, and if I accept the mechanics lessons I had as a young’un, then my hands are forced: I must price lottery tickets referring to the pendulum at other times in a particular way. Each probability assignment concerning a dynamical system carries two time indices: one for the time the agent makes it, and one for the time of the event or proposition written on the ticket. We could write the former with a subscript and the latter in parentheses, for example:

$ \rho_\tau(q,p,t) $

would be a Liouville density for a one-particle system, a commitment made at time $\tau$ about the world’s affairs at time $t$. The Liouville equation connects $\rho_\tau(q,p,t_1)$ to $\rho_\tau(q,p,t_2)$. The subscripts are the same; it’s a synchronic statement. If I get new information at time $\tau_2$, then I can update the whole joint probability density for all times by conditioning on that new information.
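As a toy illustration of the two-index notation, here is a sketch in which a time-$\tau$ density $\rho_\tau$ is represented by a cloud of sample points for a unit-frequency harmonic oscillator; all the numbers are invented. Hamiltonian flow rigidly rotates phase space, so the commitment about $t_1$ fixes the commitment about $t_2$, and Liouville's theorem keeps the phase-space volume of the cloud constant:

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(points, dt):
    """Exact Hamiltonian flow for H = (p^2 + q^2)/2: rotate (q, p)."""
    c, s = np.cos(dt), np.sin(dt)
    q, p = points[:, 0], points[:, 1]
    return np.column_stack([c * q + s * p, -s * q + c * p])

# rho_tau(q, p, t1): my time-tau beliefs about the pendulum at t1,
# a Gaussian cloud in phase space (illustrative spreads).
cloud_t1 = rng.normal([1.0, 0.0], [0.3, 0.1], size=(100000, 2))

# rho_tau(q, p, t2): forced on me by the dynamics. Same subscript tau,
# different argument t -- a synchronic statement.
cloud_t2 = evolve(cloud_t1, dt=0.7)

# Liouville's theorem in miniature: the determinant of the covariance
# matrix (a proxy for occupied phase-space volume) is unchanged.
vol_t1 = np.linalg.det(np.cov(cloud_t1.T))
vol_t2 = np.linalg.det(np.cov(cloud_t2.T))
```

Only new information arriving at some $\tau_2$ changes the subscript; the dynamics alone never can.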

If I keep gaining new information, say at times $\tau_i$, then the Shannon entropy of my Liouville density for the present time will keep decreasing. By evolving my Liouville density at time $\tau_i$ forwards and backwards, I can refine my distributions for what the pendulum will be doing in the future and what it had been doing in the past. Entropy decreasing over time? Yegads! We must be in contradiction with thermodynamics!
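That entropy decrease is no accident of the example: conditioning can only lower Shannon entropy on average, since the conditional entropy $H(X|Y)$ never exceeds $H(X)$. A minimal discrete sketch, with an invented coarse pendulum state $X$ and noisy reading $Y$:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Illustrative model: prior over three coarse states X, and the
# likelihood of three possible readings Y given each state.
p_x = np.array([0.5, 0.3, 0.2])
p_y_given_x = np.array([[0.8, 0.1, 0.1],   # rows: x, columns: y
                        [0.1, 0.8, 0.1],
                        [0.2, 0.2, 0.6]])

p_xy = p_x[:, None] * p_y_given_x          # joint distribution
p_y = p_xy.sum(axis=0)                     # marginal over readings

prior_H = entropy(p_x)
# Expected post-measurement entropy = conditional entropy H(X | Y).
post_H = sum(p_y[j] * entropy(p_xy[:, j] / p_y[j]) for j in range(3))
```

Individual readings can surprise me into a broader distribution, but averaged over what I expect to see, my uncertainty only shrinks.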

. . . except that by that reasoning, the thermodynamic entropy of any classical system we can characterize exactly must be zero, because the Shannon entropy of its Liouville density is nil! Indeed, the thermodynamic entropy of any simulated system must be exactly zero, because the computer doing the simulation knows where everything is and where all the pieces are going at all times!

Yeah, that’s pretty silly.

What it does mean is that I can extract energy more and more efficiently, and that I can make better and better predictions of how somebody else’s energy-extraction experiment will fare. (And a typical such experiment might well involve timescales significantly longer than those of the oscillations and vibrations within my system itself, so what matters isn’t where it can be in phase space at one particular time, but rather the possible variety in its trajectories over an interval of time—what I think in Jaynesian language is called its “caliber”.)

The way statistical-physics students are taught to relate Shannon entropy with thermodynamic entropy is through the “fundamental assumption of statistical mechanics”: we’re told to assign equal a priori probabilities to all points in phase space which have the same energy. From this starting point, one makes computations until relationships which look like the phenomenological equations of thermodynamics come out. But the starting point is one which should make a personalist Bayesian’s skin crawl! Who mandates a prior probability, and who died and made them king?

It’s only reasonable at all because of an even more fundamental assumption: that we obtained all our information from highly coarse-grained measurements which could only access aggregate properties like the total energy. If we could make finer measurements in the first place, then we’d have no warrant to assign equal a priori probabilities across the constant-energy surfaces, and we’d have to rethink the “Shannon to thermodynamics” connection in a different way.

It [the Second Law of Thermodynamics] is not a law that dictates how things go by themselves, but rather how they go in response to particular experimental investigations.
Campisi and Hänggi (2011)


In order to have any scientific weight, retrodictions have to yield predictions. Feynman has a bit in The Character of Physical Law where he explains how geologists “talk about the past by talking about the future”. If you dig in the ground where nobody has dug before—if you perform an as-yet-unperformed experiment—you’ll find fossil bones of the predicted kind. If a statement about the Earth’s geological past can’t be made to yield predictions about the consequences of new digging, we’re not talking about dinosaurs anymore—we’re talking about the invisible dragons which live in my garage. It’s fine to write a Bayesian probability $p(v)$ for the speed of the Chicxulub impactor, but if we can’t turn that into expectations for new investigations, why should anyone care?

So, what if we do commit ourselves to the idea that quantum uncertainty is uncertainty about future experiment outcomes? That measurements are not just disturbing, but generative? Then we must conclude that retrodictions are just a nostalgic kind of prediction, statements made thinking of the past which must concern the future to have any scientific content. If we disagree about things which happened in the past and stayed there—so what?! That’s like my housemate and I disagreeing today about the number of eggs laid yesterday by the invisible dragon living in our garage. Possible agreement in the future about experiments which can be done in the future—that’s the key.

Among other things, this affects, I believe, the criteria one uses to judge the compatibility of quantum probability assignments. I suspect that accepting agent experiences as the things beliefs are about changes significantly how important one feels various possible kinds of disagreement are. It appears possible in the quantum world that two physicists can be in sufficient discord that, as they perform experiments, their novel experiences bring them into agreement about the future but not about the past. More specifically, suppose Alice and her friend Bob are interested in a quantum system and each plan to receive word of an experiment performed on it. Prior to the experiment, Alice and Bob each make an assignment of probabilities to the possible experiment outcomes, in the form of a density matrix. When the measurement experiment intervenes on the system and coughs up a result, Alice and Bob update their “catalogues of knowledge” accordingly. It can transpire that the two physicists’ post-experiment density matrices agree, but the conditional density matrices which encode their beliefs about the past are incompatible. A classical analogue of this situation would be the following: suppose that Alice and Bob disagree about which side of a die is up. Alice says it’s a 6, Bob says it’s a 1. The die is rolled in a new experiment and comes up 6. Alice and her friend can come to agree about which side is up after the roll, but the new result changes nothing about their earlier disagreement.

I do not know if this is a serious inconvenience to doing science, or to the ascetic view that quantum states are catalogues of probabilities for possible future experimental outcomes. After all, if we hold fast to that position, then a post-measurement conditional density operator for the past spacetime region must be a probability catalogue for counterfactual experiments—interventions into nature which could have been done, but weren’t. Incompatible conditional density operators for spacetime regions in the past are, in this view, arguments over yesterday’s tomorrow, friction without heat. They’re what you get when “measurements” in your world are, not just disturbances, but acts of generation.

(Leifer and Spekkens write that “because nontrivial quantum measurements always entail a disturbance […] coming to agreement about the state of the region after the measurement does not resolve a disagreement about the state of the region before the measurement.” To which a justifiable response might be, “Yes, and isn’t it wonderful?” More importantly, perhaps, I don’t like the phrasing of “a disagreement about the state of the region before the measurement.” The choice and arrangement of words seem wrong. I don’t know if it’s a matter of accident or of design towards a goal I disagree with. Doesn’t “disagreement about the state of the region” sound too much like, say, “disagreement about the calorie content of the region”—isn’t it just the phrasing one would choose if one believed that “the state” were a property of “the region”? This feels to me like bad language for any psi-epistemist, radically QBic or otherwise. “The states ascribed by two agents disagree” would be a more forceful and less muddling statement than “Two agents disagree about the state,” I think.)

A new slogan: Confusion about the past is the price we pay for a world of genuine novelty. The inability of Alice and her friend to come into agreement about retrodictions into the past beyond a measurement is, in microcosm, our inability to agree about what happened before the Big Bang.

(For a discussion of retrodictions-as-predictions in a cosmological context, see Schiffrin and Wald (2012), section VI. I also think one could productively disagree with Schiffrin and Wald’s discussion of thermal equilibration, at the beginning of Section III. To my eye, invocations of “ergodicity” and “mixing” do not resolve the problem of assigning probability distributions over microstates, but rather defer it. (Which is, to be sure, nonnegligible progress from a physicist’s perspective.) They speak of sampling a system at a “random time” during its dynamical time-evolution, which naturally provokes the question: what do you mean by “random time”? You still need some notion of what probability means to give the whole structure of concepts any content. It’s like the old circular, or rather downward-spiralling, conversation between a Bayesian and a frequentist: “If the probability of the coin landing heads is $p$, does that mean there will be exactly $1000p$ heads in 1000 flips?” “No, the number of heads will likely be close to $1000p$, but not exactly.” “What do you mean, likely? That’s the idea we’re trying to define!” And so forth. You could use large deviation theory to write a formula for how the probability of a deviation falls off with the entropy, but you still need to define probability eventually. That is, you’ve deferred the question—in a sophisticated, quantitative, maybe even useful way!—but not answered it.)
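The large-deviation formula alluded to above can be checked in its simplest instance, a biased-coin version of Sanov's theorem; the numbers below are illustrative. The exact binomial log-probability of a sizable deviation, divided by the number of flips, approaches the relative entropy $D(q\|p)$:

```python
import math

def rel_entropy(q, p):
    """Relative entropy D(q || p) between two coin biases, in nats."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def log_binom_prob(n, k, p):
    """Exact log-probability of k heads in n flips of a p-coin."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

# A fair coin showing a 60% heads fraction over 10,000 flips.
n, p, q = 10000, 0.5, 0.6
exact_rate = -log_binom_prob(n, int(q * n), p) / n
ld_rate = rel_entropy(q, p)
```

The two rates agree to within a correction of order $(\log n)/n$, which is the content of the large-deviation statement: a quantitative law for how deviations become improbable, resting, as the parenthetical above notes, on a notion of probability it does not itself supply.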


The time-reverse of historical science is, in a sense, the issue of “Boltzmann Brains”—you know, complex structures arising from quantum vacuum fluctuations in the distant future of the universe. Supposedly, there should be stupendously more of them in the long (long, long) run than there are beings like we think we are, and from this the cosmologists deduce all sorts of things. E.g., that there exist infinitely many beings who have all the memories of my brain up to 18 April 2012, including everything I’ve seen from Hubble and WMAP, but then in their memories they wake up from a long dream and return to being a green-skinned dancing girl from Orion.

(Alas, that is the fashion these days, populating the loneliness with shadow selves, frozen in branches of a stupefying state vector, floating in bubbles of spacetime frothed up by eternal inflation, or recorded in the memories of poor delusional Boltzmann Brains. Or, if you want to be particularly trendy, all of the above. Every variation of Hitler winning the war and von Stauffenberg’s bomb going off as planned, never the histories to meet. Every book in the Library of Babel a biography, innumerably many times over. Infinite iterations of Zhuangzi dreaming he is a butterfly; infinite butterflies dreaming they are Zhuangzi. Is this what we got into science for?)

But is that actually a meaningful number to compute, with quantum theory as it stands? What’s the point of asking, “What are the potential consequences to me of my experimental intervention into this phenomenon?” if the phenomenon in question is, by definition, inaccessible, both to myself and to my posterity?

I’ve read at least one cosmology person, Tom Banks, saying that Boltzmann-brainology is “silly.” His position was essentially that we can modify our physical theories in an infinite number of ways consistent with all available data, making the same predictions for all conceivable experiments but yielding different numbers of Boltzmann Brains. In addition, any detector sent out to gather data on Boltzmann Brains would itself disintegrate through quantum fluctuations long before it stood a chance of spotting any. . . but I think the issue is more fundamental than that. You’re asking a question which the theory is not prepared to answer! Sometimes, it’s obvious when that happens, like trying to calculate the self-energy of an ideal point electron and getting infinity, but here, people mostly don’t seem to be thinking of that possibility. If they do think the answer is absurd, they try to screw around with general relativity and invent a new cosmology that way.

Maybe physicists are generally accustomed to thinking about “limits to the validity of quantum mechanics”—if they believe any such exist at all—in an unproductive kind of way? Having prematurely excised the active agent, we naturally think “QM might fail for objects larger than the Planck mass” or something like that, rather than “QM is the wrong tool for answering questions divorced from agent experience”.

UPDATE (16 May 2012): I thank Howard Barnum for pointing out a misdirected hyperlink.