There was some water-cooler talk around the office this past week about a paper by Masanes, Galley and Müller that hit the arXiv, and I decided to write up my thoughts about it for ease of future reference. In short, I have no reason yet to think that the math is wrong, but what they present as a condition on *states* seems more naturally to me like a condition on *measurement outcomes.* Upon making this substitution, the Masanes, Galley and Müller result comes much closer to resembling Gleason’s theorem than they say it does.

So, if you’ve been wanting some commentary on quantum mechanics, here goes:

Gleason’s theorem begins with a brief list of postulates, which are conditions for expressing “measurements” in terms of Hilbert spaces. To each physical system we associate a complex Hilbert space, and each measurement corresponds to a resolution of the identity operator — in Gleason’s original version, to an orthonormal basis. The crucial assumption is that the probability assigned to a measurement outcome (i.e., to a vector in a basis) does not depend upon which basis that vector is taken to be part of. The probability assignments are “noncontextual,” as they say. The conclusion of Gleason’s argument is that any mapping from measurements to probabilities that satisfies his assumptions must take the form of the Born rule applied to some density operator. In other words, the theorem gives the set of valid states *and* the rule for calculating probabilities given a state.
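The key assumption is easy to see in a small numerical sketch (my own illustration, not anything from Gleason or MGM). Here a density operator assigns a probability to a ray via the Born rule, and that probability depends only on the ray itself, never on which orthonormal basis the ray is embedded in; Gleason's theorem says this noncontextual form is forced.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random density operator on C^3: Hermitian, positive semidefinite, trace one.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = A @ A.conj().T
rho /= np.trace(rho).real

def born(rho, psi):
    """Born-rule probability tr(rho |psi><psi|) for the outcome psi."""
    return (psi.conj() @ rho @ psi).real

# One fixed unit vector, embedded in two different orthonormal bases.
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
basis_one = [v,
             np.array([1.0, -1.0, 0.0]) / np.sqrt(2),
             np.array([0.0, 0.0, 1.0])]
basis_two = [v,
             np.array([1.0, -1.0, 2.0]) / np.sqrt(6),
             np.array([1.0, -1.0, -1.0]) / np.sqrt(3)]

# The probability of the outcome v is computed from v and rho alone, so
# noncontextuality holds by construction; each basis's probabilities sum to one.
p_v = born(rho, v)
assert 0.0 <= p_v <= 1.0
assert np.isclose(sum(born(rho, b) for b in basis_one), 1.0)
assert np.isclose(sum(born(rho, b) for b in basis_two), 1.0)
```

The nontrivial direction of the theorem runs the other way: any noncontextual probability assignment on rays (in dimension three or more) must arise from some `rho` in this manner.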

(It is significantly easier to prove the POVM version of Gleason’s theorem, in which a “measurement” is not necessarily an orthonormal basis, but rather any resolution of the identity into positive semidefinite operators, $\sum_i E_i = I$. In this case, the result is that any valid assignment of probabilities to measurement outcomes, or “effects,” takes the form $p(E) = {\rm tr}(\rho E)$ for some density operator $\rho$. The math is easier; the conceptual upshot is the same.)
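To make the POVM version concrete, here is a standard small example (the "trine" POVM on a qubit, my choice of illustration): three effects that are not projectors onto an orthonormal basis but still resolve the identity, so $p(E) = {\rm tr}(\rho E)$ yields a valid probability distribution for any density operator.

```python
import numpy as np

# Trine POVM on C^2: three effects E_k = (2/3)|psi_k><psi_k|, with the
# |psi_k> spaced at 120-degree steps around a great circle of the Bloch
# sphere. A resolution of the identity that is not an orthonormal basis.
angles = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]
states = [np.array([np.cos(t / 2), np.sin(t / 2)]) for t in angles]
effects = [(2 / 3) * np.outer(s, s) for s in states]

# The effects are positive semidefinite and sum to the identity.
assert np.allclose(sum(effects), np.eye(2))

# Any density operator rho then gives probabilities p(E) = tr(rho E).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])  # Hermitian, PSD, trace one
probs = [np.trace(rho @ E).real for E in effects]
assert all(p >= 0 for p in probs)
assert np.isclose(sum(probs), 1.0)
```

The POVM Gleason theorem says the converse: every consistent assignment of probabilities to effects, in any dimension (including two), has this trace form for some $\rho$.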

I have a sneaking suspicion that a good many other attempted “derivations of the Born rule” really amount to little more than burying Gleason’s assumptions under a heap of dubious justifications. MGM don’t quite do that; what they present is more interesting.

They start with what they consider the “standard postulates” of quantum mechanics, which in their reckoning are five in number. Then they discard the last two and replace them with rules of a more qualitative character. Their central result is that the discarded postulates can be re-derived from those that were kept, plus the more qualitative-sounding conditions.

MGM say that the assumptions they keep are about state space, while the ones they discard are about measurements. But the equations in the three postulates that they keep could just as well be read as assumptions about measurements instead. Since they take measurement to be an operationally primitive notion — fine by me, anathema to many physicists! — this is arguably the better way to go. Then they add a postulate that has the character of noncontextuality: The probability of an event is independent of how that event is embedded into a measurement on a larger system. So, they work in the same setting as Gleason (Hilbert space), invoke postulates of the same nature, and arrive in the same place. The conclusion, if you take their postulates about complex vectors as referring to measurement outcomes, is that “preparations” are dual to outcomes, and outcomes occur with probabilities given by the Born rule, thereupon turning into new preparations.

Let’s treat this in a little more detail.

Here is the first postulate of what MGM take to be standard quantum mechanics:

To every physical system there corresponds a complex and separable Hilbert space $\mathbb{C}^d$, and the pure states of the system are the rays $\psi \in {\rm P}\mathbb{C}^d$.

We strike the words “pure states” and replace them with “sharp effects” — an equally undefined term at this point, which can only gain meaning in combination with other ideas later.

(I spend at least a little of every working day wondering why quantum mechanics makes use of complex numbers, so this already feels intensely arbitrary to me, but for now we’ll take it as read and press on.)

MGM define an “outcome probability function” as a mapping from rays in the Hilbert space $\mathbb{C}^d$ to the unit interval $[0,1]$. The abbreviation OPF is fine, but let’s read it instead as *operational preparation function.* The definition is the same: An OPF is a function ${\bf f}: {\rm P}\mathbb{C}^d \to [0,1]$. Now, though, it stands for the probability of obtaining the measurement outcome $\psi$, for each $\psi$ in the space ${\rm P}\mathbb{C}^d$ of sharp effects, given the preparation ${\bf f}$.

All the properties of OPFs that they invoke can be justified equally well in this reading. If ${\bf f}(\psi) = 1$, then event $\psi$ has probability 1 of occurring given the preparation ${\bf f}$. For any two preparations ${\bf f}_1$ and ${\bf f}_2$, we can imagine performing ${\bf f}_1$ with probability $p$ and ${\bf f}_2$ with probability $1-p$, so the convex combination $p{\bf f}_1 + (1-p){\bf f}_2$ must be a valid preparation.

And, given two systems, we can imagine that the preparation of one is ${\bf f}$ while the preparation of the other is ${\bf g}$, so the preparation of the joint system is some composition ${\bf f} \star {\bf g}$. And if measurement outcomes for separate systems compose according to the tensor product, and this $\star$ product denotes a joint preparation that introduces no correlations, then we can say that $({\bf f} \star {\bf g})(\psi \otimes \phi) = {\bf f}(\psi) {\bf g}(\phi)$. Furthermore, we can argue that the $\star$ product must be associative, ${\bf f} \star ({\bf g} \star {\bf h}) = ({\bf f} \star {\bf g}) \star {\bf h}$, and everything else that the composition of OPFs needs to satisfy in order to make the algebra go.

Ultimately, the same math has to work out, after we swap the words around, because the final structure is self-dual: The same set of rays ${\rm P}\mathbb{C}^d$ provides the extremal elements both of the state space and of the set of effects. So, if we take the dual of the starting point, we have to arrive in the same place by the end.

But is either choice of starting point more *natural*?

I find that focusing on the measurement outcomes is preferable when trying to connect with the Bell and Kochen–Specker theorems. In the former, the “preparation” is fixed, while in the latter, it can be arbitrary, but either way, we don’t say much *interesting* about it. The action lies in the choice of measurements, and how the rays that represent one measurement can interlock with those for another. So, from that perspective, putting the emphasis on the measurements and then deriving the state space is the more “natural” move. It puts the mathematics in conceptual and historical context.

That said, on a deeper level, I don’t find either choice all that compelling. To appreciate why, we need only look again at that arcane symbol, ${\rm P}\mathbb{C}^d$. That is the setting for the whole argument, and it is completely opaque. Why the complex numbers? Why throw away an overall phase? What is the meaning of “dimension,” and why does it scale multiplicatively when we compose systems? (A typical justification for this last point would be that if we have $n$ completely distinct options for the state of one system, and we have $m$ completely distinct options for the state of a second system, then we can pick one from each set for a total of $nm$ possibilities. But what are these options “completely distinct” with respect to, if we have not yet introduced the concept of measurement? Why should dimension be the quantity that scales in such a nice way, if we have no reason to care about vectors being orthogonal?) All of this cries out for a deeper understanding.