$$

\vec{p} = \vec{\nabla} S \, .

$$

We don’t know what $S$ is, except that it’s a field whose value presumably depends upon position and time, and it ought to satisfy some equation. Can we find an equation for it?

Well, from mechanics we know that the time derivative of the momentum is the force. And in the absence of friction, we can write the force as minus the gradient of the potential energy, which feels relevant here:

$$

\frac{d\vec{p}}{dt} = -\vec{\nabla} V \, .

$$

Combining these equations, we get

$$

\frac{d}{dt} \vec{\nabla} S = -\vec{\nabla} V \, .

$$

But wait! A particle riding along in a field sees *two* contributions to the change in the field value it experiences. If the particle is sitting still, then the field value at its current spot will change if the field itself changes. But a *moving* particle can also just move over to a spot where the field value is different, even if the field is constant over time. We learn how to handle this in hydrodynamics, by splitting the derivative into two terms, one of which carries a velocity dependence:

$$

\frac{d}{dt} \vec{\nabla} S

= \frac{\partial}{\partial t} (\vec{\nabla} S)

+ (\vec{v}\cdot\vec{\nabla}) \vec{\nabla} S \, .

$$

And *this* is what equals minus the gradient of the potential.

(Where does that dot product $\vec{v}\cdot\vec{\nabla}$ come from? Well, imagine we’re in one dimension, moving through a field $f$. In a tiny interval of time $dt$, we move a distance $v\,dt$, and over this distance, the field value changes by an amount $(v\,dt)(\partial f/\partial x)$. Extending this to three dimensions gives the expression above.)
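For the skeptical, this splitting is easy to check symbolically. A minimal sympy sketch, using a sample field and a straight-line trajectory that I've made up purely for illustration, confirms that the total derivative along the path equals the partial-plus-convective combination:

```python
import sympy as sp

t, v, x = sp.symbols('t v x')

# A sample one-dimensional field and trajectory, chosen arbitrarily
# for illustration; any smooth choices would do.
f = sp.sin(x) * sp.exp(-t)   # field value f(x, t)
traj = v * t                 # particle position x(t) = v t

# Total time derivative of the field as seen by the moving particle...
total = sp.diff(f.subs(x, traj), t)

# ...equals the partial time derivative plus the convective term v*df/dx,
# both evaluated at the particle's current position.
split = (sp.diff(f, t) + v * sp.diff(f, x)).subs(x, traj)

assert sp.simplify(total - split) == 0
```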

Now, we can write the $\vec{v}$ here as the momentum divided by the mass $m$, and we have declared that the momentum is the gradient of our mystery field. Therefore,

$$

\frac{\partial}{\partial t}(\vec{\nabla} S)

+ \frac{1}{m} (\vec{\nabla}S \cdot \vec{\nabla})

\vec{\nabla} S

= -\vec{\nabla} V \, .

$$

Let’s exchange the order of differentiation on that first term:

$$

\vec{\nabla} \left(\frac{\partial}{\partial t} S\right)

+ \frac{1}{m} (\vec{\nabla}S \cdot \vec{\nabla})

\vec{\nabla} S

= -\vec{\nabla} V \, .

$$

It looks almost like each term can be the gradient of something. So, we stare at the one that isn’t for a while, until we remember from calculus that

$$

\frac{d}{dx} \left[\left(\frac{df}{dx}\right)^2\right]

= 2 \frac{df}{dx} \frac{d^2 f}{dx^2} \, .

$$

By the same logic, we find that our equation for $S$ is what we’d get by taking the gradient of both sides of

$$

\frac{\partial S}{\partial t}

+ \frac{1}{2m} (\vec{\nabla}S)^2 = -V \, .

$$

Or, isolating the time derivative and flipping the signs to give it a more conventional presentation,

$$

-\frac{\partial S}{\partial t}

= \frac{1}{2m}(\vec{\nabla} S)^2 + V \, .

$$
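In one dimension, a computer-algebra system can confirm the step we just took: differentiating this equation with respect to $x$ reproduces the momentum equation we started from. A minimal sympy sketch, with generic functions standing in for $S$ and $V$:

```python
import sympy as sp

x, t, m = sp.symbols('x t m', positive=True)
S = sp.Function('S')(x, t)
V = sp.Function('V')(x)

# The 1D Hamilton-Jacobi equation, moved onto one side:
hj = sp.diff(S, t) + sp.diff(S, x)**2 / (2 * m) + V

# Its x-derivative should reproduce the equation for the momentum p = dS/dx:
momentum_eq = (sp.diff(S, x, t)
               + sp.diff(S, x) * sp.diff(S, x, x) / m
               + sp.diff(V, x))

assert sp.expand(sp.diff(hj, x) - momentum_eq) == 0
```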

The *Hamilton–Jacobi equation* just generalizes this! We can write it as

$$

-\frac{\partial S}{\partial t} = H(\vec{x}, \vec{\nabla} S, t) \, .

$$

Another common notation is to write $q$ for position instead of $\vec{x}$, suppressing the little arrow and just remembering that $q$ can be a whole list of coordinates (for multiple particles, potentially). Then

$$

-\frac{\partial S}{\partial t}

= H\left(q, \frac{\partial S}{\partial q}, t \right) \, ,

$$

with

$$

p = \frac{\partial S}{\partial q} \, .

$$
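As a small concrete check (my choice of example, not part of the discussion above), here is the textbook complete integral for a free particle, where $H = p^2/2m$; the separation constant $p_0$ plays the role of the conserved momentum:

```python
import sympy as sp

q, t = sp.symbols('q t')
p0, m = sp.symbols('p0 m', positive=True)

# Hamilton's principal function for a free particle, H = p^2 / (2m).
# The separation constant p0 is the conserved momentum.
S = p0 * q - p0**2 * t / (2 * m)

p = sp.diff(S, q)   # p = dS/dq
assert p == p0
# -dS/dt should equal H(q, dS/dq) = p^2 / (2m):
assert sp.simplify(-sp.diff(S, t) - p**2 / (2 * m)) == 0
```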

This discussion was inspired by a 1965 paper of Nathan Rosen. We have actually followed Rosen’s logic in reverse, starting with the familiar instead of justifying the unfamiliar by transforming it until it looks recognizable. A more conventional treatment can be found in José and Saletan’s *Classical Dynamics: A Contemporary Approach* (2012), section 6.1.

OK, what is its free replacement?

A variant on this question: How much of the MIT undergraduate physics curriculum can be taught with free books? The only reasonable answer would be *all of it,* because we’ve had the Web for 30 years now. Sadly, the textbook business is not reasonable.

If people had decided to be useful at any point in the past generation, you could go to physics.mit.edu and click to download all-the-textbooks-you-need.tgz, but we got MOOCs instead. Not to mention the “open courseware” that too much of the time is just a stack of PowerPoints. Oh, and software that puts kids under surveillance so that a company can monetize their behavior. Because that’s the future we deserved, right?

There *are* books out there, but they peter out after you get past the first year or so, and a lot of what exists is pitched either too low or too high. Either there are a few chapters in a big “university physics” kind of volume that wouldn’t be enough to fill a whole semester, or there’s a substantial text that’s intended for graduate students. Plenty of times, one finds a totally decent set of lecture notes that whiffs at the last step by not incorporating homework problems. If we really want institutional change, we need (among other things) more drop-in replacements for the books to which physicists habitually turn, so that we can overcome the force of tradition.

In what follows, I go through the MIT course catalogue and provide links and commentary.

**Mechanics I: “Newtonian” physics**

Yes, yes, $F = ma$ was from Euler, not Newton, the concept of energy wasn’t really codified until du Châtelet, etc. Calling the subject “Newtonian” mechanics is an oversimplification, but a readily understood one.

- Moebs, Ling, Sanny et al., *University Physics,* volume 1 (OpenStax).
- Feynman, Leighton and Sands, *The Feynman Lectures on Physics,* volume 1. These lectures have a reputation for being difficult to learn from if you’re encountering the subject for the first time, but inspirational if you have some grasp of it already. I personally suspect that their reputation for difficulty is partly undeserved. The troubles Caltech had with the course at the beginning sound like the start-up challenges of every course I’ve ever taken that was being offered for the first time. The big problem, I think, is that the exercises are not smoothly integrated into the text and don’t provide a manageable gradation from easy and bite-sized to demanding and hearty. Of course, much of what was “state of the art” in 1964 is so no longer, and if you’re burnt out on the Feynman-industrial complex, I can appreciate that too.
- Chakrabarty et al., 8.01 (MIT OpenCourseWare).

The more accelerated version of this course, intended for students who had a stronger math background going in, reached topics that the OpenStax book doesn’t. In particular, it could go as far as solving the Kepler problem, which the *Feynman Lectures* themselves don’t do.

- Egan, Conic Section Orbits.
- Goodstein et al., *The Mechanical Universe,* episode 22: The Kepler Problem.

**Calculus I: Single-variable**

- Thompson, *Calculus Made Easy*. This covers, I believe, everything that “calculus-based” first-year mechanics requires. It’s maybe half of what MIT’s introductory semester of calculus includes on the advanced track, and somewhat more than half of a less ambitious course with the same name.
- Strang, Herman et al., *Calculus,* volumes 1 and 2 (OpenStax).
- Keisler, *Elementary Calculus: An Infinitesimal Approach*.

**Electromagnetism**

In my day, this was taught out of Purcell, as opposed to the more advanced course that one could take as an elective, which was based on Griffiths with inflections of Jackson. A drop-in replacement with that level and scope still seems hard to come by.

- Moebs, Ling, Sanny et al., *University Physics,* volume 2 (OpenStax). Good so far as it goes, but it lacks the differential version of the Maxwell equations and the introduction to special relativity that we got.

**Calculus II: Multivariable**

- Strang, Herman et al., *Calculus,* volume 3 (OpenStax).
- Cain and Herod, *Multivariable Calculus*.

**Waves and Vibrations**

- Lee and Georgi, 8.03sc (MIT OpenCourseWare).

**Relativity**

- Moebs, Ling, Sanny et al., *University Physics,* volume 3 (OpenStax). The one chapter on the topic is fine, as an appetizer.
- Mermin, From Einstein’s 1905 Postulates to the Geometry of Flat Space-Time.
- Taylor and Wheeler, *Spacetime Physics,* second edition.

**Differential Equations**

- Terrell, *Notes on Differential Equations*.

**Quantum I**

The first semester of quantum physics was mostly an attempt to teach the subject without linear algebra. If you made it out knowing how to solve the particle-in-a-box and the harmonic oscillator, you were on track.

- Zwiebach, 8.04 (MIT OpenCourseWare).
- Feynman, Leighton and Sands, *The Feynman Lectures on Physics,* volume 3. Oddly, given the reputation for being intimidating, the background presumed by volume 3 puts it more at this level than the second or third semesters of quantum mechanics that follow. However, the lack of exercises relegates this to supplemental reading.

**Statistical I**

What I think we need here is an OA counterpart roughly comparable to, e.g., *Finn’s Thermal Physics.* The online resources I’ve managed to turn up so far have pitched at a higher level than the spot we’d need to fill here. I recently taught a thermo course for undergrads in the early part of a physics major, and a lot of online lecture notes seem to rush through in a single chapter what we spent half the semester on.

- Moebs, Ling, Sanny et al., *University Physics,* volume 2 (OpenStax). Fine for a first encounter with macroscopic, equilibrium thermodynamics, but it stops before the topic of free energy and is lacking on the statistical-physics perspective.

**Quantum II**

- Zwiebach, 8.05 (MIT OpenCourseWare).
- Mermin, Hidden variables and the two theorems of John Bell. Not a book, but a review article that can probably be appreciated at this level.

**Laboratory I**

Good grief, this was brutal. And not for any particularly good reason. On top of the intrinsic difficulties, like lab equipment never working right and having to master a new topic every three weeks as you went from one experiment to the next, this was also most students’ first encounter with the “job skills” of physics: fitting curves to data, doing a literature review, giving a technical talk, writing a technical paper, etc.

A lab course might not have a textbook at all, per se. Instead, most of the reading will be historical papers about whatever pivotal experiment you are trying to replicate, along with manuals for the equipment you are struggling to use.

More than anywhere else on this list, this is where the openness of *software* becomes a concern. Locking students into doing data analysis with proprietary tools is, on the whole, a bad move. (And if the extent of the instruction is “here’s some MATLAB, good luck”, doing the same but with Python instead is no less friendly.)

**Statistical II**

- Sethna, *Statistical Mechanics: Entropy, Order Parameters, and Complexity*.
- Likharev, Part SM: Statistical Mechanics. This is officially targeted at graduate students, but portions may be useful at the advanced undergraduate level.

**Laboratory II**

I did better in the second semester of Junior Lab than I did in the first, partly because we had more time for each experiment, but mostly because it took me the whole first semester to figure out what the heck I was doing.

**Mechanics II**

This is the first encounter with Lagrangian and Hamiltonian methods. I am still looking for a reference on this material at a suitable level that I’m really happy with, but David Tong’s lecture notes come close.

**Mechanics III**

- Stewart, 8.09 (MIT OpenCourseWare). The Hamilton–Jacobi language is introduced here in the standard way, i.e., presuming a fairly in-depth knowledge of canonical transformations. Students who have already had to master two whole formalisms are generally too tired for this. However, the path can be eased slightly by explaining how to go directly from the “Newtonian” to the Hamilton–Jacobi presentation.

**Math Elective**

The physics department requires each student to wander over to the mathematics department for one elective beyond differential equations. Most students fill this slot with linear algebra, I believe, though I went with real analysis for some reason.

- Matthews, *Elementary Linear Algebra*.
- Hefferon, *Linear Algebra*.
- Beezer, *A First Course in Linear Algebra*.

**Quantum III**

- Harrow, 8.06 (MIT OpenCourseWare).

**E&M II**

- Fitzpatrick, Classical Electromagnetism: An intermediate level course. These lecture notes are split across many small pages, which may be exactly what some readers who aren’t me want. Clear, but terse: the words connecting the equations are kept to a minimum, but not reduced beyond that. Lacks exercises (unless, like me, you have to verify that everything which “follows easily” really does follow easily).
- Feynman, Leighton and Sands, *The Feynman Lectures on Physics,* volume 2. More demanding than volume 3 in some respects. All the caveats mentioned above apply here, too.

`ALL THESE BOOKS ARE YOURS`

EXCEPT EUROPA

USE THEM TOGETHER

USE THEM IN PEACE

It’s a memoir by someone who just doesn’t want or like sex all that much. (Representative dialogue from page 138: “I think I’m asexual.” “You can’t be, I’ve seen you lust after other people.” “Well. Yeah. But not very often and I don’t enjoy it.”) Oh, noes, three panels of mostly-clothed fooling around by two people in an affectionate, monogamous relationship that ends with them deciding that the activity was hotter in the anticipation than the actuality. That’s roughly one one-billionth as steamy as anything Famke Janssen says or does in *GoldenEye.*

I’ve taught this to college students, after first reviewing how complex numbers work and some basics about how to manipulate matrices — adding them, multiplying them, taking the trace and the determinant, what eigenvalues and eigenvectors are.

For our own purposes, our next step will be to develop the framework in which we can consider multiple qubits together. It might not seem obvious now, but a good way to make progress is to combine our three expected values $(x,y,z)$ into a matrix, like so:

$$ \rho = \frac{1}{2} \begin{pmatrix} 1 + z & x - iy \\ x + iy & 1 - z \end{pmatrix} \, . $$

This matrix has some nice properties of the sort that we can generalize to *bigger* matrices. For example, its trace is 1, which feels kind of like how a list of probabilities sums up to 1. Meanwhile, the determinant is the pleasingly Pythagorean quantity

$$ \det\rho = \frac{1}{4}(1 - x^2 - y^2 - z^2) \, . $$

This will be nonnegative for all the valid preparation points. So, the product of the two eigenvalues of $\rho$ will be positive for every point in the interior; we can only get a zero eigenvalue by picking a point on the surface. Using the trace and the determinant, we can find the eigenvalues thanks to a nifty application of the quadratic formula:

$$ \lambda_\pm = \frac{\mathrm{tr}\rho \pm \sqrt{(\mathrm{tr}\rho)^2 - 4\det\rho}}{2} = \frac{1}{2}\left(1 \pm \sqrt{x^2 + y^2 + z^2}\right) \, . $$

And indeed, this will always give us positive real numbers, except on the surface of the Bloch ball where the $\lambda_+$ solution is 1 while the $\lambda_-$ solution is 0. Requiring that a matrix’s eigenvalues be nonnegative is another property we can generalize.
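These trace, determinant and eigenvalue claims are easy to spot-check numerically. A quick numpy sketch, with an interior point $(x,y,z)$ picked arbitrarily for illustration:

```python
import numpy as np

# An arbitrary interior point of the Bloch ball (my choice for illustration).
x, y, z = 0.3, -0.4, 0.5
rho = 0.5 * np.array([[1 + z, x - 1j * y],
                      [x + 1j * y, 1 - z]])

r = np.sqrt(x**2 + y**2 + z**2)   # distance from the center, here < 1

assert np.isclose(np.trace(rho).real, 1.0)
assert np.isclose(np.linalg.det(rho).real, 0.25 * (1 - r**2))

# eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
lam_minus, lam_plus = np.linalg.eigvalsh(rho)
assert np.isclose(lam_plus, 0.5 * (1 + r))
assert np.isclose(lam_minus, 0.5 * (1 - r))
```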

Another interesting thing happens if we take the square of $\rho$:

$$ \rho^2 = \frac{1}{4} \begin{pmatrix} 1 + x^2 + y^2 + z^2 + 2z & 2x - 2iy \\ 2x + 2iy & 1 + x^2 + y^2 + z^2 - 2z \end{pmatrix} \, . $$

If the point $(x,y,z)$ is on the surface of the sphere, then $\rho^2 = \rho$. This will turn out to be a way to characterize the extreme elements in our set of valid preparations, no matter how big we make our matrices.
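The idempotence claim can be spot-checked numerically too; here is a quick numpy sketch for a surface point, with angles chosen arbitrarily:

```python
import numpy as np

# A point on the surface of the Bloch ball, parametrized by angles
# (arbitrary values, chosen for illustration).
theta, phi = 1.1, 0.7
x = np.sin(theta) * np.cos(phi)
y = np.sin(theta) * np.sin(phi)
z = np.cos(theta)

rho = 0.5 * np.array([[1 + z, x - 1j * y],
                      [x + 1j * y, 1 - z]])

# On the surface, x^2 + y^2 + z^2 = 1 and rho squares to itself.
assert np.allclose(rho @ rho, rho)
```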

This gets us almost to the point of being able to do the quantum math for the parable of the muffins.

“OK,” you might say, while wondering what the big deal is.

“In fact, I am going to measure *all speeds* as the time it takes to travel a standard unit of distance.”

“Uh, hold on.”

“And this means that, contrary to what you learned in Big University, zero is not a speed! Because the right way to think of speed is the time it takes to travel 1 standard distance unit, and an object that never moves never travels.”

Now, you might try to argue with me. You could try to point out all the things that my screwy definition would break. (For starters, I am throwing out everything science has learned about inertia.) You could try showing examples where scientists I have praised, like Feynman or whoever, speak of “a speed equal to zero”. When all that goes nowhere and I dig in further with every reply, you might justifiably conclude that I am high on my own supply, in love with my own status as an iconoclast. Because that is my *real* motivation, neither equations nor expertise will sway me.

Yes, it’s time for another installment in my occasional series, *Friends Don’t Let Friends Learn Topic X from Eliezer Yudkowsky.* For those who don’t know, Yudkowsky is an autodidact and fanfiction writer who, like E. L. James, portrays insufferable characters as admirable and thereby gives the whole medium a bad name. Unlike James, he also fills his work with bad science. Because he scratches an emotional itch for people enamored of the idea that they are above emotion, he has become influential in circles you would rather avoid.

Among many other things that Yudkowsky has famously attempted to explain is the concept of probability. The bit I want to zoom in upon today is the time that he argued that 0 and 1 are not probabilities. He grounds this headscratcher in the statement that you can’t turn a probability of 1 into a ratio by the function $f(p) = p/(1-p)$, because you’d be dividing by 0. This and everything that followed is just getting high off his own supply. One could try showing how he presumes his own conclusion. One could try showing how he breaks the basic idea that probabilities by their nature add up to 100% (given an event *E*, what can Yudkowsky say is the probability of the event *E*-or-not-*E*?). One could even observe that the same E. T. Jaynes he praises in that blog post uses 1 as a probability, for example in Chapter 2 of *Probability Theory: The Logic of Science* (Cambridge University Press, 2003). If you really want to cite someone he admires, you could note that Eliezer Yudkowsky uses 1 as a probability when trying (and failing) to explain quantum mechanics, because he writes probability amplitudes of absolute value 1.

As an academic, I have to hold myself back from developing all those themes and more. But the additional wrongness that comes in when he turns to quantum mechanics is worth pausing to comment upon.

Yudkowsky loves to go on about how *the map is not the territory,* to the extent that his fandom thinks he coined the phrase, but he is remarkably terrible at understanding which is which. Or, to be a little more precise, he is actively uninterested in appreciating that the question of what to file under “map” versus “territory” is one of *the* big questions that separate the different interpretations of quantum mechanics. He has his desired answer, and he argues for it by assertion.

He’s also just ignorant about the math. Stepping back from the details of what he gets wrong, there are bigger-picture problems. For example, he points to a complex number and says that it can’t be a probability because it’s complex. True, but so what? The Fourier transform of a sequence of real numbers will generally have complex values. Just because one way of expressing information uses complex numbers doesn’t mean that every perspective on the problem has to. And, in fact, what he tries to do with two complex numbers — one amplitude for each path in an interferometer — you can actually do with three real numbers. They can even be probabilities, say, the probability of getting the “yes” outcome in each of three yes/no measurements. The quantumness comes in when you consider how the probabilities assigned to the outcomes of different experiments all fit together. If probabilities are, as Yudkowsky wants, always part of the “map”, and a wavefunction is mathematically equivalent to a set of probabilities satisfying some constraint, then a wavefunction belongs in the “map”, too. You can of course argue that *some* probabilities are “territory”; that’s an argument which smart people have been having back and forth for decades. But that’s not what Yudkowsky does. Instead, through a flavor swirl of malice and incompetence, he ends up being too much a hypocrite to “steelman” the many other narratives about quantum mechanics.

The one thing in Thompson’s presentation that I didn’t particularly like is how he introduces derivatives of trig functions. It presumes that the reader has a lot of trig identities in their back pocket, and it makes a simplification that is hard to justify without going into limits, a topic that Thompson doesn’t explicitly teach. I’ve tried my hand at a replacement that appeals to the way he *does* teach.

Further modifications may come as people apprise me of all the things I missed. I do wish to keep it short and sweet, rather than adding multiple new chapters.

**EDIT TO ADD (12 March 2024):** I had a Lulu.com account from a print-on-demand project ages ago, and poking around didn’t find any obviously better options, so I ordered some copies from there. I deem them good enough and have made the project available for purchase at cost.

As I have written before, it is very difficult to provide substantive criticism of a “theory” that has no substance. I could point to individual things that make no sense, but the people who care don’t need my help, and the people who don’t won’t be convinced by anything I say. (“I asked ChatGPT to summarize the paper, and I found the results quite inspirational!”) I could try to provide a little media literacy, like *feel free to ignore any science “news” that’s just a press release from the guy who made it up.* But again, if you’re thirsty for something else, that will hardly satisfy. (“Reality is all on the blockchain, buy GameStop!”)

I promise this is going somewhere.

A certain bakery has a special deal on muffins. They sell mystery boxes for those who like to live dangerously: mix-and-match sets of three muffins apiece. Each day, Alice, Bob and Charlie buy a mystery box together, and each day, Alice, Bob and Charlie take one muffin apiece back to their respective laboratories for analysis. They each have two testing devices — say, a device that can test whether a muffin is positive for dairy, and another device that can test whether it is positive for tree nuts. We’ll call these $X$ tests and $Y$ tests for short. Each day, Alice chooses either to do an $X$ test or a $Y$ test. Bob likewise chooses, independently of Alice, and so does Charlie. Importantly, each muffin can only be tested *once.* Maybe the test destroys the muffin, or maybe it takes so long to do one that they eat their muffins immediately afterward. Whatever the rationale, one test per muffin — that’s a rule of the parable.

We can write what they choose to do in a compact way. For example, if all three of them choose to do the $X$ test on their respective muffins, we’ll write $A_X B_X C_X$. If Bob and Charlie choose to do the $Y$ test but Alice goes instead with the $X$ test, we’ll write $A_X B_Y C_Y$. And so on. We can also write the results compactly, using $+1$ to stand for a positive result and $-1$ to stand for a negative one. (We could also record the outcomes with zeros and ones, or with trues and falses, greens and blues, etc. Using $+1$ and $-1$ is just a notation that will turn out to be helpful in a moment.) So, for example, if Alice chooses $Y$, Bob chooses $X$ and Charlie goes with $Y$, the results might be $(+1, -1, -1)$. Or they might be $(+1, +1, +1)$, or perhaps $(-1, +1, -1)$.

Over many days of muffin investigation, comparing their notes, they find a dependable pattern. Whenever *two of them* choose to do the $Y$ test, then the *product of their results* is always $+1$. The specific outcome varies randomly from day to day, but there’s never only one $-1$, and they never get all three results being $-1$. From this pattern, they can draw a couple conclusions. First, once two of them obtain their results, the result of the third is predictable. Let’s say their choices are $A_Y B_X C_Y$, as in the previous example, and both Alice and Bob get the result $+1$. Then we can predict that Charlie will get $+1$, because that’s the only way the product of the three numbers can be $+1$. Or, suppose their choices are $A_Y B_Y C_X$, and both Bob and Charlie get a $-1$. The two of them report their results and wait for Alice. Knowing Bob and Charlie’s results, we can predict that Alice will report a $+1$ outcome, because that’s the only way the product of the three outcomes is $+1$. A minus times a minus makes a plus, and so a third minus would spoil the plus.

If we had used a different notation for the outcomes, like “green” and “blue” instead of $+1$ and $-1$, then we could express this pattern by saying that whenever two of them choose to do the $Y$ test, an even number of the results will be blue.

Now, we deduce something else from the pattern. We can make a prediction about what happens under very different conditions. What about the days when all three choose to measure $X$?

We’ve used capital letters to write their choices. Let’s use lowercase letters to denote the *properties* of the muffins being measured. An $A_X$ measurement — that is, Alice doing the $X$ test — uncovers the value of $a_X$. Likewise, Bob doing the $Y$ test reveals the $b_Y$ muffin property, and so on. The pattern so far is that

$$ a_X b_Y c_Y = +1, $$

and

$$ a_Y b_X c_Y = +1, $$

and

$$ a_Y b_Y c_X = +1. $$

Here comes the neat trick. We multiply all three of these facts together.

$$ (a_X b_Y c_Y)(a_Y b_X c_Y)(a_Y b_Y c_X) = +1. $$

We rearrange the variables, putting like with like:

$$ a_X b_X c_X a_Y^2 b_Y^2 c_Y^2 = +1.$$

On any given day, we don’t know the value of $a_Y$ or $b_Y$ or $c_Y$ before the test, and if Alice chooses $X$ instead of $Y$, then we’ll never learn $a_Y$ for that day, and likewise for Bob and Charlie. But the values do have to be waiting there, don’t they? Each has to be either $+1$ or $-1$, even if we never learn which: That’s just a fact determined at the bakery.

The square of $-1$ is the same as the square of $+1$, so on every day,

$$ a_Y^2 = b_Y^2 = c_Y^2 = +1. $$

Therefore, we can drop all the $Y$ factors from our previous equation!

$$ a_X b_X c_X = +1. $$

And now we have a prediction: on those days when all three independently choose to do the $X$ test, the product of their answers will be

$$ A_X B_X C_X = +1. $$

This follows from the dependability of the pattern found on the two-$Y$-test days and the basic assumption that the tests are testing properties intrinsic to the muffins, properties waiting to be found. The tests don’t have to be for dairy and tree nuts — any properties will do. The objects being tested don’t have to be muffins, either. Anything that can come in sets of three is suitable. From the dependable pattern in the two-$Y$-test events, we can draw a conclusion about the triple-$X$-test events. The product of the $X$ results is going to be $+1$.
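In fact, the whole argument can be brute-forced: enumerate every way of assigning $+1$ or $-1$ to the six hidden properties, and check that the two-$Y$-test pattern forces the all-$X$ product. A short Python sketch:

```python
from itertools import product

# Enumerate every +/-1 assignment to the six hidden muffin properties.
# Whenever the three two-Y-test constraints hold, the all-X product
# is forced to be +1, exactly as the algebra above says.
for a_X, a_Y, b_X, b_Y, c_X, c_Y in product([+1, -1], repeat=6):
    if (a_X * b_Y * c_Y == +1 and
            a_Y * b_X * c_Y == +1 and
            a_Y * b_Y * c_X == +1):
        assert a_X * b_X * c_X == +1
```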

How general this result is! From the one pattern, we deduce the other, whether we find that first pattern in muffins, cookies, apples, sand grains…

Electrons? Photons?

Ay, there’s the rub.

It is possible to make triplets, not of baked goods but of subatomic particles, that fit the two-$Y$-test pattern. After doing the $Y$ test on the first two particles, for example, the result of an $X$ test is predictable, using the same rule we described above.

*But the prediction about what happens when each particle gets the $X$ test does not hold.*

The rules of quantum mechanics imply that there is a way an experimenter can prepare triplets of particles such that they should predict that the product of an $X$ result and two $Y$ results will always be $+1$, while the product of three $X$ results will always be $-1$.

By rights, this ought to be impossible. But the imagination of nature is more subtle than our own!

Doing the calculation to get the quantum-mechanical answer is not all that hard. It’s a spot of matrix algebra that’s doable by hand and doesn’t require knowing much more than what an “eigenvector” is. (The readings below have more details.) The much more difficult part is knowing what to make of it!
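For the curious, here is a numpy sketch of that spot of matrix algebra. I use the three-qubit state $(|000\rangle - |111\rangle)/\sqrt{2}$, a sign convention chosen so the products come out as described (the opposite sign gives the mirror-image pattern):

```python
import numpy as np

# Pauli matrices: the X and Y tests on a single qubit.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def kron3(a, b, c):
    """Tensor product of three single-qubit operators."""
    return np.kron(np.kron(a, b), c)

# The state (|000> - |111>)/sqrt(2), in the computational basis.
psi = np.zeros(8, dtype=complex)
psi[0] = 1 / np.sqrt(2)
psi[7] = -1 / np.sqrt(2)

for ops, name in [((X, Y, Y), 'XYY'), ((Y, X, Y), 'YXY'),
                  ((Y, Y, X), 'YYX'), ((X, X, X), 'XXX')]:
    M = kron3(*ops)
    # psi is an eigenvector of each product operator, so the product of
    # the three measurement outcomes equals the eigenvalue on every run.
    val = (psi.conj() @ M @ psi).real
    print(name, round(val))
```

The three one-$X$-two-$Y$ expectations come out $+1$ while the all-$X$ expectation comes out $-1$, which is the clash with the muffin reasoning.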

The predictions of quantum mechanics — which have been checked in the lab, directly and indirectly, and found to work superbly — clash with the basic and seemingly bulletproof calculation we have done here. How can that calculation fail to apply? Where is the gap that means one pattern doesn’t have to imply the other? If an argument whose fundamental premise is that *measurements record a property of the thing being measured* is inconsistent with quantum physics, our spectacularly successful guide to living in reality, then what does that say about reality?

Well, ask two rabbis and you’ll get three opinions. It’s not even clear what making progress on that kind of “what does it all mean?!” question looks like. If people disagree on the meaning but all use the same math to make the same predictions, on what grounds can we prefer one proposal about the meaning over another? Gut reaction? Preserving intuitions from classical physics? Remember, classical physics is what turned out to be wrong, so we had to invent quantum mechanics to fix it!

Perhaps one way to keep the question from going stale and vacuous is to find a new way of expressing quantum theory itself. After all, there’s no grand guarantee that the formulation of the theory which is good for finding the spectrum of atomic hydrogen (and all the other topics we traditionally assign to undergraduates each semester) is equally good for all questions. And one feature of the standard presentation is that the really weird stuff, such as what we’ve confronted today, tends to be buried several chapters in. When the books lay out “axioms” or “postulates” for quantum mechanics, blatant defiance of intuition about intrinsic properties isn’t among them. Instead, it is deduced as a consequence, deep into the algebra. But perhaps that is a historical accident, due to the way we got here and not to nature itself!

**READINGS**

The parable of the muffins is my attempt at rephrasing a thought-experiment by N. David Mermin, who learned of a *four*-particle scenario devised by Greenberger, Horne and Zeilinger and found that it could be significantly sharpened by going down to three.

- N. David Mermin, “Quantum mysteries revisited,” *American Journal of Physics* **58** (1990), 731–734. DOI:10.1119/1.16503.
- N. David Mermin, “Hidden variables and the two theorems of John Bell,” *Reviews of Modern Physics* **65** (1993), 803–815. arXiv:1802.10119.

For reasons that made sense at the time, I gathered all the homework problems together at the end into a chapter called, well, “Exercises”. And now I keep getting spam invitations to conferences and special issues of journals no one has ever heard of, asking me to share my pivotal work “in the field of Exercises”.

In a 1964 interview, the physicist Karl Darrow calls the story “impossible to check”. And in another interview, Robert Mulliken (not to be confused with Robert Millikan) shares the story of Lunn having “sent a paper to the Physical Review which was turned down and which anticipated the quantum mechanics”. Mulliken heard the story from the physical chemist William Draper Harkins. Similarly, Leonard Loeb told Thomas Kuhn that Lunn “was probably a misunderstood genius, and who was completely frustrated, because his one great paper with his one great idea was turned down by a journal”.

Lunn did apparently try to present what sounds like a grandiose paper (“Relativity, quantum theory, and the wave theories of light and gravitation”) at the American Physical Society meeting in April 1923, but his paper was only “read by title”. The abstract ran as follows:

This paper is a preliminary report on a theory originally sought in order to meet the recognized need for a reconciliation between wave theory and quantum phenomena; its scope of adaptation proves to be quite wide. It includes (1) a wave theory of gravitation in quantitative connection with optical, electronic, and radioactivity data; (2) a related general suggestion of a theory connecting molecular properties with properties of matter in bulk; (3) alternatives for some of the current features in the theories of atomic structure; (4) a new interpretation and deduction of formulas for series and band spectra, using in lieu of the quantum condition a substitute directly related to long familiar physical notions; (5) a modification of Lagrangian dynamics which promises to be of service in the study of complex atomic and molecular structures; (6) a non-quantum theory of specific heat and black radiation. Results so far reached deal mostly with problems approachable by elementary methods or approximate computations. A set of formulas has been obtained which yield computation of the electron constants $e$, $h$, $m$ and mass ratios, assuming from observation only the Rydberg constant, velocity of light, gravitation constant, and Faraday constant, with results in each case in practical agreement with measured values.

Darrow says, “I know that in 1924 he wanted to give a twenty or a thirty minute paper before the American Physical Society in Washington, but then authorities of the Society refused him more than ten minutes”.

Lunn’s abstract in the 1924 proceedings has a similar explain-everything atmosphere:

Relativity, the quantum phenomena, and a kinematic geometry of matter and radiation. A. C. LUNN, University of Chicago. The theory indicated in an earlier paper (Phys. Rev. 21, 711, 1923), has since been developed, extended in scope, and so ordered as to permit of treatment as a deductive space-time geometry. It unites the treatment of the quantum phenomena with the rest of physical theory in a way that yields to illustration by familiar physical images. It resolves into matters of choice a number of hitherto controversial alternatives in the interpretation of phenomena, and allows freedom of use of a range of concrete types of representation including many other concepts commonly discarded. Among special topics more recently found to affiliate with the scheme may be mentioned the Stark and Zeeman effects and fine structure, resonance potentials, and the intensity and distribution of general x-radiation. Improvements have been made in the setting of the formulas connecting $e$, $h$, and $m$ with pre-electron data. A program has emerged for the foundation of a trial mathematical chemistry by determination of types of atoms, valence, number of isotopes, atomic weights, and spectrum levels.

I can easily imagine a paper with that attempted scope being incomprehensible to whoever had the task of evaluating it, and so any really good morsels within it would have been lost.

**UPDATE (4 November):** I wrote to the *Physical Review* offices on the chance that they had more information and received this reply from Robert Garisto, the Managing Editor of *Physical Review Letters.*

Thank you for your query. Our records from the early 20th century are fragmentary. I am not sure if we have any from before 1930, much less a complete set that could answer your question.

But I see that Arthur C. Lunn published 7 papers in the Physical Review from 1912-1922. So he was a known author to the editors. Those were different times, and while it is possible that he submitted a paper that was rejected and never published elsewhere, for what it’s worth, it strikes me as unlikely.

* * *

I saw the Count lying within the box upon the earth, some of which the rude falling from the cart had scattered over him. He was deathly pale, just like a waxen image, and the red eyes glared with the horrible vindictive look which I knew too well.

As I looked, the eyes saw the sinking sun, and the look of hate in them turned to triumph.

But, on the instant, came the sweep and flash of Jonathan’s great knife. I shrieked as I saw it shear through the throat; whilst at the same moment Mr. Morris’s bowie knife plunged into the heart. And a voice rang out across the sky: “KUKRI AND BOWIE COMBO MOVE, MOTHERFUUUCKER!”

— Bram Stoker, more or less

Yudkowsky clearly intends to argue that the scientific community is broken and his brand of Rationalism(TM) is superior, but what he’s actually done is take all the weaknesses that physicists have when discussing quantum foundations and present them in a more concentrated form. There’s the accepting of whatever mathematical formulation you learn first as the ultimate truth, the reliance upon oversimplified labels and third-hand accounts rather than studying what the pioneers themselves wrote, the general unwillingness to get out of the armchair and go even so far as the library…

Let’s open with Yudkowsky’s “If Many Worlds Had Come First”, where a fake version of Hugh Everett trades places with a fake version of Niels Bohr. Now, to me this sounds like a bizarrely overcomplicated rhetorical exercise, but if it is to be done correctly, then the fictional Bohr should espouse the view of the historical Everett and vice versa. But because it’s showboating, that’s not what we get.

First, we get some bizarre revisionism:

Macroscopic decoherence, a.k.a. many-worlds, was first proposed in a 1957 paper by Hugh Everett III.

No, decoherence was introduced by Zeh in 1970. And it took another decade for the idea to take off, thanks to Zurek coming along with work that was (a) fairly interpretation-neutral in its presentation and (b) originally inspired by trying to clarify Bohr rather than dethrone him. Nowadays, the theory of decoherence is recognized as a calculational tool that anybody can use regardless of their preferred interpretation of quantum physics, because it’s just applying the standard math to a particular class of situations, and all interpretations agree on the standard math. Using “macroscopic decoherence” as a synonym for “many-worlds interpretation” makes no logical sense.

Then, we get a transparent attempt to make Everett look good while academia looks bad:

Crushed, Everett left academic physics, invented the general use of Lagrange multipliers in optimization problems, and became a multimillionaire.

The whole point of Lagrange multipliers was always “optimization problems” (minimizing or maximizing a functional in the calculus of variations is optimization); this should be “operations research” or “management science”. More fundamentally, though, one could with equal justice say that Everett lacked the temperament to argue for his ideas, antagonized and scorned those who most closely agreed with him, and died miserable. See what I did there?

It wasn’t until 1970, when Bryce DeWitt (who coined the term “many-worlds”) wrote an article for Physics Today, that the general field was first informed of Everett’s ideas.

False. Everett’s paper was discussed at the Chapel Hill conference in 1957, where Feynman came down pretty harsh on it. And in 1959, Everett met with more people than just Bohr when he visited Copenhagen. In 1962, Everett presented his interpretation at a conference at the Xavier University of Cincinnati, with prominent physicists like Wigner in attendance. And of course, all this is in addition to the fact that Everett’s ’57 paper was published in the *Reviews of Modern Physics,* one of the most prominent journals of the physics profession. Why didn’t more people care until the ’70s? (shrug) It answered no specific question about a concrete physics problem, and as noted above, quite possibly Everett himself was just not the man to sell it. My own impression of that paper was that it had enough places where it just assumed the math works out that it needed at least one more round of revision (to be clear about the problems, if not to solve them, since nobody has done that yet). But I too am judging it with the benefit of hindsight.

And suppose that no one had proposed collapse theories until 1957.

Bohr did not propose a collapse theory.

Now, I actually stumbled across “If Many Worlds Had Come First” tonight while looking for something else, Yudkowsky’s “explanation” of Bell’s theorem. It’s a muddle of percentages that make my eyes glaze over, and quantum information theory is *my job.* Why he does it that way, I have no idea. Bell’s original argument from 1964 is actually *easier* to follow, and Yudkowsky name-drops the GHZ state, so he seems to be *aware* of more recent developments that made the point even simpler. Perhaps he wanted to make the mathematics “elementary”, but by not using (Mermin’s improvement of) the GHZ argument, he brings in needless trig functions and introduces a whole heap of angles that look completely arbitrary. It’s a mess.
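The Mermin–GHZ argument mentioned above really is that much simpler: no trig functions, no arbitrary angles, just four joint measurements whose outcomes cannot all be matched by pre-assigned values. As a minimal sketch (using numpy; the state and observables are the standard three-qubit GHZ setup, not anything from Yudkowsky’s post), one can check the four expectation values directly:

```python
import numpy as np

# Pauli matrices and the three-qubit GHZ state (|000> + |111>)/sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)

def kron3(a, b, c):
    """Tensor product of three single-qubit operators."""
    return np.kron(np.kron(a, b), c)

for ops, label in [((X, X, X), "XXX"), ((X, Y, Y), "XYY"),
                   ((Y, X, Y), "YXY"), ((Y, Y, X), "YYX")]:
    val = np.real(ghz.conj() @ kron3(*ops) @ ghz)
    print(label, round(val))

# Quantum mechanics gives XXX = +1 while XYY = YXY = YYX = -1.
# But if each qubit carried pre-assigned values x_i, y_i = ±1, then
# multiplying the last three products gives x1*x2*x3 (each y_i appears
# twice and squares away), which would force XXX = (-1)^3 = -1.
# No assignment of intrinsic ±1 values satisfies all four at once.
```

The contradiction is exact and algebraic, which is why Mermin’s version lands harder than any argument built from statistical inequalities over a heap of angles.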

What does Bell’s Theorem plus its experimental verification tell us, exactly?

My favorite phrasing is one I encountered in D. M. Appleby: “Quantum mechanics is inconsistent with the classical assumption that a measurement tells us about a property previously possessed by the system.”

OK. Let’s dig in. That paper isn’t about Bell’s theorem; it’s about the *Bell–Kochen–Specker theorem,* another result in the same area also proved by Bell (and independently by the team of Kochen and Specker). It has a similar upshot, but its assumptions are more abstract and harder to justify physically.

But it gets better. In his very next paper, Appleby writes,

If I am asked to accept Bohr as the authoritative voice of final truth, then I cannot assent. But if his writings are approached in a more flexible spirit, as a source of insights which are not the less seminal for being obscure, they suggest some interesting questions. I do not know if this line of thought will be fruitful. But I feel it is worth pursuing.

Not quite the message that Yudkowsky would want to convey. But it was there for him to read, written in 2004, years before LessWrong even existed.

I’ll admit, I probably wouldn’t have noticed that or gone on at length about it, were it not for the fact that Appleby is a collaborator of mine.

J. J. Thomson: (points at atom) pudding
