Wednesday, January 17, 2018

What to theorize when your theory's rejected

Sommerfeld and Bohr: ad hoc model builders rejecting Newtonian physics ... for action p dx ~ h (ca. 1919)
I was part of an epic Twitter thread yesterday, initially drawn in to a conversation about whether the word "mainstream" (vs "heterodox") was used in natural sciences (to which I said: not really, but the concept exists). There was one sub-thread that asked a question that is really more a history of science question (I am not a historian of science, so this is my own distillation of others' work as well as a couple of my undergrad research papers). It began with Robert Waldmann tweeting to Simon Wren-Lewis:
... In natural sciences hypotheses don't survive statistically significant rejection as they do in economics.
Simon's response was:
They do if there is no alternative theory to explain them. The relevant question is what is an admissible theory.
To which both Robert and I said we couldn't think of any examples where this was the case. Simon Wren-Lewis then asked an interesting question about what happens when your theory starts meeting the headwind of empirical rejection:
How can that logically work[?] Do all empirical deviations from the (at the time) believed theory always come along at the same time as the theory that can explain those observations? Or in between do people stop doing anything that depends on the old theory?
The answer to the second question is generally "no". Some examples followed, but Twitter can't really do them justice. So I thought I'd write a blog post discussing some case studies in physics of what happens when your theory's rejected.

The Aether

The one case I thought might be an example where natural science didn't reject a theory (therefore making me qualify that there were no examples in post-war science) was the aether: the substance posited to be the medium in which light waves were oscillating. The truth was that this theory wasn't invented to make sense of any particular observations (Newton thought it explained diffraction), but rather to soothe the intuition of physicists (specifically Fresnel's, who invented the wave theory of light in the early 1800s). If light is a wave, it must be a wave in something, right? The aether was terribly stubborn for a physical theory in the Newtonian era. Some of the earliest issues arose with Fizeau's experiments in the 1850s. The "final straw" in the traditional story was the Michelson and Morley experiment, but experiments continued to test for the existence of "aether wind" for some years afterward (you could even call this 2009 precision test of Lorentz invariance a test of the aether).

So here we have a case where a hypothesis was rejected and it was over 50 years between the first rejection and when the new theory "came along". What happened in the interim? Aether dragging. Actually the various experiments (even including Michelson and Morley's) were considered confirmation of particular answers to questions about how aether interacts with matter.

But Fresnel's wave theory of light didn't really need the aether, and there was nothing that the aether did in Fresnel's theory besides exist as a medium for transverse waves. Funnily enough, this is actually a problem because apparently aether didn't support longitudinal waves, which makes it very different from any typical elastic medium. Looking back on it, it really doesn't make much sense to posit the aether. To me, that implies its role was solely to soothe the intuition; since we as physicists have long given up that intuition, we can't really reconstruct how we would have thought about it at the time, in much the same way we can't really imagine what writing looked like to us before we learned how to read.

So in this case study, we have a theory that was rejected before the "correct" theory came along, and physicists continued to use the "old theory". However, the problem with this as an example of Simon's contention is that the existence of the aether didn't have particular consequences for the descriptions of diffraction and polarization (the "old theory") for which it was invented. It was the connection between aether and matter that had consequences; in a sense, you could say this connection was assumed in order to be able to try and measure it. I can't remember the reference, but someone once wrote that the aether experiments seem to imply that nature was conspiring in such a way as to make the aether undetectable!

The Precession of Mercury

This case study, brought up by Simon Wren-Lewis, better represents what happens in natural sciences when data casts doubt on a theory. Precision analysis of astronomical data in the mid-1800s by Le Verrier led to one of the most high-profile empirical errors of Newton's gravitational theory: it got the precession of Mercury wrong by tens of arc seconds per century. As Simon says: physicists continued to use Newton's "old" theory (and actually do so to this day) for nearly 50 years until the "correct" general theory of relativity came along.

But Newton's old theory was wildly successful (even the observed error was only about 40 arc seconds per century). In one century, Mercury completes about 415 orbits and so travels about 540 million seconds of arc, meaning this error is on the order of one in ten million. No economic theory is that accurate, so we could say that this case study is actually a massive case of false equivalence.
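The size of that ratio can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the orbital period of about 87.97 days is an assumed textbook value that does not appear in the post, and the anomaly is rounded to 40 arc seconds per century as in the text. With these inputs the fractional error comes out well under one part in a million.

```python
# Rough check of Mercury's precession anomaly relative to the total angle
# it sweeps out in a century. Inputs are assumed round textbook values.

ARCSEC_PER_ORBIT = 360 * 3600      # one full revolution, in arc seconds
DAYS_PER_CENTURY = 36525.0         # Julian century
MERCURY_PERIOD_DAYS = 87.97        # assumed sidereal orbital period
ANOMALY_ARCSEC = 40.0              # anomalous precession per century (rounded)

orbits_per_century = DAYS_PER_CENTURY / MERCURY_PERIOD_DAYS
total_arcsec = orbits_per_century * ARCSEC_PER_ORBIT
fractional_error = ANOMALY_ARCSEC / total_arcsec

print(f"orbits per century:        {orbits_per_century:.1f}")
print(f"arc seconds per century:   {total_arcsec:.3e}")
print(f"fractional error:          {fractional_error:.1e}")
```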

However, I think it is still useful to understand what happened in this case study. In our modern language, we would say that physicists set a scope condition (region of validity) based on a relevant scale in the problem: the radius of the sun (R). Basically, when the perihelion distance r of the orbit is not large relative to R, other effects potentially enter. And at R/r ~ 2%, this ratio is much larger for Mercury than for any other planet (Mercury is in a 3:2 spin-orbit resonance with the sun). Several ad hoc models of the sun's mass distribution (as well as other effects) were invented to try and account for the difference from Newton's theory (as mentioned by Robert). Eventually general relativity came along (setting a scale, the Schwarzschild radius 2 G M/c², in terms of the strength of the gravitational field based on the sun's mass M and the speed of light, not its radius). Despite how weird it was to think of the possibility of e.g. black holes or gravitational waves as fluctuations of space-time, the theory was quickly adopted because it fit the data.
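To make the two scales concrete, here is a minimal sketch that computes the solar radius relative to Mercury's perihelion distance, the Schwarzschild radius 2 G M/c², and the perihelion advance general relativity predicts. All constants and orbital elements are assumed textbook values rather than numbers from the post, and the advance formula 6πGM / (a (1 − e²) c²) per orbit is the standard textbook result, quoted here without derivation.

```python
import math

# Assumed textbook values in SI units (not taken from the post)
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m/s
M_SUN = 1.989e30         # solar mass, kg
R_SUN = 6.957e8          # solar radius, m
PERIHELION = 4.600e10    # Mercury's perihelion distance, m
SEMI_MAJOR = 5.791e10    # Mercury's semi-major axis, m
ECC = 0.2056             # Mercury's orbital eccentricity
PERIOD_DAYS = 87.97      # Mercury's orbital period, days

# Newtonian-era scale: solar radius relative to the perihelion distance
radius_ratio = R_SUN / PERIHELION

# General relativity's scale: the Schwarzschild radius 2GM/c^2
r_s = 2 * G * M_SUN / C**2

# Standard GR perihelion advance per orbit, in radians
advance_rad = 6 * math.pi * G * M_SUN / (SEMI_MAJOR * (1 - ECC**2) * C**2)

# Accumulate over a Julian century and convert to arc seconds
orbits_per_century = 36525.0 / PERIOD_DAYS
advance_arcsec = math.degrees(advance_rad * orbits_per_century) * 3600

print(f"R/r at perihelion:      {radius_ratio:.3f}")
print(f"Schwarzschild radius:   {r_s / 1000:.2f} km")
print(f"GR advance per century: {advance_arcsec:.1f} arc seconds")
```

With these inputs the predicted advance lands in the low 40s of arc seconds per century, which is the sense in which the theory "fit the data".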

The scale R set up a firewall preventing Mercury's precession from burning down the whole of Newtonian mechanics (which was otherwise fairly successful), and ad hoc theories were allowed to flourish on the other side of that firewall. This does not appear to happen in economics. As Noah Smith says:
I have not seen economists spend much time thinking about domains of applicability (what physicists usually call "scope conditions"). But it's an important topic to think about.
And as Simon says in his tweet, economists just go on using rejected theory elements and models without limiting their scope or opening the field to ad hoc models. This is also my own experience reading the economics literature.

Old Quantum Theory

Probably my favorite case study is so-called old quantum theory: the collection of ad hoc models that briefly flourished between Planck's quantum of light in 1900 and Heisenberg's quantum mechanics in 1925. Before then, lots of problems had started to arise with Newtonian physics (though with the caveat that it was mostly wildly successful, as mentioned above). There was the ultraviolet catastrophe (a singularity as wavelength goes to zero), which was related to blackbody radiation. Something was happening when the wavelength of light started to get close to the atomic scale. Until Planck posited the quantum of light, several ad hoc models including atomic motion were invented to give different functional forms for blackbody radiation, in much the same way different models of the sun allowed for possible explanations of Mercury's precession.
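As an illustration of how the classical description breaks down at short wavelength, the sketch below compares the classical Rayleigh-Jeans form of the blackbody spectrum with Planck's formula. The post does not name a specific classical formula, so using Rayleigh-Jeans here is an assumption, as are the temperature of 5000 K and the sample wavelengths; the constants are rounded textbook values.

```python
import math

# Rounded textbook constants in SI units (assumed, not from the post)
H = 6.626e-34      # Planck constant, J s
C = 2.998e8        # speed of light, m/s
K_B = 1.381e-23    # Boltzmann constant, J/K


def rayleigh_jeans(wavelength_m: float, temp_k: float) -> float:
    """Classical spectral radiance; grows without bound as wavelength -> 0."""
    return 2.0 * C * K_B * temp_k / wavelength_m**4


def planck(wavelength_m: float, temp_k: float) -> float:
    """Planck spectral radiance; the exponential tames the short-wavelength end."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return (2.0 * H * C**2 / wavelength_m**5) / math.expm1(x)


TEMP = 5000.0  # hypothetical blackbody temperature, K

# From far infrared (classical result is fine) down to ultraviolet (it is not)
for wavelength in (1e-4, 1e-5, 1e-6, 1e-7):
    ratio = rayleigh_jeans(wavelength, TEMP) / planck(wavelength, TEMP)
    print(f"wavelength {wavelength:.0e} m: classical / Planck = {ratio:.3g}")
```

The ratio stays close to one at long wavelengths and then grows by many orders of magnitude toward the ultraviolet, so the classical form is accurate on one side of a scale and badly wrong on the other.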

In much the same way the radius of the sun set the scale for the firewall for gravity, Planck set the scale for what would become quantum effects by specifying a fundamental unit of action (energy × time or momentum × distance) now named after him: h. Old quantum theory set this up as a general principle by saying phase space integrals could only result in integer multiples of h (Bohr-Sommerfeld quantization). Now h = 6.626 × 10⁻³⁴ J·s is tiny in terms of our human scale, which is related to Newtonian physics being so accurate (and still used today); again, using this as a case study for economics is another false equivalence, as no economic theory is that accurate. But in this case, Newtonian physics was basically considered rejected within the scope of old quantum theory and stopped being used. That rejection was probably a reason why quantum mechanics was so quickly adopted (notwithstanding its issues with intuition that famously flustered Einstein and continue to this day). Quantum mechanics was invented in 1925, and by the 1940s physicists were working out renormalization of quantum field theories, putting the last touches on a theory that is the most precise ever developed. Again, it didn't really matter how weird the theory seemed (especially at the time) because the only important criterion was fitting the empirical data.
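Written out, the quantization condition described above takes the following standard textbook form (the notation is the usual one, not taken from the post). Applied to circular orbits in hydrogen it reduces to Bohr's angular momentum rule and gives the Bohr energy levels:

```latex
% Bohr-Sommerfeld quantization: the phase space integral over one period
% of the motion is an integer multiple of Planck's constant h
\oint p \, dq = n h, \qquad n = 1, 2, 3, \ldots

% For a circular electron orbit in hydrogen this becomes L = n \hbar,
% with \hbar = h / 2\pi, which yields the Bohr energy levels
E_n = -\frac{m_e e^4}{8 \varepsilon_0^2 h^2} \, \frac{1}{n^2}
    \approx -\frac{13.6~\text{eV}}{n^2}
```

Here m_e is the electron mass, e the elementary charge, and ε₀ the vacuum permittivity.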

There's another way this case study shows a difference between the natural sciences and economics. Old quantum theory was almost immediately dropped when quantum mechanics was developed, and ceased to be of interest except historically. Its one major success lives on in name only as the Bohr energy levels of Hydrogen. However, Paul Romer wrote about economic models using the Bohr model as an analogy for models like the Solow model that I've discussed before. Romer said:
Learning about models in physics–e.g. the Bohr model of the atom–exposes you to time-tested models that found a good balance between simplicity and insight about observables.
Where Romer sees a "balance between simplicity and insight" that might well be used if it were an economic model, this physicist sees a rejected model that's part of the history of thought in physics. Physicists do not learn the Bohr model (you learn of its existence, but not the theory). The Bohr energy level formula turned out to be correct, but today's undergraduate physics students derive it from quantum mechanics, not from "old quantum theory" using Bohr-Sommerfeld quantization.

A Summary

There is a general pattern where some empirical detail is at odds with a theory in physics:

  • A scale is set to firewall the empirically accurate pieces of the theory
  • A variety of ad hoc models are developed at that new scale where the only criterion is fitting the empirical data, no matter how weird they may seem

I submit that this is not how things work in economics, especially macroeconomics. Simon says we should keep using theories without a scope condition firewall, which Noah says doesn't seem to be thought about at all. New theories in macro- or micro-economics, no matter how weird, aren't judged based on their empirical accuracy alone.

But a bigger issue here I think is that there aren't any wildly successful [1] economic models. There really aren't any macroeconomic models accurate enough to warrant building a firewall. This should leave the field open to a great deal of ad hoc theorizing [2]. But in macro, you get DSGE models despite their poor track record. Unless you want to consider DSGE models to be ad hoc models that may go the way of old quantum theory! That's really my view: it's fine if you want to try DSGE model macro and it may well eventually lead to insight. But it really is an ad hoc framework operating in a field that hasn't set any scales because it hasn't had enough empirical success to require them.


Update 19 January 2018

Both Robert Waldmann and Simon Wren-Lewis responded to the tweet about this blog post (thread here) saying that physics is not the optimal natural science for comparison with economics. However, I disagree. Physics (and chemistry) are the only fields with a comparable level of mathematical formalism to economics. Other natural sciences use lots of math, too, but there is no over-arching formal mathematical way to solve a problem in e.g. biology (and some of the ones that do exist are based on either dynamical systems, the same kind of formalism used in economics, or even economic models). There's even less in medicine (Wren-Lewis's example).

Now you may argue that (macro)economics shouldn't have the level of mathematical formalism it does (I would definitely agree that the mathematical macro models used are far too complex to be supported by the limited data and that it's funny to write stuff like this). If you want to argue that macroeconomics shouldn't be using DSGE models, or that social science isn't amenable to math, go ahead [3]. But that wasn't the argument we were having, which was what to do when your mathematical framework (e.g. standard DSGE models with Euler equations and Phillips curves) is rejected. Additionally, these models are rejected because of comparisons of their mathematical formalism with data, not because of their non-mathematical aspects. To that end, physics provides a best practice: set a scale and firewall off the empirically accurate parts of your theory.

Aside from the question of how one "uses" a non-mathematical model, one of the issues with the discussion of rejection of non-mathematical models is that there's no firm metric for rejection. When were Aristotle's crystal spheres rejected? Heliocentric models didn't really require rejection of the principle that planets were fixed to spheres made of aether. Kepler even mentions them in the same breath as the elliptical orbits that would reject the Aristotelian/Ptolemaic model completely, so comets and novae didn't reject the concept in Kepler's mind (you could make the case that the aether survives all the way to special relativity above). The "bad air" theory of disease around malaria (since it was associated with swampy areas, hence the name) was moderately successful up until a new theory came along in the sense that staying away from swamps or closing your windows is a good way to avoid mosquitoes.

Actually, it's possible the mathematical formalism is part of the reason macro doesn't just reject the models because of sunk costs (or 'regulatory capture') involved in learning the formalism. I don't know if non-mathematical models are more easily rejected in this sense (lower sunk costs), but as I mentioned in my tweet as part of the thread linked above, I couldn't even think of any non-mathematical models that were rejected that economics still uses, rendering the entire discussion moot if we're not talking about mathematical models.

PS I also added footnotes [2] and [3].



[1] Noah likes to tell a story about the prediction of the BART ridership using random utility discrete choice models (I mentioned here). One of the authors of that study has said that result was a bit of a fluke ("However, to some extent, we were right for the wrong reasons.").

[2] Added in update. This is part of my answer to Chris House's question (that I also address in my book): Why Are Physicists Drawn to Economics? Because it is a field that uses mathematical models and there are no real scope conditions known, which opens up the possibility of any ad hoc model by physicists' standards.

[3] But you do have to contend with the fact that some of this non-mathematical social science is pretty empirically accurately described by mathematical models.


  1. I don't get why the amount of formal maths used is relevant here?

    Seems a bit scrabbling around for anything that'll make you right...

    1. You would have to go back to the original conversation on Twitter where it was understood that we were talking about rejecting mathematical models (e.g. DSGE, VARs, SEMs).

      If we're not talking about mathematical models being rejected by empirical data, then as I mention at the end of the update above, my whole post is moot (which could hardly be interpreted as "scrabbling around for anything" that would make me right). I haven't the foggiest idea of how you reject a non-mathematical model with numerical economic data like interest rates or GDP.

      You can reject models like "spontaneous generation" in biology, but I can't currently think of any *macro* models that are non-mathematical that have been rejected and are still being used by mainstream macro. I would be genuinely interested in knowing what those are.

  2. Regarding your update of Jan 18, 2018: Good stuff.

    Re: rejecting non-mathematical models: I don't know about economics, but the classic example is finding fossils of rabbits in the Jurassic layer (as being sufficient to reject evolution). Also you may want to check your last sentence. I read it a couple of times and I think it has one too many "I" in it and it might do with some parenthesis as well.

    1. There are even classic non-mathematical experiments that have actually rejected theories (e.g. "spontaneous generation" by Pasteur). I'm not questioning the existence or usefulness of non-mathematical experiments.

      But economics is really a field about human concepts more than observation of physical systems. Ostrom's work on managing commons isn't mathematical, but it's also a policy prescription and not really something you "reject"; it is more like business management practices: some work better in some situations. The fact that "six-sigma" fails to turn around your software company doesn't "reject" "six-sigma". In the same way, the existence of mismanaged commons does not reject Ostrom's theories. And most non-mathematical theories in economics are more like Ostrom than Pasteur.

      And then we have the additional problem of finding one of those non-mathematical theories that has been rejected and continues to be used. If we can't find such a theory, then Simon's and Robert's responses are just vacuous.

