Friday, November 17, 2017

The "bottom up" inflation fallacy

Tony Yates has a nice succinct post from a couple of years ago about the "bottom up inflation fallacy" (brought up in my Twitter feed by Nick Rowe):
This ‘inflation is caused by the sum of its parts’ problem rears its head every time new inflation data gets released. Where we can read that inflation was ’caused’ by the prices that went up, and inhibited by the prices that went down.
I wouldn't necessarily attribute the forces that make this fallacy a fallacy to the central bank as Tony does — at the very least, if central banks can control inflation, why are many countries (US, Japan, Canada) persistently undershooting their stated or implicit targets? But you don't really need a mechanism to understand this fallacy, because it's actually a fallacy of general reasoning. If we look at the components of inflation for the US (data from here), we can see various components rising and falling:


While the individual components move around a lot, the distribution remains roughly stable — except for the case of the 2008-9 recession (see more here). It's a bit easier to see the stability using some data from MIT's billion price project. We can think of the "stable" distribution as representing a macroeconomic equilibrium (and the recession being a non-equilibrium process). But even without that interpretation, the fact that an individual price moves still tells us almost nothing about the other prices in the distribution if that distribution is constant. And it's definitely not a causal explanation.

It does seem to us as humans that if there is something maintaining that distribution (central banks per Tony), then an excursion by one price (oil) is being offset by another (clothing) in order to maintain that distribution. However, there does not have to be any force acting to do so.

For example, if the distribution is a maximum entropy distribution then the distribution is maintained simply by the fact that it is the most likely distribution (consistent with constraints). In the same way it is unlikely that all the air molecules in your room will move to one side of it, it is just unlikely that all the prices will move in one direction — but they easily could. For molecules, that probability is tiny because there are huge numbers of them. For prices, that probability is not as negligible. In physics, the pseudo-force "causing" the molecules to maintain their distribution is called an entropic force. Molecules that make up a smell of cooking bacon will spread around a room in a way that looks like they're being pushed away from their source, but there is no force on the individual molecules making that happen. There is a macro pseudo-force (diffusion), but there is no micro force corresponding to it.
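Here's a minimal numerical sketch of that point (everything below is made up for illustration: the number of components, the 2% mean and 5% spread of price changes, and the expenditure weights). Individual components swing around by several percent each period with nothing offsetting them, yet the aggregate stays near 2% simply because the distribution they're drawn from doesn't change:

import numpy as np

rng = np.random.default_rng(42)
n_components, n_periods = 200, 40

# A fixed ("equilibrium") distribution of price changes: mean 2%, spread 5%.
# Components are drawn independently -- nothing forces them to offset each other.
changes = rng.normal(loc=0.02, scale=0.05, size=(n_periods, n_components))
weights = rng.dirichlet(np.ones(n_components))   # made-up expenditure weights

headline = changes @ weights   # aggregate "inflation" each period

print("spread of individual components:", changes.std())    # about 0.05
print("spread of aggregate inflation:  ", headline.std())   # an order of magnitude smaller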

I've speculated that this general idea is involved in so-called sticky prices in macroeconomics. Macro mechanisms like Calvo pricing are in fact just effective descriptions at the macro scale, and therefore studies that look at individual prices (e.g. Eichenbaum et al 2008) will not see sticky prices.

In a sense, yes, macro inflation is due to the movements of thousands of individual prices. And it is entirely possible that you could build a model where specific prices offset each other via causal forces. But you don't have to, and there are ways of constructing a model in which there isn't necessarily any way to match up macro inflation with specific individual changes, because macro inflation is about the distribution of all price changes. That's why I say the "bottom up" fallacy is a fallacy of general reasoning, not just a fallacy according to the way economists understand inflation today: it assumes a peculiar model. And as Tony tells us, that's not a standard macroeconomic model (which is based on central banks setting e.g. inflation targets).

You can even take this a bit further and argue against the position that microfoundations are necessary for a macroeconomic model. It is entirely possible for macroeconomic forces to exist for which there are no microeconomic analogs. Sticky prices are a possibility; Phillips curves are another. In fact, even rational representative agents might not exist at the scale of human beings, but could be perfectly plausible effective degrees of freedom at the macro scale (per Becker 1962 "Irrational Behavior and Economic Theory", which I use as the central theme in my book).

Thursday, November 16, 2017

Unemployment rate step response over time

One of the interesting effects I noticed in looking at the unemployment rate in early recessions with the dynamic equilibrium model was what looked like "overshooting" (step response "ringing" transients). For fun, I thought I'd try to model the recession responses using a simple "two pole" model (second order low pass system).
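For reference, the underdamped (ζ < 1) step response I'm fitting here is the textbook second-order form with natural frequency ω and damping ratio ζ (the actual fits also need an overall amplitude, offset, and start time, which I omit here):

$$
y(t) = 1 - e^{-\zeta \omega t} \left( \cos \omega_{d} t + \frac{\zeta}{\sqrt{1 - \zeta^{2}}} \sin \omega_{d} t \right) \;\;\;\; \omega_{d} \equiv \omega \sqrt{1 - \zeta^{2}}
$$

The oscillatory factor is what produces the "ringing" overshoot; as ζ approaches 1 the oscillation is damped away.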

For example, here is the log-linear transformation of the unemployment rate that minimizes entropy:


If we zoom in on one of the recessions in the 1950s, we can fit it to the step response:


I then fit several more recessions, transformed back to the original data representation (unemployment rate in percent), and compiled the results:


Overall, this was just a curve fitting exercise. However, what was interesting were the parameters over time. These graphs show the frequency parameter ω and the damping parameter ζ:


Over time, the frequency falls and the damping increases. We can also show the damped frequency ω_d = ω√(1 − ζ²), a particular combination of the two (this is the frequency that we'd actually estimate from looking directly at the oscillations in the plot):


With the exception of the 1970 recession, this shows a roughly constant, fairly high frequency that falls after the 1980s to a lower, roughly constant frequency.

At this point, this is just a series of observations. This model adds far too many parameters to really be informative (for e.g. forecasting). What is interesting is that the step response in physics results from a sharp shock hitting a system with a band-limited response (i.e. the system cannot support all the high frequencies present in the sharp shock). This would make sense — in order to support higher frequencies, you'd probably have to have people entering and leaving jobs at rates close to monthly or even weekly. While some people might take a job for a month and quit, they likely don't make up the bulk of the labor force. This doesn't really reveal any deep properties of the system, but it does show how unemployment might well behave like a natural process (contra many suggestions e.g. that it is definitively a social process that cannot be understood in terms of mindless atoms or mathematics).

Wednesday, November 15, 2017

New CPI data and forecast horizons

New CPI data is out, and here is the "headline" CPI model last updated a couple months ago:


Compared to the last update, I changed the error bar on the derivative data to show the 1-sigma errors instead of the median error. The level forecast still shows the 90% confidence interval for the parameter estimates.

Now why wasn't I invited to this? One of the talks was on forecasting horizons:
How far can we forecast? Statistical tests of the predictive content
Presenter: Malte Knueppel (Bundesbank)
Coauthor: Jörg Breitung
A version of the talk appears here [pdf]. One of the measures they look at is year-over-year CPI, which according to their research seems to have a forecast horizon of 3 quarters — relative to a stationary ergodic process. The dynamic equilibrium model is approaching 4 quarters:


The thing is, however, that the way the authors define whether the data is uninformative is relative to a "naïve forecast" that's constant. The dynamic equilibrium forecast does have a few shocks — one centered at 1977.7 associated with the demographic transition of women entering the workforce, and one centered at 2015.1 that I've tentatively associated with baby boomers leaving the workforce [0] after the Great Recession (the one visible above) [1]. But the forecast for the period from the mid-90s (after the 70s shock ends) until the start of the Great Recession would in fact be this "naïve forecast":


The post-recession period does involve a non-trivial (i.e. not constant) forecast, so it could be "informative" in the sense of the authors above. We will see if it continues to be accurate beyond their forecast horizon. 

...

Footnotes

[0] Part of the reason for positing this shock is its existence in other time series.

[1] In the model, there is a third significant negative shock centered at 1960.8 associated with a general slowdown in the prime age civilian labor force participation rate. I have no firm evidence of what caused this, but I'd speculate it could be about women leaving the workforce in the immediate post-war period (the 1950s-60s "nuclear family" presented in propaganda advertising) and/or the big increase in graduate school attendance.

Friday, November 10, 2017

Why k = 2?

I put up my macro and ensembles slides as a "Twitter talk" (Twalk™?) yesterday and it reminded me of something that has always bothered me since the early days of this blog: Why does the "quantity theory of money" follow from the information equilibrium relationship N ⇄ M for information transfer index k = 2?

From the information equilibrium relationship, we can show log N ~ k log M and therefore log P ~ (k − 1) log M. This means that for k = 2 

log P ~ log M

That is to say the rate of inflation is equal to the rate of money growth for k = 2. Of course, this is only empirically true for high rates of inflation:


But why k = 2? It seems completely arbitrary. In fact, it is so arbitrary that we shouldn't really expect the high inflation limit to obey it. The information equilibrium model allows all positive values of k. Why does it choose k = 2? What is making it happen?

I do not have a really good reason. However, I do have some intuition.

One of the concepts in physics that the information equilibrium approach is related to is diffusion. In that case, most values of k represent "anomalous diffusion". But ordinary diffusion with a Wiener process (a random walk based on a normal distribution) results in diffusion where the distance traveled goes as the square root of the time step σ ~ √t. That square root arises from the normal distribution, which is in fact a universal distribution (there's a central limit theorem for distributions that converge to it). Another way: 

2 log σ ~ log t

is an information equilibrium relationship t ⇄ σ with k = 2.

If we think of output as a diffusion process (distance is money, time is output), we can say that in the limit of a large number of steps, we obtain

2 log M ~ log N

as a diffusion process, which implies log P ~ log M.
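A quick numerical check of that claim (just a sketch, with nothing economic about it: ordinary unit-normal random walks and a log-log fit):

import numpy as np

rng = np.random.default_rng(0)
n_walkers, n_steps = 5000, 1000

# Ordinary Wiener-like diffusion: cumulative sums of unit-normal steps
paths = rng.normal(size=(n_walkers, n_steps)).cumsum(axis=1)

t = np.arange(1, n_steps + 1)
sigma = paths.std(axis=0)          # ensemble spread at each time step

# Fit log t ~ k log sigma; ordinary diffusion gives k close to 2
k, _ = np.polyfit(np.log(sigma), np.log(t), 1)
print("fitted exponent k =", round(k, 2))   # approximately 2

Anomalous diffusion would show up here as a fitted exponent different from 2.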

Of course, there are some issues with this besides it being hand-waving. For one, output is the independent variable corresponding to time. This does not reproduce the usual intuition that money should be causing the inflation, but rather the reverse (the spread of molecules in diffusion is not causing time to go forward [1]). But then applying the intuition from a physical process to an economic one via an analogy is not always useful.

I tried to see if it came out of some assumptions about money M mediating between nominal output N and aggregate supply S, i.e. the relationship

N ⇄ M ⇄ S

But that exercise didn't get much further than figuring out that if the IT index k in the first half is k = 2 (per above), then the IT index k' for M ⇄ S would have to be 1 + φ or 2 − φ, where φ is the golden ratio, in order for the equations to be consistent. The latter value k' = 2 − φ ≈ 0.38 implies that the IT index for N ⇄ S is k k' ≈ 0.76, while the former implies k k' ≈ 5.24. But that's not important right now. Either way, it doesn't tell us why k = 2.
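(For reference, the k k′ values above just come from the fact that chained information equilibrium relationships multiply their indices:

$$
N \sim M^{k} \;\; \text{and} \;\; M \sim S^{k'} \;\; \Rightarrow \;\; N \sim S^{k k'}
$$

which is the only property of the chain I'm using here.)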

Another place to look would be the symmetry properties of the information equilibrium relationship, but k = 2 doesn't seem to be anything special there.

I thought I'd blog about this because it gives you a bit of insight as to how physicists (or at least this particular physicist) tend to think about problems — as well as point out flaws (i.e. ad hoc nature) in the information equilibrium approach to the quantity theory of money/AD-AS model in the aforementioned slides. I'd also welcome any ideas in comments.

...

Footnotes:

[1] Added in update. You could make a case for the "thermodynamic arrow of time", in which case the increase in entropy is actually equivalent to "time going forward".

Interest rates and dynamic equilibrium

What if we combine an information equilibrium relationship A ⇄ B with a dynamic information equilibrium description of the inputs A and B? Say, the interest rate model (described here) with dynamic equilibrium for investment and the monetary base? Turns out that it's interesting:



The first graph is the long term (10-year) rate and the second is the short term (3 month secondary market) rate. Green is the information equilibrium model alone (i.e. the data as input), while the gray curves show the result if we use the dynamic equilibria for GPDI and AMBSL (or CURRSL) as input.
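To make "using the dynamic equilibria as input" concrete, here is a schematic sketch (the growth rates, shock parameters, and the coefficients in the rate relationship below are all made up, and I'm writing the interest rate model generically as log r = c log(A/B) + b rather than quoting the actual fit):

import numpy as np

def dynamic_equilibrium(t, growth, shocks):
    """log X(t) = growth * t + a sum of logistic shocks (amplitude a, center t0, width w)."""
    log_x = growth * t
    for a, t0, w in shocks:
        log_x = log_x + a / (1.0 + np.exp(-(t - t0) / w))
    return np.exp(log_x)

# Years since 1960; every parameter value below is made up for illustration
t = np.linspace(0.0, 60.0, 601)
investment = dynamic_equilibrium(t, 0.05, [(0.5, 18.0, 4.0), (-0.3, 49.0, 2.0)])
base       = dynamic_equilibrium(t, 0.06, [(0.8, 49.0, 1.0)])

# Schematic information-equilibrium interest rate relationship: log r = c log(A/B) + b
c, b = 2.8, 0.5   # placeholder parameters, not the fitted values
rate = np.exp(c * np.log(investment / base) + b)

The point is just that the gray curves come from feeding the smooth dynamic-equilibrium paths through the same relationship that the green curves get from the data.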

Here is the GPDI dynamic equilibrium description for completeness (the link above uses fixed private investment instead of gross private domestic investment which made for a better interest rate model):


Wednesday, November 8, 2017

A new Beveridge curve or, Science is Awesome

What follows is speculative, but it is also really cool. A tweet about how the unemployment rate would be higher if labor force participation was at its previous higher level intrigued me. Both the unemployment rate and labor force participation were pretty well described by the dynamic information equilibrium model. Additionally, if you have two variables obeying dynamic equilibrium models, you end up with a Beveridge curve as the long run behavior if you plot them parametrically.
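The reason for that last statement is straightforward (a sketch in my notation, with α and β the two dynamic equilibrium rates): away from shocks each variable has a roughly constant logarithmic slope, so

$$
\frac{d}{dt} \log u \approx \alpha \;\; \text{and} \;\; \frac{d}{dt} \log v \approx \beta \;\; \Rightarrow \;\; \log v \approx \frac{\beta}{\alpha} \log u + \text{const}
$$

i.e. the long run behavior traces out a line in log-log space, and the shocks move the data from one such line to another.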

The first interesting discovery happened when I plotted out the two dynamic equilibrium models side by side:


The first thing to note is that the shocks to CLF [marked with red arrows, down for downward shocks, up for upward] are centered later, but are wider than the unemployment rate shocks [marked with green arrows]. This means that both shocks end up beginning at roughly the same time, but the CLF shock doesn't finish until later. In fact, this particular piece of information led me to notice that there was a small discrepancy in the data from 2015-2016 in the CLF model — there appears to be a small positive shock. A positive shock would be predicted by the positive shock to the unemployment rate in 2014! Sure enough, it turns out that adding a shock improves the agreement with the CLF data. Since the shock roughly coincides with the ending of the Great Recession shock, it would have otherwise been practically invisible.

Second, because the centers don't match up and the CLF shocks are wider, you need a really long period without a shock to observe a Beveridge curve. The shocks to vacancies and the unemployment rate are of comparable size and duration so that the Beveridge curve jumps right out. However the CLF/U Beveridge curve is practically invisible just looking at the data:


And without the dynamic equilibrium model, it would never be noticed because of a) the short periods between recessions, and b) the fact that most of the data before the 1990s contains a large demographic shock of women entering the workforce. This means that assuming there isn't another major demographic shock, a Beveridge curve-like relationship will appear in future data. You could count this as a prediction of the dynamic equilibrium model. As you can see, the curve is not terribly apparent in the post-1990s data (the dots represent the arrows in the earlier graph above):


[The gray lines indicate the "long run" relationship between the dynamic equilibria. The dotted lines indicate the behavior of data in the absence of shocks. As you can see, only small segments are unaffected by shocks (the 90s data at the beginning, and the 2017 data at the end).]

I thought the illumination of the small positive shock to CLF in 2015-2016, as well as the prediction of a future Beveridge curve-like relationship between CLF and U, was fascinating. Of course, they're both speculative conclusions. But if this is correct, then the tweet that set this all off is talking about a counterfactual world that couldn't exist: if CLF were higher, then either we would have had a different series of recessions or the unemployment rate would be lower. That is to say we can't move straight up and down (choosing a CLF) in the graph above without moving side to side (changing U).

[Added some description of the graphs in edit 9 Nov 2017.]

...

Update 9 November 2017

Here are the differences between the original prime age CLF participation forecast and the new "2016-shock" version:



Tuesday, November 7, 2017

Presentation: forecasting with information equilibrium

I've put together a draft presentation on information equilibrium and forecasting after presenting it earlier today as a "twitter talk". A pdf is available for download from my Google Drive as well. Below the fold are the slide images.



JOLTS data out today

Nothing definitive with the latest data — just a continuation of a correlated negative deviation from the model trend. The last update was here.


I also tried a "leading edge" counterfactual (replacing the logistic function with an exponential approximation for times t << y₀, where y₀ is the transition year; this approximation is somewhat agnostic about the amplitude of the shock, as spelled out in the note after the animation) and made an animation adding the post-forecast data one point at a time:


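To spell out the "leading edge" approximation mentioned above (in my notation, with a the shock amplitude, y₀ the transition year, and w the shock width): well before the transition the logistic function is approximately exponential,

$$
\frac{a}{1 + e^{-(t - y_{0})/w}} \approx a \, e^{(t - y_{0})/w} \;\; \text{for} \;\; t \ll y_{0}
$$

so the amplitude a only enters through the prefactor a·e^(−y₀/w), tangled up with the transition year, which is why the leading edge alone can't pin down how big the shock will turn out to be.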
Essentially we're in the same place we were with the last update. I also updated the Beveridge curve with the latest data points:


Friday, November 3, 2017

Checking my forecast performance: unemployment rate

Because more young adults are becoming unemployed on account of they can't find work. Basically, the problem is this: if you haven't got a job, then you’re outta work! And that means only one thing — unemployment!
The Young Ones (1982) “Demolition”
Actually, the latest unemployment rate data tells us it continues to fall as predicted by the dynamic information equilibrium model (conditional on the absence of shocks/recessions):




The first is the original prediction, the second is a comparison with various forecasts of the FRBSF, and the third is a comparison with two different Fed forecasts.

In trying to be fair to the FRBSF model, I didn't show the data from before I made the graph as new post-forecast data (in black). However, in these versions of the graph I take all of the data from after the original forecast (in January) as new:



There also don't appear to be any signs of an oncoming shock yet; however the JOLTS data (in particular, hires) appears to be an earlier indicator than the unemployment rate — by about 7 months. That is to say, we should see little in the unemployment rate until the recession is practically upon us (although the algorithm can still see it before it is declared or even widely believed to be happening).

Update + 2.5 hours

Also, here is the prime age civilian labor force participation rate:


Thursday, November 2, 2017

Chaos!

Like the weather, the economy is complicated.

Like the weather, the economy obeys the laws of physics.

Like the weather, the economy is aggregated from the motion of atoms.

Doyne Farmer only said the first one, but inasmuch as this is some kind of argument in favor of any particular model of the economy so are the other two. Sure, it's complicated. But that doesn't mean we can assume it is a complex system like weather without some sort of evidence. Farmer's post is mostly just a hand-waving argument that the economy might be a chaotic system. It's the kind of thing you write before starting down a particular research program path — the kind of thing you write for the suits when asking for funding.

But it doesn't really constitute evidence that the economy is a chaotic system. So when Farmer says:
So it is not surprising that simple chaos was not found in the data.  That does not mean that the economy is not chaotic.  It is very likely that it is and that chaos can explain the patterns we see.
The phrase "very likely" just represents a matter of opinion here. I say it's "very likely" that chaos is not going to be a useful way to understand macroeconomics. I have a PhD in physics and have studied economics for some time now, with several empirically successful models. So there.

To his credit, Farmer does note that the initial attempts to bring chaos to economics didn't pan out:
But economists looked for chaos in the data, didn’t find it, and the subject was dropped.  For a good review of what happened see Roger Farmer’s blog.
I went over Roger Farmer's excellent blog post, and added to the argument in my post here.

Anyway, I have several issues with Doyne Farmer's blog post besides the usual "don't tell us chaos is important, show us" via some empirical results. In the following, I'll excerpt a few quotes and discuss them. First, Farmer takes on a classic econ critic target — the four-letter word DSGE:
Most of the  Dynamic Stochastic General Equilibrium (DSGE) models that are standard in macroeconomics rule out chaos from the outset.
To be fair, it is the log-linearization of DSGE models that "rules out" chaos, but then it only "rules out" chaos in regions of state space that are out of scope for the log-linearized versions of DSGE models. So when Farmer says:
Linear models cannot display chaos – their only possible attractor is a fixed point.
it comes really close to a bait and switch. An attractor is a property of the entire state space (phase space) of the model; the log-linearization of DSGE models is a description valid (in scope) for a small region of phase space. In a sense, Farmer is extending the log-linearization of a DSGE model to the entire state space. 

However, Eggertsson and Singh show that the log-linearization doesn't actually change the results very much — even up to extreme events like the Great Depression. This is because in general most of the relevant economic phenomena we observe appear to be perturbations: recessions impact GDP by ~ 10%, high unemployment is ~ 10%. In a sense, observed economic reality tells us that we don't really stray far enough away from a local log-linearization to tell the difference between a linear model and a non-linear one capable of exhibiting chaos. This is basically the phase space version of the argument Roger Farmer makes in his blog post that we just don't have enough data (i.e. we haven't explored enough of the phase space).

The thing is that a typical nonlinear model that can exhibit chaos (say, a multi-species predator-prey model built from the Lotka–Volterra equations) has massive fluctuations. The chaos is not a perturbation to some underlying bulk, but rather visits the entire phase space. You could almost take that as a definition of chaos: a system that visits a large fraction of the potential phase space. This can be seen as a consequence of the "butterfly effect": two initial conditions in phase space become separated by exponentially larger distances over time. Two copies of the US economy that were "pretty close" to start would evolve to be wildly different from each other — e.g. their GDPs would become exponentially different. Now this is entirely possible, but the difference in GDP growth rates would probably be only a percentage point or two at best, which would take a generation to become exponentially separated. Again, this is just another version of Roger Farmer's argument that we don't have long enough data series.
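To make the exponential separation concrete, here's a toy illustration (using the logistic map, not any economic model): two trajectories that start 10⁻⁹ apart separate at a rate set by the Lyapunov exponent until they are as different as two randomly chosen states.

import math

r = 4.0                      # logistic map parameter in its chaotic regime
x, y = 0.2, 0.2 + 1e-9       # two nearly identical initial conditions

separation = []
for _ in range(60):
    x, y = r * x * (1 - x), r * y * (1 - y)
    separation.append(abs(x - y))

# log10 of the separation every 10 steps: roughly linear growth (i.e. exponential
# divergence) until it saturates at order one
print([round(math.log10(s), 1) for s in separation[::10]])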

Another way to think of this is that the non-trivial attractors of a chaotic system visit some extended region of state space, so you'd imagine that a general chaotic model would produce large fluctuations in its outputs representative of the attractor's extent in phase space. For example, Steve Keen's dynamical systems exhibit massive fluctuations compared to those observed.

Now this in no way rules out the possibility that macroeconomic observables can be described by a chaotic model. It is just an argument that a chaotic model that produces the ~ 10% fluctuations actually observed would have to result from either some fine tuning or a bulk underlying equilibrium [1].

In a sense, Farmer seems to cede all of these points at the end of his blog post:
In a future blog post I will argue that an important part of the problem is the assumption of equilibrium itself.  While it is possible for an economic equilibrium to be chaotic, I conjecture that the conditions that define economic equilibrium – that outcomes match expectations – tend to suppress chaos.
It is a bit funny to begin a post talking up chaos only to downplay it at the end.  I will await this future blog post, but this seems to be saying that we don't see obvious chaos (with its typical large fluctuations) because chaos is suppressed via some bulk underlying equilibrium (outcomes match expectations) — so that we essentially need longer data series to extract the chaotic signal.

But then after building us up with a metaphor using weather which is notoriously unpredictable, Farmer says:
Ironically, if business cycles are chaotic, we have a chance to predict them.
Like the weather, the economy is predictable.

??!

Now don't take this all as a reason not to study chaotic dynamical systems as possible models of the economy. At best, it represents a reason I chose not to study chaotic dynamical systems as possible models of the economy. I think it's going to be a fruitless research program. But then again, I originally wanted to work in fusion and plasma physics research.

Which is to say arguing in favor of one research program or another based on theoretical considerations tends to be more philosophy than science. Farmer can argue in favor of studying chaotic dynamics as a model of the economy. David Sloan Wilson can argue in favor of biological evolution. It's a remarkable coincidence that both of these scientists see the macroeconomy not as economics, but rather as a system best described by the field of study they've each worked in for years [2].

What would be useful is if Farmer or Wilson just showed how their approaches lead to models that better describe the empirical data. That's the approach I take on this blog. One plot or table describing empirical data is worth a thousand posts about how one intellectual thinks the economy should be described. In fact, how this scientist or that economist thinks the economy should be properly modeled is no better than how a random person on the internet thinks the economy should be properly modeled without some sort of empirical evidence backing it up. Without empirical evidence, science is just philosophy.

...

PS

I found this line out of place:
Remarkably a standard family of models is called “Real business cycle models”, a clear example of Orwellian newspeak.
Does Farmer not know that "real" here means "not nominal"? I imagine this is just a political jab as a chaotic model could easily be locally approximated by an RBC model.

...

Footnotes

[1] For example NGDP ~ exp(n t) (1 + d(t)) where the leading order growth "equilibrium" is given by exp(n t) while the chaotic component is some kind of business cycle function |d(t)| << 1.

[2] Isn't that what I'm doing? Not really. My thesis was about quarks.

Tuesday, October 31, 2017

Can a macro model be good for policy, but not for forecasting?

During the course of an ongoing debate with Britonomist sparked by my earlier post, he argued that a model that isn't designed for forecasting shouldn't be used to forecast. My counterargument is that if it's not good for forecasting, it can't be good for e.g. policy changes either. In fact, without some really robust reasoning or empirical validation, a model that can't be used to forecast can't be used for anything.

Now Britonomist is in good company, as Olivier Blanchard said almost exactly the same thing in his blog post on five classes of macro models, where he separated policy models from forecasting models. But I think that's evidence of some big fundamental issues in economic methodology. As I'm going to direct this post at a general audience, please don't take me being pedantic as talking down to Econ — on my own, I'd probably write this post entirely in terms of differential geometry on high dimensional manifolds (which is what the following argument really is).

The key assumption underlying the ability to forecast or determine the effects of policy changes is the ceteris paribus assumption. The basic idea is that the stuff you don't include is either negligible or cancels out on average. It's basically a more specific form of Hume's uniformity of nature. Let's think of the ceteris paribus condition as a constraint on the ocean underneath a boat. The boat is our macro model. You can think of the un-modeled variables as the curvature of the Earth (the variables that vary slowly) and the waves on the ocean (the variables that average out). [After comment from Britonomist on Twitter, I'd like to add that these are just examples. There can also be "currents" — i.e. things that aren't negligible — in the ceteris paribus condition. In physics, we'd consider these variables as ones that vary slowly (i.e. the current is roughly the same a little bit to the west or south), but people may take "vary slowly" to mean they aren't very strong, which is not what I mean here.]

Here's a picture:


The ceteris paribus condition lets us move the boat to the west and say that the sea is similar there so our boat still floats (our model still works). We only really know that our boat floats in an epsilon (ε) sized disk near where we started, so checking that it floats when we move west (i.e. comparison to data) is still important.

Now let's say east-west is time. Moving west with the boat is forecasting with the model. Moving north and south is changing policy. If we move south, we change some collection of policy parameters (in, say, the Taylor rule, to give a concrete example). The key takeaway here is that the ceteris paribus condition under the boat is the same for both directions we take the model: policy or forecasting.


It's true that the seas could get rougher to the south or a storm might be off in the distance to the northwest which may limit the distance we can travel in a particular direction under a given ceteris paribus scenario. But if we can't take our model to the west even a short distance, then we can't take our model to the south even a short distance. A model that can't forecast can't be used for policy [1]. This is because our ceteris paribus condition must coincide for both directions at the model origin. If it's good enough for policy variations, then it's good enough for temporal variations. Saying it's not good enough means knowing a lot about the omitted variables and their behavior [2].

And the truth is that forecasts and policy changes are usually not orthogonal: you look at the effect of policy changes over a period of time. You're usually heading at least a little northwest or southwest instead of due south or due north.

But additionally there is the contrapositive: if your model can't forecast, then it's probably useless for policy as well, unless that manifold has the weird properties I describe in footnote [1]. Another way to put this is that saying a model can be used for policy changes but not forecasting implies an unnaturally large (or small) scale defined by the ratio of policy parameter changes to temporal changes [3]. Movement in time is somehow a much bigger step than movement in parameter space.

Now it is entirely possible this is actually the way things are! But there had better be really good reasons (such as really good agreement with the empirical data). Nice examples where this is true in physics are phase transitions. Sometimes a small change in parameters (or temperature) leads to a large qualitative change in model output (water freezes instead of getting colder). Effectively saying a macroeconomic model that can be used for policy but not forecasting is saying there's something like a phase transition for small perturbations of temperature.

This all falls under the heading of scope conditions. Until we collect empirical data from different parts of the ocean and see if our boat floats or sinks, we only really know about an "epsilon-sized ball" near the origin (per Noah Smith at the link). Empirical success gives us information about how big epsilon is. Or — if our theory is derived from an empirically successful framework — we can explicitly derive scope conditions (e.g. we can show using relativity that Newtonian physics is in scope for v << c). However, the claim that a macro model is good for policy but not forecasting is essentially a nontrivial claim about model scope that needs to be much more rigorous than "it's POSSIBLE" (in reference to my earlier post on John Cochrane being unscientific), "it's not actually falsified", "it's just a toy model", or "it makes sense to this one economist".

And this is where both Blanchard and Britonomist are being unscientific. You can't really have a model that's good for policy but not forecasting without a lot of empirical validation. And since empirical validation is hard to come by in macro, there's no robust reason to say a model is good for one thing and not another. As Britonomist says, sometimes some logical argument is better than nothing in the face of uncertainty. People frequently find as much comfort in the pretense of knowledge as in actual knowledge. But while grasping at theories without empirical validation is sometimes helpful (lots of Star Trek episodes require Captain Picard to make decisions based on unconfirmed theories, for example), it is just an example of being decisive, not being scientific [4].

...

Footnotes

[1] This is where the differential geometry comes in handy. Saying a model can be used for policy changes (dp where p is the vector of parameters) but not for forecasting (dt where t is time) implies some pretty strange properties for the manifold the model defines (with its different parameters and at different times). In particular, it's "smooth" in one direction and not another.

Another way to think of this is that time is just another parameter as far as a mathematical model is concerned and we're really looking at variations dp' with p' = (p, t).

[2] Which can happen when you're working in a well-defined and empirically validated theoretical framework (and your model is some kind of expansion where you take only leading order terms in time changes but, say, up to second order terms in parameter changes). This implies you know the scale relating temporal and parameter changes I mention later in the post.

[3] |dp| = k dt with k >> 1. The scale is k and 1/k is some unnatural time scale that is extremely short for some reason. In this "unnatural" model, I can apparently e.g. double the marginal propensity to consume but not take a time step a quarter ahead.

[4] As a side note, the political pressure to be decisive runs counter to being scientific. Science deals with uncertainties by creating and testing multiple hypotheses (or simply accepting the uncertainty). Politics deals with uncertainty by choosing a particular action. That is a source of bad scientific methodology in economics where models are used to draw conclusions where the scientific response would be to claim "we don't know".

Monday, October 30, 2017

Another forecast compared to data

New NGDP data is out, so I've added the latest points to see how this forecast of NGDP per employed person (FRED series GDP over PAYEMS) is doing:



Forecast head-to-head performance (and looking back)


Here's the latest data on core PCE inflation compared to my information equilibrium model (single factor production monetary model) along with the FRB NY DSGE model forecast (as well as the FOMC forecast) of the same vintage:



The IE model is biased a bit low by almost as much as the DSGE model is biased high (but the former is a far simpler model than the latter). I've been tracking this forecast since 2014 (over three years now). There's now only one more quarter of data left. What's probably most interesting is how I've changed in looking at this model. If I were writing it down today, I'd define the model as I do in the parenthetical: a single factor production model with "money" as the factor of production. In particular, I'd take the ensemble approach, considering a set of markets that turn "money" $M$ into various outputs ($N_{i}$):

$$
N_{i} \rightleftarrows M
$$

(notation definition here) such that (assuming $\langle k \rangle$ is slowly varying)

$$
\langle N \rangle \approx N_{0} \left( \frac{M}{M_{0}} \right)^{\langle k \rangle}
$$

with the ansatz (consistent with slowly varying $\langle k \rangle$)

$$
\langle k \rangle \equiv \frac{\log \langle N \rangle/C_{0}}{\log M/C_{0}}
$$

and therefore the price level is given by

$$
\langle P \rangle \approx \frac{N_{0}}{M_{0}} \langle k \rangle \left( \frac{M}{M_{0}} \right)^{\langle k \rangle - 1}
$$

The parameters here are given by

$$
\begin{align}
\frac{N_{0}}{M_{0}} = & 0.661\\
C_{0} = & 0.172  \;\text{G\$}\\
M_{0} = & 595.1 \;\text{G\$}
\end{align}
$$

for the case of the core PCE price level and using the monetary base minus reserves as "money".
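For concreteness, here's a minimal numerical sketch of how those formulas turn a money series into a price level (the parameters are the quoted values; the input N and M series below are fabricated purely to have something to evaluate, and no fitting is shown):

import numpy as np

# Parameters quoted above (core PCE price level, monetary base minus reserves), in G$
N0_over_M0 = 0.661
C0 = 0.172
M0 = 595.1

def it_index(N, M):
    """<k> ansatz: log(<N>/C0) / log(M/C0)."""
    return np.log(N / C0) / np.log(M / C0)

def price_level(N, M):
    """<P> ~ (N0/M0) <k> (M/M0)^(<k> - 1)."""
    k = it_index(N, M)
    return N0_over_M0 * k * (M / M0) ** (k - 1.0)

# Made-up nominal output and "money" series (G$), just for illustration
M = np.linspace(600.0, 1500.0, 5)
N = 4000.0 * (M / 600.0) ** 1.6
print(price_level(N, M))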

Saturday, October 28, 2017

Corporate taxes and unscientific economists

I've been watching this ongoing "debate" among Brad DeLong, John Cochrane, and Greg Mankiw (and others, but to get started see here, here, here, and here). It started out with Mankiw putting up a "simple model" of how corporate tax cuts raise wages that he first left as an exercise to the reader, and then updated his post with a solution. The solution Mankiw finds is remarkably simple. In fact, it's too remarkably simple. And Mankiw shows some of the inklings of being an actual scientist when he says:
I must confess that I am amazed at how simply this turns out. In particular, I do not have much intuition for why, for example, the answer does not depend on the production function.
Cochrane isn't troubled, though:
The example is gorgeous, because all the production function parameters drop out. Usually you have to calibrate things like the parameter α [the production function exponent] and then argue about that.
The thing is that in this model, you should be at least a bit troubled [1]. The corporate tax base is equal to the marginal productivity of capital df/dk (based on the production function f(k)) multiplied by capital k i.e. k f'(k). Somehow the effect on wages of a corporate tax cut doesn't depend on how the corporate tax base is created?
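To be concrete about the result being discussed, here is my reconstruction of the calculation (a small open economy with a fixed after-tax return r, so (1 − τ) f'(k) = r, capital income k f'(k) as the corporate tax base, and wages w equal to the rest of output):

$$
\begin{align}
f'(k) = & \frac{r}{1 - \tau} \;\; \Rightarrow \;\; f''(k) \frac{dk}{d\tau} = \frac{r}{(1 - \tau)^{2}}\\
w = & f(k) - k f'(k) \;\; \Rightarrow \;\; \frac{dw}{d\tau} = -k f''(k) \frac{dk}{d\tau} = -\frac{k r}{(1 - \tau)^{2}}\\
\left. \frac{dR}{d\tau} \right|_{k\,\text{fixed}} = & k f'(k) = \frac{k r}{1 - \tau}
\end{align}
$$

Dividing the two derivatives gives dw/dR = −1/(1 − τ): each dollar of statically scored revenue given up raises wages by more than a dollar, and the production function has dropped out, which is the feature Mankiw finds puzzling and Cochrane finds gorgeous.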

But let's take this result at face value. So now we have a largely model-independent finding that to first order the effect of corporate tax cuts is increased wages. The scientific thing to do is not to continue arguing about the model, but to in fact compare the result to data. What should we expect? We should see a large change in aggregate wages when there are changes in corporate tax rates — in either direction. Therefore the corporate tax increases in the 1993 tax law should have led to falling wages, and the big cut in corporate tax rates in the 80s should have led to an even larger increase in wages. However, we see almost no sign of any big effects in the wage data:


The only large positive effect on wages seems to have come in the 70s during the demographic shift of women entering the workforce, and the only large negative effect is associated with the Great Recession. Every other fluctuation appears transient.

Now you may say: hey, there are lots of other factors at play, so you might not see the effect in wage data. This is the classic "chameleon model" of Paul Pfleiderer: we trust the model enough to say it leads to big wage increases, but when they don't appear in the data we turn around and say it's just a toy model.

The bigger issue, however, is that because this is a model-independent finding at first order, we should see a large signal in the data. Any signal that is buried in noisy data or swamped by other effects is obviously not a model-independent finding at first order, but rather a model-dependent finding at sub-leading order.

This is where Cochrane and Mankiw are failing to be scientists. They're not "leaning over backwards" to check this result against various possibilities. They're not exhibiting "utter honesty". Could you imagine either Cochrane or Mankiw blogging about this if the result had come out the other way (i.e. zero or negative effect on wages)? It seems publication probability is quite dependent on the answer. Additionally, neither addresses [2] the blatant fact that both are pro-business Republicans (Mankiw served in a Republican administration, Cochrane is part of the Hoover Institution), and that the result they came up with is remarkably good public relations for corporate tax cuts [3]. Cochrane is exhibiting an almost comical level of projection when he calls out liberal economists for being biased [4].

But the responses of DeLong [5] and Krugman are also unscientific: focusing on the mathematics and models instead of incorporating the broader evidence and comparing the result to data. They are providing some of the leaning over backwards that Cochrane and Mankiw should be engaged in, but overall are accepting the model put forward at face value despite it lacking any demonstrated empirical validity. In a sense, the first response should be that the model hasn't been empirically validated and so represents a mathematical flight of fancy. Instead they engage in Mankiw's and Cochrane's version of Freddy Krueger's dreamworld of the neoclassical growth model.

And this is the problem with economics — because what if Mankiw's and Cochrane's derivations and definitions of "static" analysis were mathematically and semantically correct? Would they just say "I guess you're right — corporate tax cuts do raise wages"? Probably not. They'd probably argue on some other tack, much like how Cochrane and Mankiw would argue on a different tack (in fact, probably every possible tack). This is what happens when models aren't compared to data and aren't rejected when the results are shown to be at best inconclusive.

Data is the great equalizer in science as far as models go. Without data, it's all just a bunch of mansplaining.

...

Update 10 Oct 2017: See John Cochrane's response below, as well as my reply. I also added some links I forgot to include and corrected a couple typos.

...

Footnotes:

[1] In physics, you sometimes do obtain this kind of result, but the reason is usually topological (e.g. Berry phase, which was a fun experiment I did as an undergraduate) or due to universality.

[2] I freely admit I am effectively a Marxist at this point in my life, so I would likely be biased against corporate tax cuts being good for labor. However my argument above leaves open the possibility that corporate tax cuts do lead to higher wages, just not at leading order in a model-independent way.

[3] It's actually odd that corporations would push for corporate tax cuts if their leading effect was to raise wages (and not e.g. increase payouts to shareholders), all the while pushing against minimum wage increases.

[4] In fact, DeLong and Krugman are usually among the first to question "too good to be true" economic results from the left (even acquiring a reputation as "neoliberal shills" for it).

[5] At least DeLong points out that Mankiw should be troubled by the lack of dependence of the result on the production function.

Thursday, October 26, 2017

Investment

Dietz Vollrath examines a paper by German Gutierrez and Thomas Philippon about "Investment-less Growth", asking "Where did all the investment go?" The question that's actually being asked (since FPI and even RFPI are in fact at all time highs) is why investment is so low relative to profits. In the end, Gutierrez and Philippon connect it at least in part to a complex function of market power (something I've discussed before).

The question I'd like to ask is what baseline should we be looking at? Investment is very volatile (fluctuating strongly through the business cycle), but there appears to be an underlying dynamic equilibrium — much like what happens in the previous couple of posts on real growth and wage growth:


Fixed private investment (FPI) is deflated using the GDP deflator like in the real growth post.

In this framing, we seem to have exactly the same picture we have with NGDP, with two major shocks. We could potentially describe both shocks in terms of the same demographic effects: women entering the workforce and Baby Boomers leaving/retiring after the Great Recession:


In fact, to a good approximation NGDP ~ FPI (a fact that I use in the information equilibrium version of the IS-LM model):


The above graph shows the FPI model scaled to match NGDP. The transitions (vertical lines) happen in roughly the same place (FPI's transitions are much narrower, however).

So is investment really a different metric from NGDP? Are the reasons FPI is what it is today different from the reasons NGDP is what it is today? Or is investment just given by something proportional to NGDP with an exaggerated business cycle? This doesn't conclusively answer the question, but it does act as a bit of an Occam's Razor: G&P's paper is quite a robust piece of work at 94 pages, but "long run investment is proportional to GDP" captures the data more accurately with many fewer parameters (and no assumptions about e.g. firm behavior).

Tuesday, October 24, 2017

Wage growth

Sandy Black:
Nice discussion of why, since the 1970s, hourly inflation-adjusted wages of the typical worker have grown only 0.2% per year.
The linked article does "discuss" stagnant wages, but doesn't really say why. Their description of what makes wages grow is in fact just a series of mathematical definitions written as prose:
For wages to grow on a sustained basis, workers’ productivity must rise, meaning they must steadily produce more per hour, often with the help of new technology or capital. Further, workers must receive a consistent share of those productivity gains, rather than seeing their share decline. Finally, for the typical worker to see a raise, it is important that workers’ gains are spread across the income distribution.
The first part is a description of the Solow growth model with constant returns to scale, and a fixed exponent. The last bit is just tautological: a typical worker sees gains if and only if those gains are shared across the income distribution. They might as well have said that for a typical worker to see a raise, it is important that the typical worker sees a raise.
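Spelling out that first part (the usual constant-returns production function with a fixed exponent α and competitive factor pricing):

$$
Y = A K^{\alpha} L^{1 - \alpha} \;\; \Rightarrow \;\; w = \frac{\partial Y}{\partial L} = (1 - \alpha) \frac{Y}{L}
$$

With α fixed, the labor share is constant and the real wage moves one-for-one with output per worker, which is just the quote's productivity and labor share conditions restated as equations.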

But then the article just whiffs on any substantive explanation, telling us what the decline is "plausibly due to" and that "[a]ssigning relative responsibility to the policies and economic forces that underlie rising inequality or declining labor share is a challenge." It talks about "productivity" and "dynamism", which quantify the issues we are seeing more than they explain them. While it is generally useful to quantify things, one must avoid creating measures of phlogiston or attributing causality to quantities you defined [2].

Anyway, I looked at hourly wages a couple months ago with the dynamic equilibrium model and found that the dynamic equilibrium growth rate was about 2.3% — therefore with 2% inflation, the real growth rate would be about 0.3%:


In this picture, wages going forward will continue to be "stagnant" because that is their normal state. The high wage growth of the past was closely linked with the demographic transition.

But this got me interested in the different possible ways to frame the data. Because we don't really have much equilibrium data (most of the post-war period is dominated by a major demographic shift), there's a bit of ambiguity. In particular, I decided to look at wages per employed person. I will deflate using the GDP deflator later (following this post, except with NGDP exchanged for W/L), but first look at these two possible dynamic information equilibria:


These show the data with a given dynamic equilibrium growth rate subtracted. One sees two transitions: women entering the workforce and baby boomers leaving it after the Great Recession (growth rate = 3.6%). The other sees just the single demographic transition (growth rate = 2.0%). These result in two different equilibria when deflated with the GDP deflator — dynamic equilibrium growth rates of 2.2% and 0.6%, respectively:


We can see that the 2.2% dynamic equilibrium is a better model:


The two models give us two different views of the future. In one, wages are at their equilibrium and will only grow slowly at about 0.6%/y in the future (unless e.g. another demographic shock hits). In the other (IMHO, better) model, wage growth will increase in the near future from about 1%/y to 2.2%/y. However, both models point to the ending of the demographic transition (and the "Phillips curve era") in the 90s as a key component of why today is different from the 1970s; therefore (along with the other model above) we can take that conclusion to be more robust.

As for future wage growth? There isn't enough data to paint a definitive picture. Maybe wage growth will have to rely on asset bubbles (the first model at the top of this post)? Maybe wage growth will continue to stagnate? Maybe wage growth will happen after we leave this period of Baby Boomer retirements?

My own intuition says a combination of 1 (because asset bubbles show up in both the hourly wage and W/L models [1]) and 3 (because the model implying it is the better overall model of W/L).

Update

Here's a zoom-in on the W/L model for tracking forecast performance:


...

Footnotes:

[1] In fact, you can see the asset bubbles affecting W/L on the lower edge of the graph shown above and again here — there are two bumps associated with the dot-com and housing bubbles:


[2] Added in update: I wanted to expound on this a bit more. The issue is that when you define things, you have a tendency to look for things that show an effect, creating a kind of selection bias. "Dynamism" becomes important because you look at falling wages and look for other measures that are falling, and lump them under a new concept you call "dynamism". This is similar to an issue I pointed out some time ago that I'll call "R² = 0.7 disease". If you were to design an index you wanted to call "dynamism" that combined various measures together, you might end up including or leaving out things that correlate with some observable (here: wages) depending on whether or not you thought they improved the correlation. Nearly all economic variables are correlated with the business cycle or are exponentially growing quantities, so you usually start with some high R², and this process seems to stop before your R² gets too low. I seem to see a lot of graphs of indices out there with correlations on the order of 0.8 (resulting in an R² of about 0.6-0.7):


The issue with defining factors like productivity or dynamism is similar to the "R² = 0.7 disease": since you defined it, it's probably going to have a strong correlation with whatever it is you're trying to explain.
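A quick illustration of how easy it is to start with a high R² (two unrelated exponentially growing series, with made-up drift and noise parameters):

import numpy as np

rng = np.random.default_rng(1)
n = 200   # think of these as quarters

# Two independent exponentially growing series with noisy growth rates
x = np.exp(np.cumsum(rng.normal(0.02, 0.01, n)))
y = np.exp(np.cumsum(rng.normal(0.03, 0.01, n)))

corr = np.corrcoef(np.log(x), np.log(y))[0, 1]
print("correlation:", round(corr, 2), " R^2:", round(corr ** 2, 2))   # typically 0.9 or higher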