Tuesday, September 19, 2017

Information, real, nominal, and Solow


John Handley has the same issue that anyone with a critical eye eventually has with the Solow model's standard production function and total factor productivity: it doesn't make sense once you start to compare countries or think about what the numbers actually mean.

My feeling is that the answer to this in economic academia is a combination of "it's a simple model, so it's not supposed to stand up to scrutiny" and "it works well enough for an economic model". Dietz Vollrath essentially makes the latter point (i.e. the Kaldor facts, which can be used to define the Solow production function, aren't rejected) in his pair of posts on the balanced growth path that I discuss in this post.

I think the Solow production function represents an excellent example of how biased thinking can lead you down the wrong path; I will attempt to illustrate the implicit assumptions about production that go into its formulation. This thinking leads to the invention of "total factor productivity" to account for the fact that the straitjacket we applied to the production function (for the purpose of explaining growth, by the way) makes it unable to explain growth.

Let's start with the last constraint applied to the Cobb-Douglas production function: constant returns to scale. Solow doesn't really explain it so much as assert it in his paper:
Output is to be understood as net output after making good the depreciation of capital. About production all we will say at the moment is that it shows constant returns to scale. Hence the production function is homogeneous of first degree. This amounts to assuming that there is no scarce nonaugmentable resource like land. Constant returns to scale seems the natural assumption to make in a theory of growth.
Solow (1956)
But constant returns to scale is frequently justified by "replication arguments": if you double the factory machines (capital) and the people working them (labor), you double output.

Already there's a bit of a 19th century mindset going on here: constant returns to scale might be true to a decent approximation for drilling holes in pieces of wood with drill presses. But it is not obviously true of computers and employees: after a certain point, you need an IT department to handle network traffic, bulk data storage, and software interactions.

But another reason we think constant returns to scale is a good assumption involves a bit of competition: firms that have decreasing returns to scale must be doing something wrong (e.g. poor management) and will therefore lose out in competition. Increasing returns to scale is a kind of "free lunch" that also shouldn't happen.

Underlying this, however, is that Solow's Cobb-Douglas production function is thought of in real terms: actual people drilling actual holes in actual pieces of wood with actual drill presses. But capital is bought, labor is paid, and output is sold with nominal dollars. While some firms might adjust for inflation in their forecasts, I am certain that not every firm makes production decisions in real terms. In a sense, in order to have a production function in terms of real quantities, we must also have rational agents computing the real value of labor and capital, adjusting for inflation.

The striking thing is that if we instead think in nominal terms, the argument about constant returns to scale falls apart. If you double the nominal value of capital (going from 1 million dollars worth of drill presses to 2 million) and the labor force, there is no particular reason that nominal output has to double.

Going from 1 million dollars worth of drill presses to 2 million dollars at constant prices means doubling the number of drill presses. If done in a short enough period of time, inflation doesn't matter. In that case, there is no difference in the replication argument using real or nominal quantities. But here's where we see another implicit assumption required to keep constant returns to scale — there is a difference over a long period of time. Buying 1000 drill presses in 1957 is very different from buying 1000 drill presses in 2017, so doubling them over 60 years is different. But that brings us back to constant returns to scale: constant returns to scale tells us that doubling in 1957 and doubling over 60 years both result in doubled output.

That's where part of the "productivity" is supposed to come in: 1957 vintage drill presses aren't as productive as 2017 drill presses given the same labor input. Therefore we account for these "technological microfoundations" (that seem to be more about engineering than economics) in terms of a growing factor productivity. What comes next requires a bit of math. Let's log-linearize and assume everything grows approximately exponentially (with growth rates $\eta$, $\lambda$, and $\kappa$). Constant returns to scale with constant productivity tells us:

$$
\eta = (1-\alpha) \lambda + \alpha \kappa
$$
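For reference, this is just the time derivative of the log of the Cobb-Douglas form $Y = A L^{1-\alpha} K^{\alpha}$ with constant $A$:

$$
\frac{d}{dt} \log Y = (1-\alpha) \frac{d}{dt} \log L + \alpha \frac{d}{dt} \log K \;\; \Rightarrow \;\; \eta = (1-\alpha) \lambda + \alpha \kappa
$$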

Let's add an exponentially growing factor productivity to capital:

$$
\eta = (1-\alpha) \lambda + \alpha (\kappa + \pi_{\kappa})
$$

Now let's re-arrange:

$$
\begin{align}
\eta = & (1-\alpha) \lambda + \alpha \kappa (1 + \pi_{\kappa}/\kappa)\\
= & (1-\alpha) \lambda + \alpha' \kappa
\end{align}
$$

Note that $(1 - \alpha) + \alpha' \neq 1$. We've now escaped the straitjacket of constant returns to scale by adding a factor productivity, which we only had to add in order to fit the data because of our assumption of constant returns to scale in the first place. Since we've given up on constant returns to scale once productivity is included, why not just add total factor productivity back in:

$$
\eta = \pi_{TFP} + (1-\alpha) \lambda + \alpha \kappa
$$

Let's arbitrarily partition TFP between labor and capital (in half here, but the split doesn't matter). Analogous to the capital productivity case, we obtain:

$$
\eta = \beta' \lambda + \alpha' \kappa
$$

with

$$
\begin{align}
\beta' = & (1 - \alpha) + \pi_{TFP}/(2 \lambda)\\
\alpha' = & \alpha + \pi_{TFP}/(2 \kappa)
\end{align}
$$

In fact, if we ignore constant returns to scale and allow nominal quantities (because you're no longer talking about the constant returns to scale of real quantities) you actually get a pretty good fit with constant total factor productivity [1]:



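The actual fit was done in Mathematica (see footnote [1] for the code). As a rough illustration of the idea, here is a minimal Python sketch using toy exponential series in place of the actual nominal data: regress log nominal output on log nominal labor and capital with the exponents left free, so the returns to scale are whatever the data says.

```python
import numpy as np

# Toy stand-ins for nominal output (N), the nominal wage bill (W), and the
# nominal capital stock (K). These are made-up exponential series, not data;
# the point is only the regression log N = c + beta log W + alpha log K with
# alpha + beta left free (no constant-returns-to-scale constraint).
rng = np.random.default_rng(0)
t = np.arange(57)  # "years"
W = 1.0e3 * np.exp(0.060 * t + 0.05 * rng.normal(size=t.size))
K = 3.0e3 * np.exp(0.055 * t + 0.05 * rng.normal(size=t.size))
N = 2.0 * W**0.55 * K**0.52 * np.exp(0.02 * rng.normal(size=t.size))

X = np.column_stack([np.ones(t.size), np.log(W), np.log(K)])
(c, beta, alpha), *_ = np.linalg.lstsq(X, np.log(N), rcond=None)
print(f"constant 'TFP' term exp(c) = {np.exp(c):.3f}")
print(f"labor exponent beta = {beta:.3f}, capital exponent alpha = {alpha:.3f}")
print(f"returns to scale alpha + beta = {alpha + beta:.3f}")  # need not be 1
```

With the constant-returns constraint removed, whether $\alpha + \beta$ comes out near one is an empirical question rather than an assumption.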
Now I didn't just arrive at this because I carefully thought out the argument against using constant returns to scale and real quantities. Before I changed my paradigm, I was as insistent on using real quantities as any economist.

What changed was that I tried looking at economics using information theory. In that framework, economic forces are about communicating signals and matching supply and demand in transactions. People do not buy things in real dollars, but in nominal dollars. Therefore those signals are going to be in terms of nominal quantities.

And this brings us to our final assumption underlying the Solow model: that the nominal price and the real value of a good are separable. Transforming nominal variables into real ones by dividing by the price level is taken for granted. It's true that you can always mathematically assert:

$$
N = PY
$$

But if $N$ represents a quantity of information about the goods and services consumed in a particular year, and $Y$ is supposed to represent effectively that same information (since nominal and real don't matter at a single point in time), can we really just divide $N$ by $P$? Can I separate the five dollars from the cup of coffee in a five dollar cup of coffee? The transaction events of purchasing coffee happen in units of five dollar cups of coffee. At another time, they happened in one dollar cups of coffee. But asserting that the "five dollar" and the "one dollar" can be removed such that we can just talk about cups of coffee (or rather "one 1980 dollar cups of coffee") is saying something about where the information resides: using real quantities tells us it's in the cups of coffee, not in the five dollars or in the holistic transaction of a five dollar cup of coffee.

Underlying this is an assumption about what inflation is: if the nominal price is separable, then inflation is just the growth of a generic scale factor. Coffee costs five dollars today rather than the one dollar it cost in 1980 because prices rose by about a factor of five. And those prices rise for some reason unrelated to cups of coffee (perhaps monetary policy). This might make some sense for an individual good like a cup of coffee, but it is nonsense for an entire economy. GDP is 18 trillion dollars today rather than 3 trillion in 1980 because while the economy grew by a factor of 2, prices grew by a factor of 3 for reasons unrelated to the economy growing by a factor of 2 or changes in the goods actually produced?

To put this mathematically, when we assume the price is separable, we assume that we don't have a scenario where [2]

$$
N = P(Y) Y(P)
$$

because in that case, the separation doesn't make any sense.

One thing I'd like to emphasize is that I'm not saying these assumptions are wrong, but rather that they are assumptions — assumptions that didn't have to be made, or that didn't have to be made in that particular way.

The end result of all these assumptions — assumptions about rational agents, about the nature of inflation, about where the information resides in a transaction, about constant returns to scale — led us down a path towards a production function where we have to invent a new mysterious fudge factor we call total factor productivity in order to match data. And it's a fudge factor that essentially undoes all the work being done by those assumptions. And it's for no reason because the Cobb-Douglas production function, which originally didn't have a varying TFP by the way, does a fine job with nominal quantities, increasing returns to scale, and a constant TFP as shown above.

It's one of the more maddening things I've encountered in my time as a dilettante in economic theory. Incidentally, this all started when Handley asked me on Twitter what the information equilibrium approach had to say about growth economics. Solow represents a starting point of growth economics, so I feel a bit like Danny in The Shining approaching the rest of the field:


...

Update #1: 20 September 2017

John Handley has a response:
Namely, [Smith] questions the attachment to constant returns to scale in the Solow model, which made me realize (or at least clarified my thinking about the fact that) growth theory is really all about increasing returns to scale.
That's partially why it is so maddening to me. Contra Solow's claim that constant returns to scale is "the natural assumption", it is in fact the most unnatural assumption to make in a theory of economic growth.

Update #2: 20 September 2017

Sri brings up a great point on Twitter from the history of economics — that this post touches on the so-called "index number problem" and the "Cambridge capital controversy". I actually have posts on resolving both using information equilibrium (INP and CCC, with the latter being a more humorous take). However, this post intended to communicate that the INP is irrelevant to growth economics in terms of nominal quantities, and that empirically there doesn't seem to be anything wrong with adding up capital in terms of nominal value. The CCC was about adding real capital (i.e. espresso machines and drill presses) which is precisely Joan Robinson's "adding apples and oranges" problem. However, using nominal value of capital renders this debate moot as it becomes a modelling choice and shifts the "burden of proof" to comparison with empirical data.

Much like how the assumptions behind the production function lead you down the path of inventing a "total factor productivity" fudge factor because the model doesn't agree with data on its own, they lead you to additional theoretical problems such as the index number problem and Cambridge capital controversy.

...

Footnotes:

[1] Model details and the Mathematica code can be found on GitHub in my informationequilibrium repository.

[2] Funny enough, the information equilibrium approach does mix these quantities in a particular way. We'd say:

$$
\begin{align}
N = & \frac{1}{k} P Y\\
= & \frac{1}{k} \frac{dN}{dY} Y
\end{align}
$$

or, in the notation above, $N = P(Y) Y$.
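Spelling that out: the information equilibrium condition $dN/dY = k \, N/Y$ integrates to a power law, so the abstract price $P = dN/dY$ is itself a function of $Y$:

$$
N = N_{0} \left( \frac{Y}{Y_{0}} \right)^{k} \;\; \Rightarrow \;\; P = \frac{dN}{dY} = k \, \frac{N_{0}}{Y_{0}} \left( \frac{Y}{Y_{0}} \right)^{k-1}
$$

which is the sense in which the nominal level and the real level do not separate.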

Monday, September 18, 2017

Ideal and non-ideal information transfer and demand curves

I created an animation to show how important the assumption of fully exploring the state space (ideal information transfer i.e. information equilibrium) is to "emergent" supply and demand. In the first case, we satisfy the ideal information transfer requirement that agent samples from the state space accurately reproduce that state space as the number of samples becomes large:


This is essentially what I described in this post, but now with an animation. However, if agents "bunch up" in one part of state space (non-ideal information transfer), then you don't get a demand curve:




Sunday, September 17, 2017

Marking my S&P 500 forecast to market

Here's an update on how the S&P 500 forecast is doing (progressively zooming out in time):




Before people say that I'm just validating a log-linear forecast, it helps to understand that the dynamic equilibrium model says not just that in the absence of shocks the path of a "price" will be log-linear, but also that it will have the same log-linear slope before and after those shocks. A general log-linear stochastic projection has two parameters (a slope and a level) [1], while the dynamic equilibrium model has one. This is the same as saying the data will have a characteristic "stair-step" appearance [2] after a log-linear transformation (taking the log and subtracting a line of constant slope).
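As a minimal sketch of that log-linear transformation (synthetic data below, not the actual S&P 500 series; the slope and shock parameters are made up):

```python
import numpy as np

# Synthetic "price" series: a log-linear trend with one logistic step (a
# "shock") plus noise. Subtracting a line with the equilibrium slope from
# the log should leave a stair step: flat before the shock, flat after.
rng = np.random.default_rng(1)
t = np.linspace(2010, 2017, 400)
slope = 0.08                                          # assumed log slope per year
step = 0.15 / (1.0 + np.exp(-(t - 2014.0) / 0.1))     # one logistic shock
log_p = 5.0 + slope * (t - 2010) + step + 0.01 * rng.normal(size=t.size)

transformed = log_p - slope * (t - 2010)              # subtract constant-slope line
print(transformed[:3].round(2), "...", transformed[-3:].round(2))  # ~5.0 ... ~5.15
```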

Footnotes:

[1] An ARIMA model will also have a scale that defines the rate of approach to that log-linear projection. More complex versions will also have scales that define the fluctuations.

[2] For the S&P 500, it looks like this (steps go up or down, and in fact exhibit a bit of self-similarity at different scales):


Friday, September 15, 2017

The long trend in energy consumption


I came across this from Steve Keen on physics and economics, where he says:
... both [neoclassical economists and Post Keynesians] ignore the shared weakness that their models of production imply that output can be produced without using energy—or that energy can be treated as just a form of capital. Both statements are categorically false according to the Laws of Thermodynamics, which ... cannot be broken.
He then quotes Eddington, ending with "But if your theory is found to be against the second law of thermodynamics I can give you no hope; there is nothing for it but to collapse in deepest humiliation." After this, Keen adds:
Arguably therefore, the production functions used in economic theory—whether spouted by mainstream Neoclassical or non-orthodox Post Keynesians—deserve to "collapse in deepest humiliation".
First, let me say that the second law of thermodynamics applies to isolated systems, which the Earth definitely isn't (it receives energy from the sun). But beyond this, I have no idea what Keen is trying to prove here. Regardless of what argument you present, a Cobb-Douglas function cannot be "false" according to thermodynamics because it is just that: a function. It's like saying the Riemann zeta function violates the laws of physics.

The Cobb-Douglas production functions also do not imply output can be produced without energy. The basic one from the Solow model implies labor and capital are inputs. Energy consumption is an implicit variable in both labor and capital. For example, your effective labor depends on people being able to get to work, using energy. But that depends on the distribution of firms and people (which in the US became very dependent on industrial and land use policy). That is to say

$$
Y = A(t) L(f_{1}(E, t), a, b, c ..., t)^{\alpha} K(f_{2}(E, t), x, y, z ..., t)^{\beta}
$$

where $f_{1}$ and $f_{2}$ are complex functions of energy and time. Keen is effectively saying something equivalent to: "Production functions ignore the strong and weak nuclear forces, and therefore violate the laws of physics. Therefore we should incorporate nucleosynthesis in our equations to determine what kinds of metals are available on Earth in what quantities." While technically true, the resulting model is going to be intractable.

The real question you should be asking is whether that Cobb-Douglas function fits the data. If it does, then energy must be included implicitly via $f_{1}$ and $f_{2}$. If it doesn't, then maybe you might want to consider questioning the Cobb-Douglas form itself? Maybe you can come up with a Cobb-Douglas function that includes energy as a factor of production. If that fits the data, then great! But in any of the cases, it wasn't because Cobb-Douglas production functions without explicit energy terms violate the laws of physics. They don't.

However, what I'm most interested in comes in at the end. Keen grabs a graph from this blog post by physicist Tom Murphy (who incidentally is a colleague of a friend of mine). Keen's point is that maybe the Solow residual $A(t)$ is actually energy, but he doesn't actually make the case except by showing Murphy's graph and waving his hands. I'm actually going to end up making a better case.

Now the issues with Murphy's post in terms of economics were pretty much handled already by Noah Smith. I'd just like to discuss the graph, as well as note that sometimes even physicists can get things wrong. Murphy fits a line to a log plot of energy consumption with a growth rate of 2.9%. I've put his line (red) along with the data (blue) on this graph:


Ignore any caveats about measuring energy consumption in the 17th century and take the data as given. Now immediately we can ask a question of Murphy's model: energy consumption goes up by 2.9% per year regardless of technology from the 1600s to the 2000s? The industrial revolution didn't have any effect?

As you can already see, I tried the dynamic equilibrium model out on this data (green), and achieved a pretty decent fit with four major shocks. They're centered at 1707.35, 1836.93, 1902.07, and 1959.69. My guess is that these shocks are associated with the industrial revolution, railroads, electrification, and the mass production of cars. YMMV. Let's zoom in on the recent data (which is likely better) [1]:


Instead of Murphy's 2.9% growth rate that ignores technology (and gets the data past 1980 wrong by a factor of 2), we have an equilibrium growth rate of 0.5% (which incidentally is about half the US population growth rate during this period). Instead of Murphy's e-folding time of about 33 years, we have an e-folding time of 200 years. Murphy uses a doubling time of 23 years in his post, which becomes 139 years. This pushes boiling the Earth in 400 years (well, at a 2.3% growth rate per Murphy) to about 1850 years.
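For reference, the conversions behind these numbers are just properties of exponential growth with rate $g$:

$$
\tau_{e} = \frac{1}{g}, \qquad \tau_{2} = \frac{\ln 2}{g}
$$

so $g = 0.5\%$ gives $\tau_{e} = 1/0.005 = 200$ years and $\tau_{2} = \ln 2 / 0.005 \approx 139$ years, and a total growth factor that takes 400 years to reach at 2.3% takes $400 \times 0.023/0.005 \approx 1840$ years at 0.5%.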

Now don't take this as some sort of defense of economic growth regardless of environmental impact (the real problem is global warming which is a big problem well before even that 400 year time scale), but rather as a case study where your conclusions depend on how you see the data. Murphy sees just exponential growth; I see technological changes that lead to shocks in energy consumption. The latter view is amenable to solutions (e.g. wind power) while Murphy thinks alternative energy isn't going to help [2]. 

Speaking of technology, what about Keen's claim that total factor productivity (Solow residual) might be energy? Well, I checked out some data from John Fernald at the SF Fed on productivity (unfortunately I couldn't find the source so I had to digitize the graph at the link) and ran it through the dynamic equilibrium model:


Interestingly, the shocks to energy consumption and the shock to TFP line up almost exactly (TFP in 1958.04, energy consumption in 1959.69). The sizes are different (the TFP shock is roughly 1/3 the size of the energy shock in relative terms), and the dynamic equilibria are different (a 0.8% growth rate for TFP). These two pieces of information mean that it is unlikely we can use energy as TFP. The energy shock is too big, but we could fix that by decreasing the Cobb-Douglas exponent of energy. However, we need to increase the Cobb-Douglas exponent in order to match the growth rates, making that already-too-big energy shock even bigger.


But the empirical match up between TFP and the energy shock in time is intriguing [3]. It also represents an infinitely better case for including energy in production functions than Keen's argument that they violate the laws of thermodynamics.

...

Footnotes:

[1] Here is the derivative (Murphy's 2.9% in red, the dynamic equilibrium in gray and the model with the shock in green):


[2] Murphy actually says his conclusion is "independent of technology", but that's only true in the worst sense that his conclusion completely ignores technology. If you include technology (i.e. those shocks in the dynamic equilibrium), the estimate of the equilibrium growth rate falls considerably.


[3] It's not really that intriguing because I'm not sure TFP is really a thing. I've shown that if you look at nominal quantities, Solow's Cobb-Douglas production function is an excellent match to data with a constant TFP. There's no TFP function to explain.

Thursday, September 14, 2017

The long trend in labor hours

Branko Milanovic posted a chart on Twitter of average annual hours worked showing, among other things, that people worked twice as many hours during the industrial revolution:


From 1816 to 1851, the number of hours fell by about 0.14% per year:

100 (Log[3185] − Log[3343])/(1851-1816) = − 0.138

This graph made me check out the data on FRED for average working hours (per week, indexed to 2009 = 100). In fact, I checked it out with the dynamic equilibrium model:


Any guess what the dynamic equilibrium rate of decrease is? It's − 0.132% — almost the same rate as in the 1800s! There was a brief period of increased decline (a shock to weekly labor hours centered at 1973.4) that roughly coincides with women entering the workforce and the inflationary period of the 70s (that might be all part of the same effect).

Tuesday, September 12, 2017

Bitcoin update

I wanted to compare the bitcoin forecast to the latest data even though I updated it only last week, since according to the model it should move fast (a logarithmic decline of -2.6/y). Even including the "news shock" blip (basically noise) from Jamie Dimon's comments today, the path is on the forecast track:


This is a "conditional" forecast -- it depends on whether there is a major shock (positive or negative, but nearly all have been positive so far for bitcoin).

...

Update 18 September 2017

Over the past week (possibly due to Dimon's comments), bitcoin took a dive. It subsequently recovered to the dynamic equilibrium trend:


JOLTS leading indicators update

New JOLTS data is out today. In a post from a couple of months ago, I noted that the hires data was a bit of a leading indicator for the 2008 recession and so decided to test that hypothesis by tracking it and looking for indications of a recession (i.e. a strong deviation from the model forecast requiring an additional shock — a recession — to explain).

Here is the update with the latest data (black):


This was a bit of mean reversion, but there remains a correlated negative deviation. In fact, most of the data since 2016 (in gray), and all of the data since the forecast was made (in black), has been below the model:


That histogram shows the deviations of the data from the model. However, we still don't see any clear sign of a recession — consistent with the recession detection algorithm based on unemployment data.

Monday, September 11, 2017

Search and matching II: theory


I think Claudia Sahm's comments on the approach to models of unemployment illustrate the issue with economics that I describe in Part I. Here are a few of Sahm's tweets:
I was "treated" to over a dozen paper pitches that tweaked a Mortensen-Pissarides [MP] labor search model in different ways, this isn't new ... but it is what science looks like, I appreciate broad summary papers and popular writing that boosts the work but this is a sloooow process ... [to be honest], I've never been blown away by the realism in search models, but our benchmark of voluntary/taste-based unemployment is just weird
Is the benchmark approach successful? If it is, then the lack of realism should make you question what is realistic and therefore the lack of realism of MP shouldn't matter. If it isn't, then why is it the benchmark approach? An unrealistic and unsuccessful approach should have been dropped long ago. Shouldn't you be fine with any other attempt at understanding regardless of realism?

This discussion was started by Noah Smith's blog post on search and matching models of unemployment, which are generally known as Mortensen-Pissarides models (MP). He thinks that they are part of a move towards realism:
Basically, Ljungqvist and Sargent are trying to solve the Shimer Puzzle - the fact that in classic labor search models of the business cycle, productivity shocks aren't big enough to generate the kind of employment fluctuations we see in actual business cycles. ... Labor search-and-matching models still have plenty of unrealistic elements, but they're fundamentally a step in the direction of realism. For one thing, they were made by economists imagining the actual process of workers looking for jobs and companies looking for employees. That's a kind of realism.
One of the unrealistic assumptions MP makes is the assumption of a steady state. Here's Mortensen-Pissarides (1994) [pdf]:
The analysis that follows derives the initial impact of parameter changes on each conditional on current unemployment, u. Obviously, unemployment eventually adjusts to equate the two in steady state. ... the decrease in unemployment induces a fall in job creation (to maintain v/u constant v has to fall when u falls) and an increase in job destruction, until there is convergence to a new steady state, or until there is a new cyclical shock. ... Job creation also rises in this case and eventually there is convergence to a new steady state, where although there is again ambiguity about the final direction of change in unemployment, it is almost certainly the case that unemployment falls towards its new steady-state value after its initial rise.
It's entirely possible that an empirically successful model has a steady state (the information equilibrium model described below asymptotically approaches one at u = 0%), but a quick look at the data (even data available in 1994) shows us this is an unrealistic assumption:


Is there a different steady state after every recession? Yet this particular assumption is never mentioned (in any of the commentary); instead, lack of heterogeneity is the go-to example of unrealism, e.g. from Stephen Williamson:
Typical Mortensen-Pissarides is hardly realistic. Where do I see the matching function in reality? Isn't there heterogeneity across workers, and across firms in reality? Where are the banks? Where are the cows? It's good that you like search models, but that can't be because they're "realistic."
Even Roger Farmer only goes so far as to say there are multiple steady states (multiple equilibria, using a similar competitive search [pdf] and matching framework). Again, it may well be that an empirically successful model will have one or more steady states, but if we're decrying the lack of realism of the assumptions shouldn't we decry this one?

I am going to construct the information equilibrium approach to unemployment trying to be explicit and realistic about my assumptions. But a key point I want to make here is that "realistic" doesn't necessarily mean "explicitly represent the actions of agents" (Noah Smith's "pool player's arm"), but rather "realistic about our ignorance".

What follows is essentially a repeat of this post, but with more discussion of methodology along the way.

Let's start with an assumption that the information required to specify the economic state in terms of the number of hires ($H$) is equivalent to the information required to specify the economic state in terms of the number of job vacancies ($V$) when that economy is in equilibrium. Effectively we are saying there are, for example, two vacancies for every hire (more technically we are saying that the information associated with the probability of two vacancies is equal to the information associated with the probability of a single hire). Some math lets us then say:

$$
\text{(1) } \; \frac{dH}{dV} = k_{v} \; \frac{H}{V}
$$

This is completely general in the sense that we don't really need to understand the details of the underlying process, only that the agents fully explore every state in the available state space. We're assuming ignorance here and putting forward the most general definition of equilibrium consistent with conservation of information and scale invariance (i.e. doubling $H$ and $V$ gives us the same result).
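Integrating equation (1) by separating variables gives the general solution used in the next step:

$$
\frac{dH}{H} = k_{v} \; \frac{dV}{V} \;\; \Rightarrow \;\; H = H_{0} \left( \frac{V}{V_{0}} \right)^{k_{v}}
$$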

If we have a population that grows a few percent a year, we can say our variables are proportional to an exponential function. If $V \sim \exp v t$ with growth rate $v$, then according to equation (1) above $H \sim \exp k_{v} v t$ and (in equilibrium):

$$
\frac{d}{dt} \log \frac{H}{V} \simeq (k_{v} - 1) v
$$

When the economy is in equilibrium (a scope condition), we should have lines of constant slope if we plot the time series $\log H/V$. And we do:


This is what I've called a "dynamic equilibrium". The steady state of an economy is not e.g. some constant rate of hires, but rather more general — a constant decrease (or increase) in $H/V$. However the available data does not appear to be equilibrium data alone. If the agents fail to fully explore the state space (for example, firms have correlated behavior where they all reduce the number of hires simultaneously), we will get violations of the constant slope we have in equilibrium.
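As a minimal sketch of what such a fit looks like in practice (synthetic data below standing in for the monthly JOLTS hires and openings series; a single logistic shock is assumed):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(t, a, b, amp, t0, width):
    # constant slope (dynamic equilibrium) plus one logistic step (shock)
    return a + b * t + amp / (1.0 + np.exp(-(t - t0) / width))

# Synthetic log(H/V) series: equilibrium slope plus a shock around t ~ 7.8
# (years since 2001, i.e. late 2008), with a little noise on top.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 16.0, 200)
log_hv = model(t, 1.0, -0.05, 0.6, 7.8, 0.4) + 0.02 * rng.normal(size=t.size)

p0 = [1.0, -0.05, 0.5, 8.0, 0.5]                      # rough initial guess
params, _ = curve_fit(model, t, log_hv, p0=p0)
print("equilibrium slope (k_v - 1) v ~", round(params[1], 3))
print("shock center (years since 2001) ~", round(params[3], 2))
```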

In general it would be completely ad hoc to declare "2001 and 2008 are different" were it not for other knowledge about the years 2001 and 2008: they are the years of the two recessions in the data. Since we don't know much about the shocks to the dynamic equilibrium, let's just say they are roughly Gaussian in time (start off small, reach a peak, and then fall off). Another way to put this is that if we subtract the log-linear slope of the dynamic equilibrium, the resulting data should appear to be logistic step functions:


The blue line shows the fit of the data to this model with 90% confidence intervals for the single prediction errors. Transforming this back to the original data representation gives us a decent model of $H/V$:


Now employment isn't just vacancies that are turned into hires: there have to be unemployed people to match with vacancies. Therefore there should also be a dynamic equilibrium using hires and unemployed people ($U$). And there is:


Actually, we can rewrite our information equilibrium condition for two "factors of production" and obtain the Cobb-Douglas form (solving a pair of partial differential equations like Eq. (1) above):

$$
H(U, V) = a U^{k_{u}} V^{k_{v}}
$$
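For completeness, the solution works the same way as the single-variable case: integrating $\partial H/\partial U = k_{u} H/U$ gives $H = f(V)\, U^{k_{u}}$ for some function $f(V)$, and substituting that into $\partial H/\partial V = k_{v} H/V$ gives $f(V) \propto V^{k_{v}}$, which is the Cobb-Douglas form above.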

This is effectively a matching function (not the same as in the MP paper, but discussed in e.g. Petrongolo and Pissarides (2001) [pdf]). We should be able to fit this function to the data, but it doesn't quite work:


Now this should work fine if the shocks to $H/U$ and shocks to $H/V$ are all the shocks. However it looks like we acquire a constant error coming through the 2008 recession:


Again, it makes sense to call the recession a disequilibrium process. This means there are additional shocks to the matching function itself in addition to the shocks to $H/U$ and $H/V$. We'd interpret this as a recession representing a combination of fewer job postings, fewer hires per available worker, as well as fewer matches given openings and available workers. We could probably work out some plausible psychological story (unemployed workers during a recession are seen as less desirable by firms, firms are reluctant to post jobs, and firms are more reluctant to hire for a given posting). You could probably write up a model where firms suddenly become more risk averse during a recession. However you'd need many, many recessions in order to begin to understand if recessions have any commonalities.

Another way to put this is that given the limited data, it is impossible to specify the underlying details of the dynamics of the matching function during a recession. During "normal times", the matching function is boring — it's fully specified by a couple of constants. All of your heterogeneous agent dynamics "integrate out" (i.e. aggregate) into a couple of numbers. The only data that is available to work out agent details is during recessions. The JOLTS data used above has only been recorded since 2000, leaving only about 30 monthly measurements taken during two recessions to figure out a detailed agent model. At best, you could expect about 3 parameters (which is in fact how many parameters are used in the Gaussian shocks) before over-fitting sets in.

But what about heterogeneous agents (as Stephen Williamson mentions in his comment on Noah Smith's blog post)? Well, the information equilibrium approach generalizes to ensembles of information equilibrium relationships such that we basically obtain

$$
H_{i}(U_{i}, V_{i}) = a U_{i}^{\langle k_{u} \rangle} V_{i}^{\langle k_{v} \rangle}
$$

where $\langle k \rangle$ represents an ensemble average over different firms. In fact, the $\langle k \rangle$ might change slowly over time (roughly the growth scale of the population, but it's logarithmic so it's a very slow process). The final matching function is just a product over different types of labor indexed by $i$:

$$
H(U, V) = \prod_{i} H_{i}(U_{i}, V_{i})
$$

Given that the information equilibrium model with a single type of labor for a single firm appears to explain the data about as well as it can be explained, adding heterogeneous agents to this problem serves only to reduce information criterion metrics for explanation of empirical data. That is to say Williamson's suggestion is worse than useless because it makes us dumber about the economy.

And this is a general issue I have with economists not "leaning over backward" to reject their intuitions. Because we are humans, we have a strong intuition that tells us our decision-making should have a strong effect on economic outcomes. We feel like we make economic decisions. I've participated in both sides of the interview process, and I strongly feel like the decisions I made contributed to whether the candidate was hired. They probably did! But millions of complex hiring decisions at the scale of the aggregate economy seem to just average out to roughly a constant rate of hiring per vacancy. Noah Smith, Claudia Sahm, and Stephen Williamson (as well as the vast majority of mainstream and heterodox economists) all seem to see "more realistic" being equivalent to "adding parameters". But if realism isn't being measured by accuracy and information content compared to empirical data, "more realistic" starts to mean "adding parameters for no reason". Sometimes "more realistic" should mean a more realistic understanding of our own limitations as theorists.

It is possible those additional parameters might help to explain microeconomic data (as Noah mentions in his post, this is "most of the available data"). However, since the macro model appears to be well described by only a few parameters, this implies a constraint on your micro model: if you aggregate it, all of the extra parameters should "integrate out" to a single parameter in equilibrium. If they do not, they represent observable degrees of freedom that are for some reason not observed. This need to agree with not just the empirical data but its information content (i.e. not have more parameters than are observable at the macro scale) represents "macrofoundations" for your micro model. But given the available macro data, leaning over backwards requires us to give up on models of unemployment and matching that have more than a couple parameters — even if they aren't rejected.

Sunday, September 10, 2017

Search and matching I: methodology


I read Noah Smith's recent blog post on search and matching theory and simultaneously had two ideas for posts of my own. So this will be a mini-series of two posts: a post on methodology and a post an example of that methodology. This is the first one.

I have on many occasions questioned the attacks on Milton Friedman's pool player analogy, including one of Noah Smith's. With his recent post, I think I understand his issue better, but I still think his concern is misplaced.

Milton Friedman's pool player analogy is essentially an early motivation for what physicists call "effective theory". Friedman was writing in the 50s and effective theory doesn't make a big appearance until the late 70s in physics (e.g. Weinberg's paper [pdf] on phenomenological Lagrangians in which he says quantum field theory has almost zero content [1]). Effective theory is probably one of the most useful methodologies in physics. The standard model is viewed as an effective field theory. It gets the data right, but we don't really believe electrons are fundamental and a fundamental scalar like the Higgs actually violates some core principles of physics (a fundamental scalar with a mass below the Planck mass is basically nonsense). The theory is wrong about the vacuum energy by over 100 orders of magnitude, but gets g-2 correct to more than 10 decimal places. We say it's an effective theory for measurements at scales well below the Planck scale (or even the GUT scale).

Now it's true that when Friedman said better models will have more unrealistic assumptions, that is just plain wrong. In effective theory, you should be agnostic towards the realism of the assumptions (as long as they don't violate symmetry principles, which economics doesn't have anyway). I also don't think Friedman actually followed his own rules or leaned over backwards to cast doubt or present his biases as a scientist should. But just because someone may have been wrong about a couple of things does not mean everything they've ever said is wrong and therefore we should take the opposite view.

Now I discussed this in terms of scales and scope before in the other post on Noah and the pool player analogy. But in his recent post, he gave me new insight when he says:
I personally think this is silly, because it ends up throwing away most of the available data that could be used to choose between models. Also, it seems unlikely that non-realistic models could generate realistic results. ... Figuring out how things actually work is a much more promising route than making up an imaginary way for them to work and hoping the macro data is too fuzzy to reject your overall results.
The pool player analogy is problematic for economics because economists approach their field more like Solow than Feynman. Before the heterodox Econ people get all self-righteous, let me say they don't get it either.

Feynman says you should be leaning over backwards to give evidence that your theory is wrong. If the best you can do is say "it isn't rejected", then you're leaning over backwards to give evidence your model is right. Basically, "it isn't rejected" means there's probably a vast literature about why your paper could be wrong, and every Econ paper that makes this claim should include a nod to that vast literature ― something like "of course this result is questionable, cf. all of economic research."

Basically, if "it isn't rejected" is all you can say, you shouldn't publish and you shouldn't be published. It's fine as a thesis project (I'd almost say inconclusive results demonstrate research aptitude ― the point of a PhD thesis ― even more than conclusive ones).

We also see Noah following Solow in assuming his own concept of realism (and the human bias that goes with it) is what the "true" theory of economics has chosen. He should be leaning over backwards to reject his own gut feelings about the realism of assumptions. What if macro is emergent and the assumptions about micro don't matter? Then you've placed yourself in a straitjacket of your own making because you had too much confidence in your own insights. Science is about doubting your intuition. Human intuition did not evolve to understand electrons or e-commerce. Why would you think your intuition about what is realistic should apply? Quantum mechanics is "unrealistic": it severely strains most people's intuition about realism. We accept it because it's freakishly accurate. Who are we with our ape brains that evolved to survive in Africa to question the "realism" of the fundamental laws of nature at scales smaller than the wavelength of visible light?

The main point here is that Milton Friedman's pool player analogy isn't about models with unrealistic assumptions that "aren't rejected"; it's about (or supposed to be about) models with unrealistic assumptions that get the data right ―  they forecast well and produce expected errors depending on the models' scope. However, I will cede to Noah that economic practice does seem to be incompatible with the pool player analogy. The analogy becomes dangerous when you switch out "gets the data right" with "isn't rejected" and forget to lean over backwards.

When Noah says:
It's good to see macroeconomists moving away from this counterproductive philosophy of science.
we should take "this counterproductive philosophy" to mean the lack of leaning over backwards to cast doubt and the failure to present your own biases. However, physics has successfully been using this philosophy, calling it effective theory, since the 1960s and 70s. If it's counterproductive in economics, it means there's something wrong with economics.

In part II, I will attempt to give an example of these issues in the approach to unemployment and search and matching theory. Update: now available.

...

Footnotes:

[1] From Weinberg (1979):
This remark is based on a "theorem", which as far as I know has never been proven, but which I cannot imagine could be wrong. The "theorem" says that although individual quantum field theories have of course a good deal of content, quantum field theory itself has no content beyond analyticity, unitarity, cluster decomposition, and symmetry. This can be put more precisely in the context of perturbation theory: if one writes down the most general possible Lagrangian, including all terms consistent with assumed symmetry principles, and then calculates matrix elements with this Lagrangian to any given order of perturbation theory, the result will simply be the most general possible S-matrix consistent with analyticity, perturbative unitarity, cluster decomposition and the assumed symmetry principles. As I said, this has not been proved, but any counterexamples would be of great interest, and I do not know of any. 

Thursday, September 7, 2017

Random agents: one tool in the toolbox


Cameron Murray directed me to two papers (links at Cameron's tweet) on so-called "zero intelligence" agents (i.e. random agents) that are relevant to the work I present on this blog. These papers, like Gary Becker's 1962 paper (which both cite), seem to be working within the traditional economic framework, judging by the terminology. I tweeted:
Still don't like the framing: "irrational", "zero-intelligence". I prefer thinking people are "inscrutably algorithmically complex".
Both "irrational" and "zero-intelligence" are judgmental comparisons to agents that are "rational" or "intelligent". As I said in my tweet, I prefer to think of myself (or any other economic theorist) as the one with "zero-intelligence", or at least as coming from a place of complete ignorance of human behavior.

That said, the first paper [pdf] "Zero Intelligence in Economics and Finance" by Dan Ladley is a great survey of the use of random agents from 1962 to 2009. David Glasner originally pointed me to Gary Becker's 1962 paper, and I used it extensively in my book to argue against trying to get inside the heads of agents to tell stories. Ladley similarly sees it as opening up the possibilities of economics without rationality:
The Zero Intelligence approach can trace its roots to prior to the advent of agent based computational economics. The Nobel Laureate Gary Becker employed this technique in an early paper (Becker, 1962) to analyse a model of markets in which participants behaved irrationally and in some areas at random. Using this model he found that features such as the downward slope of market demand curves and the upward slope of market supply curves typically associated with rational trader behaviour arose without any individual rationality. As a consequence he was able to deduce that these features were a result of the market mechanism that governed the interaction of the traders. In effect the market was creating system level rationality.
Although Becker managed to demonstrate certain results of market structure using this model, little further progress was made using this approach, as even without trader strategy the model is still very difficult to analyse. It was not until individual based computational simulation techniques became available that this research area made rapid progress.
Ladley's paper continues to discuss an experiment and simulation by Gode and Sunder from 1993 that is effectively what I did here (with John List's paper containing the human experiment) and here (in comparison with Vernon Smith's experiments) more than 20 years later.

The interesting finding (to me) of Gode and Sunder as relayed by Ladley is this:
Strikingly they found that markets populated by Zero Intelligence constrained traders behaved very similarly to those populated by human traders, both markets converged and achieved allocative efficiencies close to 99%. In contrast markets populated by Zero Intelligence Unconstrained traders did not converge and achieved efficiencies of approximately 90%.
The "constrained" traders were traders bounded inside the appropriate state space (opportunity set in Gary Becker's paper), whereas the unconstrained ones weren't. This means you can make a case that the "economic forces" at work are more properties of this state space than of the agents (which are just a mechanism to explore that state space, like an indicator dye or smoke in a wind tunnel). Again, this is a point I've made before and make more extensively in my book. (I'd also like to note that Wissner-Gross's causal entropy is another constraint on the state space that makes it look like random agents are "intelligent".)

I'd like to return to a point Ladley made in my first quote from his paper, however. He mentions how progress using random agents essentially stagnated until computational resources became more widely available, the models being "very difficult to analyse". While I have made much of random agents myself, my primary work here is on the information equilibrium framework. This framework can be motivated by random agents as one way to achieve maximum entropy, but information equilibrium is more about relationships between different state spaces — regardless of how they are explored. I wrote a brief post about this that has a good diagram:


Random agents (i.e. Maximum Entropy or MaxEnt in the figure above) are one way those state spaces are fully explored, but intelligent agents or other kinds of deterministic algorithms can also explore those state spaces (including causally random agents per Wissner-Gross). This includes inscrutably algorithmically complex agents as well.

This is to say that while random agents are a great tool, they're not the end-all and be-all of the information theory approach which aims to develop a calculus of state spaces (and per Fielitz and Borchardt, "provide shortcuts" to deal with complex systems).

*  *  *

The other paper Cameron sent me, by Doyne Farmer et al [pdf], also discusses Becker and Gode and Sunder:
Traditionally economics has devoted considerable effort to modeling the strategic behavior and expectations of agents. While no one would dispute that this is important, it has also been pointed out that some aspects of economics are independent of the agent model. For example, Becker showed that a budget constraint is sufficient to guarantee the proper slope of supply and demand curves, and Gode and Sunder demonstrated that if one replaces the students in a standard classroom economics experiment by zero-intelligence agents, price setting and other properties match better than one might expect. In this paper we show that this principle can be dramatically more powerful, and can make surprisingly accurate quantitative predictions.
This paper is more a specific application to financial markets. They find the state space they use for the price formation mechanism plays "a more important role than the strategic behavior of agents". However, while this is characterized as a simple model, after some digging I found the actual model description and it is far from simple. I believe simple here is relative (again) to a framing in terms of rational strategic agents.

I am trying to understand this paper in more detail. However at this point I am uncertain of the connection between the theory and the conclusions made in the paper. There is a lot of discussion of theory, but the results are mostly fitting some nebulously described functions to data. As they say:

The nondimensional coordinates dictated by the model are very useful for understanding the average market impact function. There are five parameters of the model and three fundamental dimensional quantities (shares, price, and time), leading to only two independent degrees of freedom.

But then the paper simply posits a log-linear function with two degrees of freedom: $\phi(\omega) = K \omega^{\beta}$. Why that one? I assume because the data looks log-linear. In fact, you can recover the basic results of their paper without much more than the information equilibrium condition, which tells us that $\beta$ can be anything between 0 and 1 (Farmer et al find $\beta \sim 0.25$ and cite Gabaix's theoretical calculation of $\beta = 0.5$). Equating orders with demand, the log price is:

$$
\log p = \log k + \log \frac{D_{0}}{S_{0}} + \frac{k-1}{k} \log \frac{D}{D_{0}}
$$

so that the log difference is (for a given order size $\omega \equiv \Delta D$)

$$
\log \phi = \frac{k-1}{k} \log \left( 1 + \frac{\omega }{D_{0}} \right)
$$

and we can identify the slope $\beta = (k - 1)/k$ in terms of the information transfer index $k$. Their finding of $\beta \sim 0.25$ corresponds to an information transfer index of $k \sim 1.3$. (Note that Gabaix's $\beta = 0.5$ means $k = 2$, which gives us the quantity theory of money in another context.) However, the information equilibrium version can allow much more to happen. For example, we can think of this as an ensemble average allowing $\langle k \rangle$ to change slowly over time (and allow e.g. different company stocks to have different values of $k$).
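The conversion used here is just the inversion of $\beta = (k-1)/k$:

$$
k = \frac{1}{1-\beta}
$$

so $\beta \simeq 0.25$ gives $k \simeq 1.3$ and $\beta = 0.5$ gives $k = 2$.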

In any case, I plan on looking into this a bit more.

Wednesday, September 6, 2017

Markups, productivity, and maximum entropy


There was a brief period several days ago where markups were all the rage in the econ blogs due to a new paper on market power by Jan De Loecker and Jan Eeckhout. Dietrich Vollrath documents it in his opening paragraph here. He also discusses the paper at length and I recommend his post.

This post is mostly just a few notes about interpreting this in terms of information equilibrium. Vollrath illustrates the basic formula (in terms of a single input/output, just add indices as necessary):

$$
\text{Markup} = \frac{\text{Revenue}}{MC \times \text{Inputs}}
$$

We can rewrite this in terms of an information equilibrium relationship where the marginal cost $MC$ is the abstract price in the "market for revenue" $R \rightleftarrows I$, with input $I$ and IT index $1/\mu$ (see here for definitions):


$$
\frac{\partial R}{\partial I} = \frac{1}{\mu} \; \frac{R}{I}
$$

Rearranging:

$$
\mu = \frac{R}{\frac{\partial R}{\partial I} \times I}
$$

So we can make the identification of $\mu$ with the markup. Now in the paper, they discover $\mu$ is changing. This is exactly what happens if we think of this equation as representing an ensemble average:

$$
\langle \mu \rangle = \frac{\langle R \rangle}{\frac{\partial \langle R \rangle}{\partial I} \times I}
$$

In fact, $\mu$ changes exactly how we'd expect it to if this was coming from an ensemble:



The key thing to recognize is that the markup in the first graph is the inverse of the IT index graphed in the second graph: a rise in the markup is a fall in the IT index. The second graph is from this post on falling labor productivity growth (see also here). Interestingly, Vollrath and the paper also discuss a connection with falling productivity growth. (Update #1: the graphs have different domains as well, but since the inputs are generally growing $\sim \exp r t$, there is a log-linear relationship between the magnitude of inputs and time. Update #2: see update #3 below.)

The "market power" story people are telling in relation to this paper (and before) is in the information equilibrium re-telling the exact same story I've told before: there are a lot more ways to construct an economy with a particular growth rate out of many firms in low growth states and a few in high growth states than out of many firms in high growth states. The "maximum entropy" state should result in lower (ensemble average) productivity and (here) higher (ensemble average) markups. See here for a more detailed explanation in terms of partition functions.
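Here is a minimal sketch of that counting argument (a toy Monte Carlo, not the model behind the graphs): fix a total growth "budget" shared by many firms, draw an allocation uniformly from the states consistent with that total, and look at a typical draw.

```python
import numpy as np

# A flat Dirichlet draw is uniform over the ways to split a fixed total among
# n firms. A typical draw has most firms in low-growth states and a few in
# high-growth states, simply because there are many more such configurations.
rng = np.random.default_rng(3)
n_firms, total_growth = 1000, 50.0          # made-up numbers for illustration
growth = total_growth * rng.dirichlet(np.ones(n_firms))

print("mean firm growth:  ", round(growth.mean(), 4))
print("median firm growth:", round(np.median(growth), 4))
print("fraction of firms below the mean:", round((growth < growth.mean()).mean(), 3))
```

The median comes out well below the mean (the marginal distribution is approximately exponential), which is the "many low-growth, few high-growth" configuration described above.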

...

Update #3: 

I actually reran the computation to show an apples-to-apples comparison. First, there is the IT index versus inputs (as above):


Then there are the exponentially growing inputs:


Putting these together (and using IT index = 1/markup) with the markup data from the paper (red) as an overlay:


Also, this is what an example Monte Carlo throw of the available state space for the ensemble of firms' markups looked like (q.v. here):


Tuesday, September 5, 2017

Demand curves under the microscope

John Handley has a fun rant in favor of using calculus in "economics 101" (actually AP economics). He quotes the explanation of the demand curve:
There are a lot of people who come to a market [that] sells one item. Each person is willing to buy the item at any price lower than some arbitrary price, so if the owner of the market comes out and declares a high price, relatively few people will buy the item. Similarly, if the owner declares a low price, many people will buy it.
Handley correctly says this results in "a weird demand curve with 'steps' at different prices whose width is determined by the number of people with their maximum price at that level." In fact, here is one such demand curve:


Now I've written about this explanation really defining more of an inverse survival curve than a demand curve (i.e. in this case the agents really are defined by some distribution over prices they will accept, which is amenable to random agent models). However, there's an interesting place we can take this "economics 101" explanation.

Handley continues:
This is entirely different from the smooth curves instructors like to draw to illustrate demand, and inconsistent with the math used when teaching firm behavior (marginal revenue doesn't make sense when the demand curve is a bunch of steps).
Yes, the derivative (the marginal quantity) doesn't make sense for such a curve because it is basically zero everywhere except for a discrete set of points. It looks (sort of) like this:


The actual slope of the demand curve (if we fit a line to the data points) would be ‒1/5, indicated by the black dashed line. Now when I said "sort of", that's because I didn't compute the actual derivative —  I computed a finite approximation of a derivative.

The derivative of a curve in calculus is actually the result of a process where you measure the "rise" of the curve divided by the "run" of the curve (how far the curve goes up or down along the y-axis divided by how far along you move along the x-axis). The process that gives you the derivative has you decreasing the steps along the x-axis (Δx) until the step size is basically zero (an infinitesimal referred to as dx). Note that Δy will also decrease, but if the curve has any slope at all, then even as Δx goes to zero the ratio Δy/Δx will converge to some number. That number is the derivative. In this case, on average the demand curve falls by about 1 unit for every 5 you move across (hence ‒1/5), but "instantaneously" (for very short steps Δx) it actually falls a whole unit almost instantly resulting in those spikes in the graph above.

In that graph, I stopped before Δx reached zero (the x-axis is people so Δx is Δpeople). The result is a series of spikes. If I had let Δpeople go to zero, those spikes would become infinitely tall. The marginal utility (or revenue) (i.e. the derivative) would just be a function with values of zero and infinity. It's not very illuminating, and totally disconnected from the way economics is taught with calculus.

However!

In physics we tend to think of that Δx as a "scale": some sort of basis for measurement. As Δx goes to zero, we can think of a microscope "zooming in" on whatever we're looking at. The "scale" of atoms is much more zoomed in than the scale of biological cells when Δx is a distance measurement (in microns or meters).

In our example, our scale is the number of people. As we zoom in to a scale that is smaller than a single person (Δpeople < 1), we get that spiky derivative picture. However, if we zoom out to a scale where we no longer resolve individual people (Δpeople > 1), that derivative starts to converge [1] to the "slope" of that demand curve:


The spiky curves (colors) gradually melt away into the black solid line at Δy/Δx = Δprice/Δpeople = ‒1/5 (dashed line).
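Here is a minimal sketch of that finite-difference experiment with 16 hypothetical reservation prices chosen to give the ‒1/5 average slope (not the exact numbers behind the graphs):

```python
import numpy as np

# Step inverse-demand "curve": the n-th buyer's price drops one unit every
# five people. Finite-difference slopes with a window narrower than one step
# are spiky (0 or -1/window); windows wider than a step settle near -1/5.
prices = np.array([4]*5 + [3]*5 + [2]*5 + [1], dtype=float)

for window in (1, 2, 5, 8):   # window = number of people in the "run"
    slopes = (prices[window:] - prices[:-window]) / window
    print(f"window = {window} people: slopes from {slopes.min():+.2f} "
          f"to {slopes.max():+.2f}, mean {slopes.mean():+.2f}")
```

At a window of five people the finite differences are already pinned at ‒0.20, which is the sense in which the marginal quantity is a property of the group rather than of an individual.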

How do we interpret this? Well, it gives us a scope condition. The marginal approach to economics doesn't make sense unless we're talking about more than one person (in this case, a marginal utility  is more a property of 16 people than a property of a single person).

If we try to resolve marginal quantities at a scale on the order of a single person, it vanishes into nonsense — as if it was an emergent concept.

...

Footnotes:

[1] What is interesting is that if our demand curve was actually curved, then we couldn't really zoom out too far because we'd stop resolving the curvature. This implies either that some curvatures of demand curves don't make sense, or that the curvature disappears as we look at bigger groups of people or whole economies.