Danny Ben-Shahar gave a really nice paper
(co-authored with Roni Golan) at the ASSA meetings yesterday about a natural
experiment in the impact of information provision on price dispersion. I
want to talk about it, but first a little background.

Price dispersion is one ingredient in
understanding whether markets are efficient. When prices for the same
good vary (for reasons other than, say, transport costs or convenience), it
suggests that consumers lack the information necessary to make optimal decisions,
and the economy suffers a deadweight loss as a result.

Houses
have lots of measured price dispersion, even after controlling for physical
characteristics. Think about a regression for a housing market:

HP = XB + h + e

where HP is a vector of house prices, X is a
matrix of house characteristics, and B is a vector of coefficients. The
residual h+e has two components: unmeasured
house characteristics, h, and an error term, e, which reflects “mistakes”
that buyers of houses make, perhaps because of an absence of information. The h might reflect something
like the quality of the view or the absence of noise.

When we run this regression, we can compute a variance
of the regression residuals. Because we
can only observe h+e, we cannot know if this
variance is the result of unobserved house characteristics, or of consumer
errors. But if h remains fixed, and
there is an information shock that reduces consumer errors, e will get smaller, and
so will the regression variance.
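
To make the logic concrete, here is a small simulation sketch. It is not from the paper: the sample size, coefficients, and standard deviations are all made-up numbers, and the function name is just for illustration. It holds h fixed, shrinks e, and compares the residual variance from the same regression.

```python
import numpy as np

def residual_variance(sigma_e, n=5000, seed=0):
    """Simulate HP = XB + h + e and return the OLS residual variance.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Intercept plus two made-up measured house characteristics.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    B = np.array([100.0, 20.0, 10.0])       # assumed coefficients
    # Unmeasured characteristics: same seed, so identical in every call.
    h = rng.normal(scale=5.0, size=n)
    # Consumer "mistakes": the only thing that changes between scenarios.
    e = rng.normal(scale=sigma_e, size=n)
    HP = X @ B + h + e
    # Fit by least squares; we never observe h and e separately.
    B_hat, *_ = np.linalg.lstsq(X, HP, rcond=None)
    resid = HP - X @ B_hat
    # Degrees-of-freedom correction for the estimated coefficients.
    return resid.var(ddof=X.shape[1])

before = residual_variance(sigma_e=8.0)   # less informed buyers
after = residual_variance(sigma_e=4.0)    # after an information shock
print(f"residual variance before: {before:.1f}, after: {after:.1f}")
```

Because h and e are drawn independently, the residual variance is about var(h) + var(e) in expectation, so here it should fall from roughly 25 + 64 to roughly 25 + 16. The regression alone cannot tell you which component is which; only the change identifies a drop in e, and only if h really stayed put.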

Here is where Danny’s paper comes in. In April 2010, authorities in Israel began
publishing on-line information about house transactions, and in October 2010,
they launched a “user-friendly web site.”
(Details may be found in the paper.) The paper measures the change in price
dispersion before and after the information became publicly available and
finds, at minimum, reductions in dispersion of about 17 percent. The authors take
pains to make sure their result isn’t driven by some other shock that happened
at the same time as the release of the information. For example, they show that price dispersion
fell less in neighborhoods with well-educated people. This could reflect either (1) that
well-educated people were better informed about housing markets to begin with, and
so got less benefit from the new information, or (2) that a greater share of the
residual variance in well-educated neighborhoods comes from unmeasured house characteristics.[i] In either event, the result is consistent
with the idea that the information shock is what contributed to the decline in
measured price dispersion.

So more information really does seem to produce
a more efficient housing market. The
policy implication may be that data, in general, should be a public good. Data meet half of Musgrave’s definition of a
public good—they are non-rival (one person’s use of a data-set does not detract
from another person’s use). And while
data are excludable (services such as CoreLogic show this to be true), their
creation produces a classical fixed-cost marginal-cost problem. The fixed cost of producing a good dataset is
very large; once it is created, the marginal cost of providing the data to users
is very low. This suggests that the efficient
price of data should be very low.

Currently, data services have something like
natural monopolies, with long downward-sloping average cost curves. Theory says that this means they set
prices such that marginal revenue equals marginal cost, instead of setting
price equal to marginal cost. All this
implies that data are underprovided.
Danny and Roni’s work shows that this underprovision has meaningful consequences
for the broader economy.
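
For anyone who wants the textbook arithmetic behind that claim, here is a minimal sketch. It assumes a linear inverse demand curve P = a - bQ and a constant marginal cost c; the function name and the numbers are hypothetical and say nothing about any actual data vendor.

```python
def linear_demand_outcomes(a, b, c):
    """Compare monopoly and marginal-cost pricing under P = a - b*Q.

    a, b: intercept and slope of an assumed linear inverse demand curve.
    c:    assumed constant marginal cost of serving one more user.
    """
    # Monopolist: marginal revenue a - 2bQ equals marginal cost c.
    q_monopoly = (a - c) / (2 * b)
    p_monopoly = a - b * q_monopoly
    # Efficient benchmark: price equals marginal cost.
    q_efficient = (a - c) / b
    # Deadweight loss: triangle between demand and marginal cost
    # over the units the monopolist chooses not to sell.
    deadweight_loss = 0.5 * (p_monopoly - c) * (q_efficient - q_monopoly)
    return {
        "q_monopoly": q_monopoly,
        "p_monopoly": p_monopoly,
        "q_efficient": q_efficient,
        "deadweight_loss": deadweight_loss,
    }

# Made-up numbers: high willingness to pay, very low marginal cost.
out = linear_demand_outcomes(a=100.0, b=1.0, c=2.0)
print(out)
```

With linear demand the monopolist serves exactly half the efficient quantity. The sketch leaves the fixed cost out entirely, which is the catch: a price equal to marginal cost would not pay for building the dataset in the first place.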


[i]
BTW, this second interpretation is mine (I don’t want the authors on the hook
for it if they disagree).