Is social science progressing? This is a very hard question to answer. Hundreds of psychologists, economists, sociologists, and others around the world are working hard on important problems. They (and the rest of us) hope that their hard work contributes to scientific progress in some real way. It would also be nice to think that their work makes the world a better place. Is there a way to measure these things? Can we quantify the extent to which a field is progressing?
One standard approach is bibliometric. New social science research is almost always published in an academic journal, so counting the number of papers published every year, taking into account their length, is one easy way to capture scientific progress. Important or groundbreaking research also tends to be cited by other researchers, so citation counts can help capture impact and progress. If we believe in this approach, then the more papers we write, and the better cited they are, the healthier the field.
However, bibliometrics is fundamentally detached from reality. There are at least four reasons why counting papers and citations can never be a perfect measure of scientific progress.
First, some published results are simply incorrect. Whether through fraud or honest mistakes, findings that are entirely false can end up published in reputable journals. Simply counting papers measures progress imperfectly because we cannot easily tell what proportion of published results are false.
Second, some correct results are not new. There are so many thousands of published results in so many fields that even the most erudite scientists occasionally write a paper that is nothing more than a rehash of ideas that are a few years or many decades old. We wouldn't want to count repetitions in our measure of scientific progress.
Third, some results are true, and new, but are extremely insignificant. They represent only a tiny, marginal increase in our knowledge. A simple count of papers doesn't take into account their significance.
Fourth, many results are celebrated by academics but are ignored anywhere outside of academia. Even if science is chugging along and progressing quickly, one might think that if it's not noticed, it doesn't matter.
It is easy to imagine situations in which any or all of these four reasons make bibliometrics invalid for measuring research progress. For example, journals could publish thousands of pages of research, all of it either worthless or a rehash of previous research. It is also conceivable that all of the journals could shut down while research still progresses (for example, at private firms) and simply isn't published.
Because of this, I want to propose another possibility for measuring scientific progress. Put simply, I think that we should test the predictions that our scientific models make. I am inspired by the story of Einstein testing general relativity. Einstein had a model of the universe that he believed was an improvement over Newton's model. Importantly, his model made testable predictions about the things we should see in the universe that were distinct from what Newton's model predicted. In 1919, astronomers observing a solar eclipse measured how starlight bent as it passed near the sun, and sure enough, the observations appeared to be more consistent with Einstein's model than with Newton's. Because Einstein's model made better predictions than other models, the world knew that physics had progressed, and that the universe was more understandable than it had been.
In principle, we could look at the models of the universe that physics has given us over the centuries, and rate each of them based on how close their predictions were to empirical observations. Progress could be measured by the marginal improvement that each model offered over its predecessor.
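As a rough illustration of that idea, here is a minimal Python sketch. It assumes each model can be reduced to a list of numeric predictions for the same set of observations, and it scores models by mean absolute error, which is only one possible choice. The function names and the numbers are hypothetical and made up for illustration; they are not real physical data.

```python
from statistics import mean


def mean_abs_error(predictions, observations):
    """Average distance between a model's predictions and what was observed."""
    return mean(abs(p - o) for p, o in zip(predictions, observations))


def marginal_improvements(models, observations):
    """Return (model name, error reduction versus its predecessor) pairs.

    `models` is a chronologically ordered list of (name, predictions).
    A positive value means the newer model predicted better.
    """
    errors = [(name, mean_abs_error(preds, observations)) for name, preds in models]
    return [
        (later_name, earlier_err - later_err)
        for (_, earlier_err), (later_name, later_err) in zip(errors, errors[1:])
    ]


# Made-up numbers, purely for illustration.
observations = [1.0, 2.0, 3.0]
models = [
    ("older model", [1.5, 2.5, 3.5]),
    ("newer model", [1.1, 2.1, 3.1]),
]
gains = marginal_improvements(models, observations)
print(gains)
```

In this toy example the newer model reduces the average error by about 0.4, which would count as progress under this measure.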
Can a similar approach be followed in the social sciences? I think so, and I will show you my initial attempt.
The strategy. I will look at data that I also examined in a previous project. To recap, the Federal Reserve Bank of Philadelphia solicits predictions about economic indicators (like GDP and unemployment) from professional economists. These professionals have been trained in contemporary economic models. If economics is improving, then their predictions should be improving.
Possible outcomes. We can imagine several possible outcomes:
- Consistent improvement. Errors in predictions steadily decrease over time.
- Asymptotic improvement. Errors in predictions decrease, but approach an asymptote.
- Static equilibrium. Errors don't change.
- Decline. Errors get worse over time.
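One hedged way to tell these four outcomes apart is to fit a straight line to the error series and then check whether the later part of the series has flattened out. The sketch below is a crude heuristic of my own for illustration, not a formal statistical test: the function names, the tolerance `tol`, and the halfway split are all arbitrary assumptions, and it needs at least four data points.

```python
def slope(ys):
    """Least-squares slope of ys plotted against 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def classify(errors, tol=0.01):
    """Label an error series with one of the four outcomes (rough heuristic)."""
    overall = slope(errors)
    if overall > tol:
        return "decline"
    if overall >= -tol:
        return "static equilibrium"
    # Errors are falling overall; check whether the second half has levelled off.
    late = slope(errors[len(errors) // 2:])
    if late >= -tol:
        return "asymptotic improvement"
    return "consistent improvement"


# Made-up error series, one per outcome.
print(classify([1.0, 0.9, 0.8, 0.7, 0.6, 0.5]))
print(classify([1.0, 0.6, 0.4, 0.31, 0.3, 0.3]))
print(classify([0.5, 0.5, 0.5, 0.5, 0.5, 0.5]))
print(classify([0.5, 0.6, 0.7, 0.8, 0.9, 1.0]))
```

Real forecast errors are noisy, so any label produced this way depends heavily on the tolerance chosen.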
The data. This plot shows the record of several decades of unemployment rates in the US:
Professional predictions. This chart shows the predictions made by professional forecasters:
This chart shows the actual and predicted unemployment overlaid on one chart:
In some ways, the predictions look pretty good. However, note that many of the red dots are close to the black dots horizontally rather than vertically, and errors are measured vertically: a forecast that matches unemployment at a nearby date can still be far from the actual value on its target date. This chart shows the error of the mean predictions (actual minus mean prediction):
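Because the error is actual minus predicted, large positive errors mean that unemployment turned out higher than forecast (optimism), and large negative errors mean that it turned out lower than forecast (pessimism). To get a sense for when pessimism and optimism reign, look at this plot that shows mean errors overlaid on actual unemployment: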
With mean errors, pessimism can cancel out optimism. This plot shows mean absolute errors of predictions, another and probably better measure of prediction accuracy:
The lowest mean absolute error happened in 2006, but errors nearly as low occurred in several earlier decades. This plot overlays crosshairs on the lowest error.
There are many things to see in this plot. For example, the financial crisis was not widely predicted, and led to the largest absolute errors in the data. I'll leave it up to readers to decide whether predictions are improving over time.
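To make the difference between the two error measures concrete, here is a small Python sketch. It assumes that for a single period we have one actual unemployment rate and a list of individual forecasts; the function names and numbers are hypothetical and chosen only to show the cancellation effect.

```python
from statistics import mean


def error_of_mean_prediction(actual, forecasts):
    """Actual minus the mean forecast; positive means forecasters were too optimistic."""
    return actual - mean(forecasts)


def mean_absolute_error(actual, forecasts):
    """Average size of each forecaster's miss, ignoring its direction."""
    return mean(abs(actual - f) for f in forecasts)


# Made-up example: one optimist and one pessimist, each off by one point.
actual = 5.0
forecasts = [4.0, 6.0]

print(error_of_mean_prediction(actual, forecasts))  # 0.0
print(mean_absolute_error(actual, forecasts))       # 1.0
```

The error of the mean prediction is zero here even though neither forecaster was right, which is why the mean absolute error is probably the better measure of accuracy.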
Agreement among economists. Finally, independent of whether forecasters are improving, we might wonder whether forecasters are agreeing with each other more than they used to. This plot shows the standard deviation of forecasts of unemployment rates for several decades:
This plot shows both plots overlaid on each other:
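The disagreement measure described above can be sketched in a few lines of Python. This assumes the forecasts are grouped by period in a dictionary, and it uses the sample standard deviation (`statistics.stdev`); whether the sample or population version was used for the plots is not stated, so treat that as an assumption. The period labels and values are made up.

```python
from statistics import stdev


def disagreement_by_period(forecasts_by_period):
    """Standard deviation of individual forecasts within each period."""
    return {
        period: stdev(forecasts)
        for period, forecasts in forecasts_by_period.items()
    }


# Made-up forecasts: full agreement in the first period, a spread in the second.
forecasts_by_period = {
    "period 1": [5.0, 5.0, 5.0],
    "period 2": [4.0, 5.0, 6.0],
}
spread = disagreement_by_period(forecasts_by_period)
print(spread)
```

A falling spread over time would indicate growing agreement, though agreement by itself says nothing about accuracy.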
My reading of this data is that predictions are not improving over the decades. I think that this bodes ill for economics as a science. Of course, I don't think that this method needs to be unique to economics. I think (or at least hope) that something like this could be done for other social sciences.
Post by Bradford Tuckfield