# Arvind Subramanian’s attempt to estimate GDP instead of explaining it through cross-country regression fraught with problems

A hinny is a cross between a male horse and a female donkey, a mule is a cross between a male donkey and a female horse. Economists love regressions. For non-economists, regression does not mean past life regression. When an economist uses the word regression, he/she means a statistical technique, used to establish a relationship between independent variables and a dependent variable. The word originates from work done by Francis Galton (1822-1911) in the 19th century, with a database that had heights of 1,078 fathers and sons.

Tall fathers had tall sons, but on average, sons were not as tall as their fathers. Galton called this regression to the mean and thus the word was coined. With computers, running regressions is easy. I remember the days when calculators were used. Armed with new methods and technology to regress, economists routinely prescribe policies for progress. I know someone who is an economic historian.

As the dependent variable, he had heights of actresses on the Bengali stage in the 19th century, with a clear perceptible decline in heights towards the end of the century. Among independent variables, there was some index of female education and there was a statistically significant negative relationship between spread of female education and decline in heights. I don’t remember the weird hypothesis he concocted to explain this apparent anomaly. Suffice to say, in initial years of the Bengali stage, roles of women were played by men.

Eventually, women freely joined the stage. Had the mind not been clouded by regressions, this economic historian would certainly have thought of the common sense explanation.

Arvind Subramanian’s working paper on India’s GDP mis-estimation has gained a lot of traction and attraction. The “evidence” for the headline-grabbing real average Nehruvian GDP growth of 4.5% between 2011-12 and 2016-17 is “cross-sectional/panel regression,” the substance of Part III of the working paper. This column is restricted to Part III. Part II of the paper is a “prima facie” case based on correlations. We will discuss that in a subsequent column.

Since this is a regression exercise, there will be independent variables and a dependent variable whose behavior one is trying to explain. The dependent variable is real GDP growth. The independent variables are credit to the private sector, electricity consumption, export growth and import growth. Any postgraduate research student is asked to think really hard before estimating any econometric model. Real GDP growth as the explained variable? For a country like India? With its large shares of agriculture (primary sector) and services (tertiary sector)? I am not trying to explain real growth for manufacturing, but real growth for the entire economy. I suspect, if a research student tried to do this, the supervisor would ask him/her to go back and think again. In that process of thinking, willy-nilly, one would think of productivity too and about how that is being measured, or not being captured at all. Better still, one would think of compositional shifts in GDP within manufacturing alone. Implicit in all this, there are notions of growth theory and production functions.

There is also a difference between explanation and estimation. For instance, to take but one example, I have seen cross-country regressions where the explanatory variable is per capita GDP and the explained (or dependent) variable is human development index (HDI) scores. As in any regression exercise, there are countries bang on the regression line and there are countries off the line. For countries on the line, in this limited exercise, per capita GDP has been able to explain HDI behavior rather well. For countries off the line, per capita GDP hasn’t been able to explain HDI behavior that well and one must look for explanations beyond per capita GDP.
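For readers who want to see the distinction between "on the line" and "off the line" in numbers, here is a minimal sketch. The five countries and their figures are made up purely for illustration; they are not drawn from any HDI or GDP dataset, and the helper `fit_line` is a hypothetical name. The code fits a straight line by ordinary least squares and then computes each country's residual, that is, its vertical distance from the line.

```python
# Hypothetical (made-up) data: log per capita GDP and an HDI-like score.
data = {
    "A": (7.0, 0.50),
    "B": (8.0, 0.60),
    "C": (9.0, 0.70),
    "D": (10.0, 0.80),
    "E": (9.0, 0.85),  # deliberately placed off the line
}


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(list(data.values()))

# Residual = observed value minus the value the fitted line predicts.
residuals = {
    name: y - (intercept + slope * x) for name, (x, y) in data.items()
}

# The country farthest from the line is the one the regressor explains least.
farthest = max(residuals, key=lambda name: abs(residuals[name]))

for name, value in residuals.items():
    print(f"{name}: residual = {value:+.3f}")
print("Farthest from the line:", farthest)
```

A large residual for country E only says that per capita GDP, on its own, does not account for E's score. It says nothing about whether E's score was measured correctly.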

Simply because I have been unable to explain HDI behavior, I will normally not argue the HDI data are wrong. But this is precisely what the former Chief Economic Adviser (CEA) has done in Part III of his paper. On the basis of those four explanatory variables, I haven’t been able to explain real GDP growth. Had India been on the line, real growth would have been 4.5%. Since India isn’t on the line, real growth of 7% must be wrong. We must shave off 2.5%. This is the substance of the headline-grabbing argument. Since the days of Simon Kuznets in the late 1930s, economists (and statisticians, condemned a bit in the paper) have sought to measure GDP through various methods of national income accounting and have sought to make it better and more precise, the production approach and the expenditure approach and so on. And in the growth theory tradition, there are those who have tried to explain real GDP growth of countries. But to the best of my knowledge, this is the first time someone has tried to measure/estimate GDP in this way. Of course, there is always a first time for everything.

It gets worse. Had that not been the case, one would have called this a mule, not a hinny. This estimation of GDP, not explanation, is not being done for a single country. It is a cross-country exercise, fraught with more problems. Indeed, that explains why those four explanatory variables were chosen. “These are available for a large sample of countries.” This is a choice based on convenience, not on what is reasonable. I should also think of countries with which India can reasonably be compared, such as emerging economies. Instead, “To ensure cross-country comparability, we exclude from the core sample ‘atypical’ countries which we define as oil exporters, small economies (population of less than 1 million), and fragile countries, experiencing conflict or other serious breakdowns/disruptions.”

Therefore, the comparison is with all middle income countries. A “dummy” is thrown in for India in 2011. For the uninitiated, a “dummy” is used to pinpoint a change, in this instance, change in national accounts in India. There are no dummies for other countries. To take but one example, there could have been a dummy for commodity prices, something that might conceivably have affected several countries. Note that none of these alternative specifications have been tried out. Or, if they have been tried out, they have not been reported. To put it very mildly, there is a cavalier attitude in the way the paper goes about cross-country regressions.
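For the uninitiated who would like to see a dummy at work, here is a minimal sketch. It is not the specification used in the working paper; the series, the break year and the helper `ols_single` are all hypothetical, chosen only to illustrate the idea. The dummy takes the value 0 before the break and 1 from the break onward, and when it is the only regressor, its estimated coefficient is simply the shift in the average level across the break.

```python
# Hypothetical (made-up) series observed over eight years.
years = [2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014]
values = [5.0, 6.0, 5.0, 6.0, 7.0, 8.0, 7.0, 8.0]

# Assumed break year, for illustration only.
BREAK_YEAR = 2011

# The dummy is 0 before the break year and 1 from that year onward.
dummy = [1 if year >= BREAK_YEAR else 0 for year in years]


def ols_single(xs, ys):
    """Ordinary least squares for y = intercept + coef * x (one regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    coef = sxy / sxx
    intercept = mean_y - coef * mean_x
    return intercept, coef


intercept, shift = ols_single(dummy, values)

# Cross-check: with a lone 0/1 regressor, the coefficient equals the
# difference between the post-break mean and the pre-break mean.
before = [v for v, d in zip(values, dummy) if d == 0]
after = [v for v, d in zip(values, dummy) if d == 1]
mean_gap = sum(after) / len(after) - sum(before) / len(before)

print("Pre-break average (intercept):", intercept)
print("Estimated shift (dummy coefficient):", shift)
print("Difference in means:", mean_gap)
```

The dummy picks up whatever changed at the break, from any source. If something else also changed in the same year, such as commodity prices, the coefficient cannot tell the two apart unless that other factor is also in the regression.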

The bottom-line agenda is simple. India is an outlier. Ipso facto, Indian GDP (real growth) figures must be wrong. There are other outliers too. “Two prominent outliers are Ireland and China (there are others too) but the difference with India is that their GDP growth is over-estimated in both time periods and by more than in the case of India.” Therefore, we can ignore them and talk only about India. “So, instead of the reported headline growth of about 7 percent between 2011 and 2016, the results in this paper suggest a range for actual growth of between 3.5 and 5.5 percent.” Such a growth rate evidently meets the smell test. It certainly does. With these kinds of growth numbers, say 4.5%, the tax numbers cannot be right. They too are smelly, like the fish sold by Unhygienix. Since these new GDP growth numbers have been established beyond reasonable doubt (read statistical significance), the tax numbers must have been cooked up. No other logical conclusion is possible. That whinny by the hinny will no doubt be in the next working paper.
(Second of a multi-part series)

Chairman of the Economic Advisory Council to the PM. Views are personal