By PC Mohanan
As an administrative measure, all governments favour conducting a census of people. In ancient Rome, Augustus Caesar was fond of censuses; it is believed that Joseph and a pregnant Mary travelled to Bethlehem to be enumerated for one such census, and that Christ was born during this trip. The population census is now an essential tool for modern governance, although a few countries with excellent civil registration systems have discontinued periodic censuses. In countries like India, we not only have a population census, but also censuses of livestock, tigers, houses, agricultural holdings, minor irrigation schemes, economic units, handlooms, below-poverty-line people, and even a census of people and their castes!
Currently, the ministry of statistics is conducting a nationwide census that enumerates all economic activities, an exercise as large as the population census, to prepare a ‘business register’. Clearly, ‘registers’ are the current fashion for statisticians and administrators. The National Population Register (NPR), now in the news, is actually an extension of the population census.
The government has cleared a budget of ₹8,754 crore for conducting the 2021 census. Along with the population count, the Census Office collects data on a variety of other topics relating to population. The data collected is so large that even with the aid of modern technology, many reports are published using only a sample of the data collected. Further, a lot of data is published only after a long time lag; the migration data from the 2011 census came out only recently. Considering the volume of data, such delays are understandable.
The census methodology is clear and easy to understand: send out officials to the homes of people, count them, and come back with the filled-in forms. Mobile handsets/hand-held devices are likely to replace paper schedules for the current census. It is this conceptual simplicity that, perhaps, explains why the census remains an attractive tool for administrators and is the first step in dealing with any economic or political problem.
Any data-collection exercise like a census suffers from two kinds of errors: coverage error, and content error, which is the error in response or recording. Coverage errors occur when one either misses or double-counts members of the target population. To know the magnitude of these two types of errors, the Census Commissioner does a post-enumeration check in a sample of blocks. Content errors are more difficult to judge, and vary from item to item in the questionnaire.
For the 2011 census, the post-enumeration check showed a net omission of 23 persons for every 1,000 persons enumerated: an estimated undercount of 23.1 persons, offset by 0.1 person per 1,000 being counted more than once. The 2001 check also showed that 23 persons per 1,000 enumerated persons were omitted, net of duplication. By comparison, the undercount in 1991 was 18 persons per 1,000 enumerated persons, similar to that of the 1981 census. In 2001, the undercount was much larger in urban areas (40 per 1,000), and the northern zone, which includes Delhi, had an omission rate of 57 per 1,000 for males and 59 per 1,000 for females. Greater mobility in a city like Delhi may be the main reason for these high omission rates. In 2001, Delhi had an undercount of over 80 per 1,000 for males. Missing 2.3% of the population in 2011 would indicate that we have no information for 2.8 crore people of the country.
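The 2.8 crore figure can be checked with a short calculation. The sketch below is only illustrative: it applies the net omission rate to the 121 crore people enumerated in 2011, and the function and variable names are hypothetical.

```python
# Figures quoted in the article; names are illustrative only.
ENUMERATED_2011_CRORE = 121.0   # people enumerated in the 2011 census
UNDERCOUNT_PER_1000 = 23.1      # persons missed per 1,000 enumerated
DUPLICATION_PER_1000 = 0.1      # persons counted more than once per 1,000


def net_omission_per_1000(undercount: float, duplication: float) -> float:
    """Net omission is the undercount offset by double counting."""
    return undercount - duplication


def missed_population_crore(enumerated_crore: float, omission_per_1000: float) -> float:
    """Scale a per-1,000 omission rate up to the enumerated population."""
    return enumerated_crore * omission_per_1000 / 1000


net_rate = net_omission_per_1000(UNDERCOUNT_PER_1000, DUPLICATION_PER_1000)
missed = missed_population_crore(ENUMERATED_2011_CRORE, net_rate)
print(f"Net omission: {net_rate:.1f} per 1,000")
print(f"People missed: about {missed:.1f} crore")
```

The net rate works out to 23 per 1,000, or 2.3%, which on 121 crore is a little under 2.8 crore.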
In 2014, the home minister, in response to a question posed in the Rajya Sabha, stated that an electronic database of 118 crore persons was prepared in the NPR from the 2011 census. The 2011 census had enumerated 121 crore people. Together with the likely undercount in the census, this would suggest that close to 6 crore people were missed in this NPR exercise. The scale of these data-gathering exercises would scare most statisticians.
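The "close to 6 crore" estimate combines two gaps: people enumerated in the census but absent from the NPR database, and people the census itself missed. A minimal sketch of that arithmetic, assuming the 23 per 1,000 net omission rate from the post-enumeration check applies to the full 121 crore (names are hypothetical):

```python
# Figures quoted in the article; names are illustrative only.
CENSUS_2011_CRORE = 121.0      # people enumerated in the 2011 census
NPR_DATABASE_CRORE = 118.0     # people in the NPR electronic database
NET_OMISSION_PER_1000 = 23.0   # net census omission from the post-enumeration check


def npr_shortfall_crore(census: float, npr: float, omission_per_1000: float) -> float:
    """People missing from the NPR: enumerated but not in the NPR, plus those the census missed."""
    enumerated_not_in_npr = census - npr
    missed_by_census = census * omission_per_1000 / 1000
    return enumerated_not_in_npr + missed_by_census


shortfall = npr_shortfall_crore(CENSUS_2011_CRORE, NPR_DATABASE_CRORE, NET_OMISSION_PER_1000)
print(f"Estimated NPR shortfall: about {shortfall:.1f} crore")
```

This gives roughly 3 crore from the first gap and 2.8 crore from the second, or about 5.8 crore in all.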
As per the Census Office’s website, the objective of the NPR is to create a comprehensive identity database of every ‘usual’ resident in the country. A ‘usual’ resident is always identified with reference to a geographical location at a point of time. The existing identification through Aadhaar has no territorial identity. As now understood, the NPR is prepared at the local (village/sub-town), sub-district, state and national level under the provisions of the Citizenship Act, 1955, and the Citizenship (Registration of Citizens and Issue of National Identity Cards) Rules, 2003.
Mobility of people is another issue that complicates the preparation of any location-based register. The percentage of migrants in the population was 37.6 in 2011, of whom 31% had migrated or changed their usual place of residence in their village or town after 2001. Thus, one would expect that roughly 12% of the people would not be found in the place where they were enumerated in 2011. How does one update NPR details for them if they cannot be found by the enumerators? Calling the coming census an exercise to update the 2011 NPR appears to be a misnomer; it has to be a new exercise altogether. Updating is possible only if current and past records can be unambiguously matched through appropriate IDs and corrections incorporated, with new members inserted and dead persons deleted.
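The "roughly 12%" figure is the product of the two shares quoted above. As a rough sketch, assuming a decade of movement comparable to 2001-2011 (names are hypothetical):

```python
# Shares quoted in the article; names are illustrative only.
MIGRANT_SHARE = 0.376           # migrants as a share of the 2011 population
MOVED_AFTER_2001_SHARE = 0.31   # share of those migrants who moved after 2001


def share_moved_in_decade(migrant_share: float, recent_share: float) -> float:
    """Share of the whole population that changed residence within the decade."""
    return migrant_share * recent_share


share = share_moved_in_decade(MIGRANT_SHARE, MOVED_AFTER_2001_SHARE)
print(f"Share who moved within the decade: {share:.1%}")
```

The product is about 11.7% of the population, which rounds to the 12% cited.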
The house listing schedule of the forthcoming census has 34 items of information for each household, besides the location particulars that identify the household on the ground. The information collected for the NPR in 2011 included a lot of text data, like names of persons, names of places, addresses, etc. In addition, it is now proposed that more details, like Aadhaar number, voter card, phone number, driving licence number, passport number, etc, will be added. It is expected that all these will be entered into a hand-held device by the enumerator, usually a primary school teacher. The content errors, and the unending travails of the citizens to correct them, can be visualised only in a Kafkaesque scenario. A lot of data in the NPR, like educational status, marital status, occupation, etc, is also not constant for a person and, given the time lag in the finalisation of registers, would have very little validity by the time the list is finalised.
The experience from the economic censuses conducted by the ministry of statistics to prepare a ‘business register’ much like the NPR, with establishments replacing people, is well known. The imperfectness of these registers was established when the NSSO used one of these for its 74th round survey, and found most of the names in the register untraceable on the ground.
A major concern for the researchers using the 2011 population census data, especially internal migration data, would be the effect of the NPR, and the rising controversy of it being a base document for the NRC, on the reliability of census data. Recent migrants always have a problem producing an identity or address proof for their current place of residence. Will they, then, be reported as migrants? Alternatively, people may report a longer residency at the place of enumeration to dilute their ‘migrant’ character.
The world over, surveyors are concerned about respondent burden and bias in reporting. The latter is cited by the government as the reason for the lower toilet coverage reported from one of its own recent surveys. For census professionals, it is a nightmare to organise a census amidst fears and threats of likely disenfranchisement based on it, and still maintain the credibility of data. The dramatic increase in the items on which information is to be collected would test the patience of both the enumerators and the respondents.
The author is former acting chairperson of the National Statistical Commission