The Delhi sero-survey is a classic case of how not to do such surveys, or of doing a survey to obtain pre-determined results
By Padam Singh & Davendra Verma
The National Centre for Disease Control (NCDC), ministry of health and family welfare, Government of India, made public on July 21 the results of the first Covid sero-prevalence survey for Delhi, conducted in collaboration with the Delhi Government from June 27 to July 10. Serological surveys, which test for antibodies, are undertaken to provide crucial insights into transmission and immunity in the population, which help plan appropriate containment and control strategies.
Surprisingly, the sero-prevalence survey estimates a very high Corona Prevalence Rate (CPR) of 22.83% in Delhi. This means 22.83% of Delhi’s population has experienced corona in some form, a conclusion drawn from an antibody test for Covid-19. Thus, out of the total population of Delhi, the report estimates that 46.28 lakh people have been infected with coronavirus and have developed antibodies. However, the officially confirmed cases as on July 10 were merely 1.10 lakh. Further, strangely, CPR among females is higher, at 24.20%, than the 21.63% among males. The survey also estimated a higher CPR of 23.13% in the lower age group (< 18 years) as compared to 22.86% in the higher age group (> 18 years).
The NCDC study is a community-based cross-sectional study to estimate prevalence in the entire population (including those under 18 years of age). It was undertaken from June 27 to July 10, covered residents of Delhi more than a year old, and had a planned sample size of 20,000 (the actual number tested was 22,853). It used two types of ICMR-approved test kits, namely Covid Kavach (sensitivity: 92.1%; specificity: 97.7%) and ICMR Sero-survey (sensitivity: 86.7%; specificity: 99.5%). Given the lower sensitivity of the tests, the survey should, if anything, slightly underestimate the prevalence.
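The effect of imperfect test kits on a raw positivity figure can be checked with the standard Rogan-Gladen correction. The sketch below is illustrative only: it assumes that 22.83% is a raw, unadjusted positivity rate and that a single kit was used for every sample, neither of which is confirmed here, and the function name is our own.

```python
def rogan_gladen(apparent, sensitivity, specificity):
    """Adjust an apparent (raw) prevalence for imperfect test accuracy.

    Standard Rogan-Gladen estimator:
        true = (apparent + specificity - 1) / (sensitivity + specificity - 1)
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    adjusted = (apparent + specificity - 1.0) / denom
    # Clip to the valid range of a proportion.
    return min(max(adjusted, 0.0), 1.0)


# Kit accuracies as quoted in the article: (sensitivity, specificity).
kits = {
    "Covid Kavach": (0.921, 0.977),
    "ICMR Sero-survey": (0.867, 0.995),
}

# ASSUMPTION: 22.83% is treated as the raw positivity rate.
raw_cpr = 0.2283

for name, (se, sp) in kits.items():
    adjusted = rogan_gladen(raw_cpr, se, sp)
    print(f"{name}: adjusted prevalence = {100 * adjusted:.2f}%")
```

Under these assumptions the adjusted figure is no lower than the raw one for either kit, which is what one would expect when the main imperfection of the tests is missed positives.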
As per the NCDC report, multistage sampling was followed. In the first stage, all the districts were selected; in stage two, all the wards were selected. In stage three, the primary sampling unit was selected (the dispensary, as decided by the Delhi health department as per operational feasibility). This was followed by stage four, which involved the selection of unit-households.
The NCDC has claimed that the sero-prevalence survey results are based on a random sample of 20,000 persons (22,853 in actual practice) spread over all eleven districts of Delhi, on whom the test for antibodies was conducted. The survey methodology used, the consequent very high estimates, and the large inter-district variation in estimates of CPR (12.95% in the south-west district and about 27.7% in the Shahdara, north-east and central districts) raise very serious questions about its findings and methodology.
The RT-PCR test positivity rate for Covid-19 among suspected high-risk contact cases is less than 10%, whereas the survey results claim a prevalence of about 23%, which is highly unlikely. Technically, results based on antibody tests undertaken on the general population (not high risk) are expected to be lower. Another question about the survey result concerns the 19.82% of Covid-negative cases showing the presence of antibodies, which is quite surprising. Further, among the Covid-positive, only 53% had antibodies, i.e., as many as 47% did not have any antibodies, which is very difficult to explain medically.
In the calculation of the sample size, the NCDC has assumed a prevalence of 1%, which is not correct. A sample size based on wrong assumptions raises a question mark over the survey itself. The precision of 20% taken for the calculation of the sample size is also too high, which, in turn, will yield a very wide confidence interval for the CPR estimate.
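The sensitivity of the sample size to these inputs can be seen from the textbook formula for estimating a proportion, n = z² p(1 − p) / d². The sketch below is not the NCDC's own calculation. It assumes a 95% confidence level (z = 1.96), reads the "precision of 20%" as relative precision (so d = 0.20 × p), and ignores any design effect unless one is passed in; the function names are ours.

```python
import math


def sample_size(p, relative_precision, z=1.96, design_effect=1.0):
    """Sample size needed to estimate a proportion p.

    ASSUMPTION: precision is relative, so the absolute margin is
    d = relative_precision * p.
    """
    d = relative_precision * p
    n = (z ** 2) * p * (1.0 - p) / (d ** 2)
    return math.ceil(n * design_effect)


def ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p under simple random sampling of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Anticipated prevalence of 1%, as assumed in the survey design.
print("n at p = 1%: ", sample_size(0.01, 0.20))

# The same relative precision at a prevalence near the reported 23%.
print("n at p = 23%:", sample_size(0.23, 0.20))

# Absolute margin implied by 20% relative precision at p = 23%.
print("margin at p = 23%: +/-", round(100 * 0.20 * 0.23, 1), "percentage points")
```

Under these assumptions the required size changes by a factor of roughly thirty between the two prevalence values, which illustrates how strongly the design depends on the anticipated prevalence fed into it.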
As we understand it, the NCDC allocated samples to all corporations/districts/wards in proportion to their population. This is only proportional allocation of samples, and no sampling has been undertaken at this stage. It was left to the dispensaries covering the wards to select the individuals for the test using simple random sampling (SRS), as per administrative feasibility. To explain further, the total sample was divided as per the population of the eleven districts and wards, taking into account their urban and rural population. The decision on the selection of individuals was left to the Delhi health department as per the operational feasibility of the dispensary.
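Proportional allocation of this kind is a purely arithmetical step, as the following sketch shows. The district names and populations are hypothetical placeholders, not Delhi figures, and the largest-remainder rounding rule is our own choice for making the shares add up to the total.

```python
def proportional_allocation(populations, total_sample):
    """Split total_sample across units in proportion to population.

    Fractional shares are rounded with the largest-remainder rule so
    that the allocated sizes sum exactly to total_sample.
    """
    total_pop = sum(populations.values())
    exact = {k: total_sample * v / total_pop for k, v in populations.items()}
    alloc = {k: int(x) for k, x in exact.items()}
    leftover = total_sample - sum(alloc.values())
    # Hand the remaining units to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


# HYPOTHETICAL populations, for illustration only.
populations = {
    "District A": 3_000_000,
    "District B": 1_500_000,
    "District C": 500_000,
}

allocation = proportional_allocation(populations, 20_000)
for district, n in allocation.items():
    print(district, n)
```

Nothing in this step involves chance: it fixes how many people are to be tested in each area, but says nothing about which people are chosen.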
Thus, there is no sampling of wards, and obviously there was no multistage sampling. The question is how dispensaries became primary sampling units if they were not a part of the sampling frame. The selection of dispensaries is based on administrative convenience, and hence, claiming it to be random is not correct. Further, the sampling of individuals is also not random. Thus, the sampling design cannot be self-weighting, and therefore, analysis without weights is not appropriate.
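Why weights matter when a design is not self-weighting can be illustrated with made-up numbers. In the sketch below, the two strata, their populations, sample sizes and positive counts are all hypothetical, and the function names are ours; the weighted estimator is the usual stratified one, in which each stratum's rate counts in proportion to its population.

```python
def unweighted_prevalence(strata):
    """Pooled positives over pooled sample size, ignoring the design."""
    positives = sum(x for _, _, x in strata)
    sampled = sum(n for _, n, _ in strata)
    return positives / sampled


def weighted_prevalence(strata):
    """Stratified estimate: each stratum rate weighted by its population."""
    total_pop = sum(N for N, _, _ in strata)
    return sum(N * (x / n) for N, n, x in strata) / total_pop


# HYPOTHETICAL strata: (population, number sampled, number positive).
# Both strata have the same population, but the second is over-sampled.
strata = [
    (1_000_000, 1_000, 100),  # 10% positive in the sample
    (1_000_000, 3_000, 900),  # 30% positive in the sample
]

print("unweighted:", unweighted_prevalence(strata))
print("weighted:  ", weighted_prevalence(strata))
```

In this invented example, over-sampling the high-prevalence stratum pushes the unweighted figure to 25%, against a weighted figure of about 20%.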
In the absence of any sampling frame or updated frame (since the 2011 census list would be totally outdated), the health workers selected individuals as per their convenience without any scientific method, which has affected the estimate of CPR. For a survey to be conducted scientifically using probability sampling, every person living in the ward must have an equal probability of selection in the sample. By leaving it to dispensaries (for administrative convenience), the basic principle of sampling has been greatly compromised, resulting in a gross overestimation of CPR. In fact, it is a classic case of how not to do such surveys, or of surveying to obtain pre-determined results.
In conclusion, to estimate CPR correctly in the next round of the sero-prevalence survey it is planning, the NCDC must take into account the technicalities of survey sampling when calculating the sample size and finalising its sampling design and methodology.
Singh is former additional director general, ICMR, and Verma is former director general, CSO. Views are personal