By Prakhar Misra & Sharmadha Srinivasan
Last month, MeitY released a report on regulating non-personal data. The report audaciously tries to rein in the dominance of technology companies. Yet, considering the urgency that the report claims, its conceptual framework has a long way to go. It is based on an attractive but impractical idea: community data. The report does well in parts, but it would benefit immensely from a better definition of communities, and of the basis of their ownership of, and rights to, non-personal data.
The definitions of community and community data are too broad. All anonymised personal data and data of “inanimate and animate things or phenomena” tied to a community is considered community data. Thus, all datasets, from those owned by municipal corporations to raw, unprocessed data owned by telecom and e-commerce companies, could fall under this category.
The definition of a community is broad too—“any group of people that are bound by common interests and purposes involved in social/economic interactions”. An anonymised dataset of electricity bills of residents from a discom and another of users on a Facebook group—say, Manchester United fans—both qualify as community data. This is problematic on three levels—one conceptual, and two operational.
First, conceptually, the framing of community data is an effort to bring data regulation under the ‘commons’ framework, but the nuances are missing. Elinor Ostrom, a Nobel laureate in Economics, lays out the principles for commons and differentiates between lakes and rivers, where one has a boundary, and the other doesn’t. She argues that clearly defined boundaries of resources (and individuals) are key to governing commons. The characteristics of non-personal data—it is non-rivalrous, and can be replicated and affect a disparate set of individuals with no common interests or even a shared geographic identity—make it more like a river than a lake. To have a successful commons framework, clear boundaries are crucial. Further, broad definitions and unclear demarcations make it difficult to set operational rules to govern and exercise rights for the collective interest.
Second, the difference between community non-personal data, and public and private non-personal data is unclear. As the report outlines them, public non-personal data is that which is collected or generated by the government or through publicly funded works, and private non-personal data is collected or produced by entities other than governments. Thus, between the two categories, the universe of all non-personal data is captured. Community data cuts across the two types, yet is typified as a third in the report. Would a census dataset collected by the government fall within the ambit of community or public non-personal data? If it is both, then how would regulatory protocols apply to this set of data, especially when enforcing the rights of public and private entities vs the community?
Broad and unclear definitions increase the potential for wrongful interpretation, leading to harassment and regulatory capture. Section 66A of the IT Act, 2000 serves as a cautionary tale here; thankfully, it was struck down by the Supreme Court.
Third, the report envisages an ‘appropriate’ data trustee which could exercise the rights on behalf of the community. The principle for selecting the data trustee is the ‘closest’ and ‘most appropriate representative body’ for the community concerned. There are two clear issues. First, the definition of who constitutes the relevant community, and thereby which community the data trustee should seek to represent, is fuzzy. Consider a dataset of anonymised electricity usage of consumers in Delhi. Is the relevant community all residents of Delhi, irrespective of whether they are industrial or household establishments? Further, would the ‘appropriate and closest’ data trustee exercise the rights of household consumers or those of industrial consumers? Or would data trustees be nested within each other?
Second, the power relationship between the data trustee and the community is not defined. It is assumed to be a good faith contract, which is completely unhelpful in securing the rights of communities over their data, especially considering how valuable such data would be. In its defence, the report does leave some part of such thinking to discussions downstream. However, a theoretical underpinning to these concepts should have been made clearer.
To be clear, we are not dismissive of the idea itself; our point is only that the current framework is infeasible. Here is a potential way forward: root the discussions of the ‘community’ in various provisions of the Indian Constitution and see how group rights are traditionally awarded.
Such an approach would have to factor in qualities of non-personal data too, as a second filter to refining such definitions. The report does a great job of initiating discussions on these issues, but there is a long way to go before this can become an actionable framework.
Misra is Senior Associate and Srinivasan is Associate, IDFC Institute. Views are personal.