As technology becomes cheaper and more accessible, the data it generates is growing faster than ever before. We have travelled from mainframes to personal computers to laptops to phablets and now smartphones. While computing devices are getting smaller, data is getting bigger every day. Both enterprises and governments face challenges in managing this data effectively, whether it is structured or unstructured, homogeneous or heterogeneous. Data is produced each time a person follows a blog, shares a tweet or even hits the like button on Facebook. Given the rate at which data is generated each day, regular databases may not be sufficient. We need databases powerful enough not only to store data, but also to provide a platform for data analytics, quick user queries and modular security at the level of tables and fields. This is where technologies like Very Large Databases, or VLDBs, come into play. VLDBs are databases large enough to hold terabytes, or even petabytes, of data.
According to a Deloitte report, VLDB: Prerequisite for Success of Digital India, well-managed data can prove useful in unlocking new sources of economic estimates, gaining fresh insights into science and holding governments accountable. The “industrial revolution of data” is being felt all over the world, from science to the arts, from business to government. Digital information increases tenfold every five years, which results in a vast amount of data being shared. This can be attributed to improvements in the algorithms driving computer applications.
In 2014, Prime Minister Narendra Modi introduced the Digital India programme, which aims to connect every citizen of India using technology. A project this large has never before been attempted in India. It comes with its fair share of challenges, one of which is the amount of data that the numerous services it entails will generate.
Services like e-healthcare solutions, the KhoyaPaya portal for lost and found children, and Massive Open Online Courses (MOOCs) will face a particular challenge in the amount of data they handle regularly. These services will require Very Large Databases to contain their data. A VLDB will give each service a single place to store all its scattered data. But having a large enough database will not be a complete solution; accessing that data at high speed and maintaining its confidentiality, integrity and availability will be equally important.
The primary challenges Digital India services will face when incorporating VLDBs into their systems will be coping with growing data volumes, maintaining data access speeds as the database grows, avoiding data duplication so that storage space is used efficiently, managing data access permissions efficiently, and ensuring data confidentiality and integrity. Special methods or techniques will have to be developed to deal with each of these challenges.
With multiple services interlinked in the Digital India programme, a major concern will be the duplication of data within the same database or across multiple databases. This kind of data redundancy might be unintended. Special care will have to be taken to ensure that duplicated data does not occupy the limited space within these databases. At the same time, it will be essential that data concurrency is maintained across these databases to ensure unhindered usage of each service.
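One common way to detect the kind of duplication described above is to fingerprint each record by hashing a canonical form of its contents and to keep only one copy per fingerprint. The sketch below is a minimal illustration, not a method taken from the Deloitte report; the record fields (`citizen_id`, `name`, `service`) and the function names are hypothetical, and a real VLDB would typically do this inside the database or storage layer rather than in application code.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of the record's canonical JSON form."""
    # Sorting keys makes the fingerprint independent of field order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first occurrence of each distinct record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        digest = record_fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


# Hypothetical records arriving from interlinked services.
records = [
    {"citizen_id": "C-001", "name": "Asha", "service": "e-health"},
    # Same data as the first record, only the field order differs.
    {"service": "e-health", "name": "Asha", "citizen_id": "C-001"},
    {"citizen_id": "C-002", "name": "Ravi", "service": "mooc"},
]
unique_records = deduplicate(records)
print(len(unique_records))
```

Exact-match hashing of this kind only catches identical records; near-duplicates, such as the same person entered with different spellings, would need fuzzy matching on top.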
Securing data against malicious users will be another essential area to work on. The Deloitte report emphasises the need to have tailored security policies for various types of data, to enforce those policies on the databases, to carry out regular vulnerability assessments on these databases, to use appropriate encryption techniques and to deploy tools such as Database Activity Monitoring (DAM) for continuous monitoring of the integrity and confidentiality of the data.
Key challenges pertaining to the implementation and use of VLDBs in the Digital India programme, and measures to address them, are listed below:
- Suitable security measures covering all aspects of the data life cycle should be considered.
- Data classification and security policies should be defined and implemented.
- Regular security testing of the databases and underlying infrastructure should be performed to identify vulnerabilities.
- Suitable encryption techniques should be used to protect the confidentiality of information.
- Solutions like database activity monitoring should be deployed for continuous monitoring for security issues.
- Adequate data backup and restoration policies should be defined and enforced in these systems. An automated data backup solution should be considered. The backed-up data should be tested regularly to gain assurance of its integrity and availability.
- Need-to-know and need-to-have are the key principles for limiting user access, which remains a concern across organisations and industries.
- Data access policies and processes should be defined as guiding principles for access management.
- For critical data access, additional authentication mechanisms such as multi-factor and biometric techniques should be considered.
- Solutions like identity and access management and privileged access management should be considered for controlling access to data in the various initiatives of the Digital India programme.
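The access-related points above (need-to-know access, defined access policies, and extra authentication for critical data) can be sketched as a deny-by-default check. This is a minimal illustration under stated assumptions: the `User` class, the `POLICY` table, the dataset names and the role names are all hypothetical, and a production system would rely on a dedicated identity and access management product rather than a hand-written function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    mfa_verified: bool = False  # True once a second factor has been checked


# Hypothetical access policy: which roles need each dataset,
# and whether the dataset is critical enough to require MFA.
POLICY = {
    "course_catalogue": {"roles": {"student", "teacher"}, "critical": False},
    "health_records": {"roles": {"doctor"}, "critical": True},
}


def can_access(user: User, dataset: str) -> bool:
    """Apply need-to-know rules, denying anything not explicitly allowed."""
    rule = POLICY.get(dataset)
    if rule is None:
        return False  # unknown dataset: deny by default
    if not (user.roles & rule["roles"]):
        return False  # user holds no role that needs this data
    if rule["critical"] and not user.mfa_verified:
        return False  # critical data requires an additional factor
    return True


doctor = User("Dr. Rao", frozenset({"doctor"}))
doctor_with_mfa = User("Dr. Rao", frozenset({"doctor"}), mfa_verified=True)
student = User("Meera", frozenset({"student"}))

print(can_access(doctor, "health_records"))
print(can_access(doctor_with_mfa, "health_records"))
print(can_access(student, "health_records"))
```

Denying by default means a dataset that has not yet been classified is inaccessible until a policy is written for it, which matches the emphasis on defining data classification and access policies first.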
Several organisations have adopted VLDBs to manage large and heterogeneous data effectively and are reaping the benefits. Given the sheer complexity of the Digital India programme and the enormity of the data it will produce, government organisations and departments may consider employing VLDBs. The Digital India programme will also need to apply data science to build predictive and detective models that support proactive decision making, make services more efficient and improve citizen experience.
Parthasarathy is partner and Shukla is director at Deloitte. Views are personal.