The inclusion of certain clauses related to non-personal data sets in the Personal Data Protection (PDP) Bill carries a very high risk of re-identification and can lead to legal complications for stakeholders, public policy experts and senior industry executives said.
“High-value data sets that have been created using personal data by anonymizing it continue to carry risks. There is a very clear danger of re-identification and that is a danger that continues to grow. It is no exaggeration to say that there is no set of anonymized data that is permanently anonymized,” said a senior industry executive, who declined to be named.
For example, Section 91 of the latest version of the PDP Bill, which gives the central government powers to direct data fiduciaries and data processors to grant access to anonymized or non-personal data, runs a high risk of re-identification.
As technology, including mathematical algorithms, evolves and improves, the science of re-identification will also improve, increasing the risk that anonymized data sets can be de-anonymized, experts said.
“Anonymized data is often easily re-identified and causes significant privacy damage. Setting standards for anonymity and providing more clarity on how various stakeholders will have to collect and store data to avoid regulatory arbitrage will help,” said Kazim Rizvi, founder of the public policy group The Dialogue.
Aside from the re-identification of individual anonymized data sets, another issue can arise over time: each time companies and other stakeholders publish a new data set, it may overlap with previously available data sets, and the combination then becomes a privacy pain point.
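The overlap problem described above is often called a linkage attack. The following is a minimal sketch of the idea, not a method attributed to anyone quoted here. All records, field names and the `link_records` helper are hypothetical and invented for illustration. It assumes one release has had names removed but keeps a few ordinary attributes, and that a second, separately published data set carries the same attributes alongside names.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names removed, ordinary attributes kept.
anonymized_release = [
    {"pin_code": "110001", "birth_year": 1985, "gender": "F", "purchase": "insulin"},
    {"pin_code": "110001", "birth_year": 1990, "gender": "M", "purchase": "vitamins"},
    {"pin_code": "560034", "birth_year": 1985, "gender": "F", "purchase": "inhaler"},
]

# Hypothetical second data set, published separately, that includes names.
public_directory = [
    {"name": "Person A", "pin_code": "110001", "birth_year": 1985, "gender": "F"},
    {"name": "Person B", "pin_code": "110001", "birth_year": 1990, "gender": "M"},
    {"name": "Person C", "pin_code": "110001", "birth_year": 1990, "gender": "M"},
    {"name": "Person D", "pin_code": "560034", "birth_year": 1985, "gender": "F"},
]

# Attributes shared by both data sets (assumed for this example).
QUASI_IDENTIFIERS = ("pin_code", "birth_year", "gender")


def link_records(release, directory, keys=QUASI_IDENTIFIERS):
    """Return (name, purchase) pairs where exactly one named person matches."""
    # Index the named data set by its combination of shared attributes.
    index = defaultdict(list)
    for person in directory:
        index[tuple(person[k] for k in keys)].append(person["name"])

    reidentified = []
    for row in release:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # A single candidate means the "anonymous" row points to one person.
        if len(candidates) == 1:
            reidentified.append((candidates[0], row["purchase"]))
    return reidentified


matches = link_records(anonymized_release, public_directory)
for name, purchase in matches:
    print(name, "->", purchase)
```

In this toy data, two of the three rows are tied back to a single named person, while the row shared by two candidates stays ambiguous. Neither data set identifies anyone's purchases on its own; the exposure comes from combining them.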
If such re-identification occurs, companies will be held liable under the upcoming data protection bill, putting them in trouble for little or no fault of their own. The lack of a clear definition of what constitutes non-personal data is another concern, experts said.
“The last draft we saw in 2019 didn’t really give us any details about what kind of non-personal data, the process the government will have to follow, if there will have to be compensation, how consent and anonymity will work. None of those things are resolved. I think it largely leaves the framework for how that will happen in subsidiary legislation,” Udbhav Tiwari, Mozilla’s public policy adviser, told indianexpress.com during a call.
The risks and obligations that accompany the transfer of personal data and the government’s request for non-personal data sets will negatively affect both companies and individuals, said another senior public policy executive. One of the best examples of re-identification, Tiwari said, is the way companies use browsing history to identify and predict the behavior of individual users on the Internet.
“There has been very good technical research that says that 60-100 items of a user’s browsing history can be used to uniquely identify them on the Internet. I don’t need to know your name, your email ID, or any other unique identifier. Despite that, it could be said that I can identify you on the Internet only on the basis of your browsing history,” he said.
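A toy sketch can show why a browsing history works as an identifier even without a name or email ID. It is not the method used in the research Tiwari refers to. The profiles, domain names and the `best_match` helper are hypothetical, and the histories are far shorter than the 60-100 items he cites. The example assumes an observer holds a few stored histories and compares a newly observed set of visited sites against them using Jaccard similarity (shared sites divided by all sites seen in either set).

```python
def jaccard(a, b):
    """Share of sites two histories have in common, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


# Hypothetical stored histories, keyed by a pseudonymous profile ID.
known_profiles = {
    "profile_1": {"news.example", "cricket.example", "bank.example", "recipes.example"},
    "profile_2": {"news.example", "movies.example", "jobs.example", "travel.example"},
    "profile_3": {"news.example", "cricket.example", "jobs.example", "music.example"},
}


def best_match(observed, profiles):
    """Return the profile ID whose history is most similar, plus its score."""
    scored = sorted(
        ((jaccard(observed, history), pid) for pid, history in profiles.items()),
        reverse=True,
    )
    score, pid = scored[0]
    return pid, score


# A new, unlabeled browsing session with no name or email attached.
observed = {"news.example", "cricket.example", "bank.example"}
profile_id, score = best_match(observed, known_profiles)
print(profile_id, round(score, 2))
```

Every profile here visits the same news site, so that site alone identifies no one. It is the combination of sites that singles out one profile, and longer histories make such combinations more distinctive.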
Experts have said that these loopholes in the bill will likely lead to bigger problems down the road and have emphasized that the non-personal data aspect should be kept out of the final PDP Bill. Although the government’s committee of experts on the non-personal data governance framework made the same recommendation in its December 2020 report, there is no guarantee that these provisions will be left out of the bill.
“While the committee recognizes that anonymity can be reversed, it provides very little information regarding how it will govern in different sectors. This draft presents an opportunity to recommend, if not prescribe, that there should be a minimum standard for the anonymization technique and the need for a governance mechanism for anonymized data sets,” Rizvi said.