Shekar Hariharan
December 12, 2022
Enterprise data is a tremendous asset, but it can also be a source of significant privacy-related financial risk. With data privacy laws such as GDPR being strictly enforced around the world, companies are seeing heftier fines for data breaches. Fines issued for GDPR non-compliance increased sevenfold, from $180 million in 2020 to just under $1.25 billion in 2021 (Source: Securityweek). According to the third 2021 Insider Data Breach Survey commissioned by Egress, 94 percent of organizations have had a data breach in the last 12 months. For most companies, it is just a matter of time before they are confronted by an expensive and painful data breach.
According to IBM’s annual Cost of a Data Breach Report, the average cost of a data breach has increased from $3.86 million in 2020 to $4.35 million in 2022. This means that companies need to be extremely cautious about how they manage privacy risks by carefully controlling access to sensitive personal data.
With enterprise data growing at a fast clip, organizations are accumulating terabytes of customer and employee data, moving much of it to the cloud, and accessing it in real time as needed. The more data there is, the greater the risk. But what are companies doing about the inherent risks and costs associated with managing that data?
For this discussion, consider for a moment that an enterprise has decided to store its data consisting of 40 billion rows on the Snowflake data cloud (the data could be stored on any other cloud, but the issue remains the same). To understand the overall risks and costs associated with this data, we must understand the size of the risk and then determine the impact by factoring in the probability of a breach event. So, let us go ahead and do this analysis in a stepwise manner.
First, we need to understand the size of the data privacy risk based on the extent of enterprise data. The size of the risk is directly proportional to the amount of enterprise data held by an organization. Assuming that a company is based in the United States, Snowflake storage costs [1] begin at a flat rate of USD $23 per compressed TB of data stored per month, which translates to USD $276/TB/year. Based on our experience and analysis, 1 TB of Snowflake database storage holds roughly 6.1 billion rows. So, for every $1 a company spends on storage per year, it could store about 22.1 million rows (see Table 1 below for the computation).
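The rows-per-dollar figure can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are our own, and the 6.1 billion rows per TB value is the estimate quoted above, not a Snowflake specification.

```python
# Figures quoted in the article; rows per TB is an experience-based estimate.
STORAGE_USD_PER_TB_MONTH = 23
ROWS_PER_TB = 6.1e9


def rows_per_storage_dollar(usd_per_tb_month: float, rows_per_tb: float) -> float:
    """Rows that one dollar of annual storage spend can hold."""
    usd_per_tb_year = usd_per_tb_month * 12
    return rows_per_tb / usd_per_tb_year


annual_cost = STORAGE_USD_PER_TB_MONTH * 12
rows_per_dollar = rows_per_storage_dollar(STORAGE_USD_PER_TB_MONTH, ROWS_PER_TB)

print(f"Annual cost per TB: ${annual_cost}")
print(f"Rows per $1 of storage: {rows_per_dollar / 1e6:.1f} million")
```

Changing either constant shows how sensitive the result is to the assumed row density, which will vary with table width and compression.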
Based on our experience, roughly 3-5% of total enterprise data is personal information (PI) and about 1% is highly sensitive personally identifiable information (PII). Now, let us make a conservative assumption that the data contains 1% PI and 0.1% PII. Applied to the 22.1 million rows stored per $1, this equates to roughly 220k rows of PI and 22k rows of PII.
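The 220k and 22k figures follow from applying the assumed shares to the roughly 22.1 million rows held per $1 of storage. A minimal sketch, with variable names of our own choosing:

```python
# Rows per $1 of annual storage, as derived earlier in the article.
ROWS_PER_STORAGE_DOLLAR = 22.1e6

# Conservative shares assumed in the article.
PI_SHARE = 0.01    # 1% personal information
PII_SHARE = 0.001  # 0.1% personally identifiable information

pi_rows = ROWS_PER_STORAGE_DOLLAR * PI_SHARE
pii_rows = ROWS_PER_STORAGE_DOLLAR * PII_SHARE

print(f"PI rows per $1 of storage:  {pi_rows:,.0f}")
print(f"PII rows per $1 of storage: {pii_rows:,.0f}")
```

Substituting the observed 3-5% and 1% shares in place of the conservative ones scales both counts up proportionally.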
[1] How Usage-Based Pricing Delivers a Budget-Friendly Cloud Data Warehouse
Next, let us assess the data privacy risk for this data. Risk estimates depend on two factors: the size of the impact of an incident and the likelihood of its occurrence.
Recent IBM breach reports show that each breached record costs $181 in breach-related costs. So, the data held in $1 of storage, with 220k rows of PI and 22k rows of PII, would translate to roughly $5.46 million in breach-related costs (contact us to find out more on our calculation methodology).
The next question is: what is the likelihood of a breach occurring? Various studies put this number in a wide range. For this analysis, we used the Journal of Cybersecurity's 2016 estimate of a 4.5% chance that a breach of such magnitude could happen in an organization.
Using the above framework of the size of impact times likelihood, we calculate the total risk to be approx. USD $247k from the data contained in storage that costs $1.
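The impact-times-likelihood framework is a simple expected-loss calculation. The sketch below uses the impact and probability figures stated in the article; the function name is our own, and the result is only an approximation of the published figure because the full methodology is not disclosed here.

```python
# Inputs as stated in the article.
IMPACT_USD = 5.46e6         # estimated breach-related cost for the data in $1 of storage
BREACH_PROBABILITY = 0.045  # Journal of Cybersecurity 2016 estimate


def expected_loss(impact_usd: float, probability: float) -> float:
    """Expected loss = size of impact x likelihood of occurrence."""
    return impact_usd * probability


risk = expected_loss(IMPACT_USD, BREACH_PROBABILITY)
print(f"Expected loss per $1 of storage: ${risk:,.0f}")
```

With these rounded inputs the result lands slightly below the approx. $247k quoted above, which presumably reflects rounding in the underlying calculation.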
Finally, let us consider a scenario where a company sitting on 40 billion rows of enterprise data is worried about meeting compliance, getting hacked, or being penalized for privacy violations. This is a real-life scenario of a USD $70 billion asset management company that approached us when it took stock of its data governance, compliance, and data privacy risks. To perform a risk assessment, the company would traditionally need to hire a team of data engineers and commence a data audit process that would take 8-9 months. Using the estimates provided by this company and assuming an average salary of $200K per engineer, we calculated a total cost of USD $180k to perform a data audit and compliance assessment of 40 billion rows (or about $100 for every $1 of storage).
Table 2: Effort to perform data compliance audit assessment with a team of engineers (estimates provided by the company's internal team)
Considering that this process needs to be repeated every quarter, the company would be looking at an annual compliance assessment cost of USD $720k. (Note: this analysis does not factor in data stored in additional instances such as test and sandbox instances, which would also hold personal data. In all likelihood, the total cost could well exceed USD $1 million for this company, not to mention the additional time and effort.)
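The audit figures can be checked the same way. This is a rough sketch using the numbers quoted in the article; the variable names are our own, and the per-audit cost is taken as given from the company's estimate.

```python
# Figures quoted in the article.
TOTAL_ROWS = 40e9                 # rows of enterprise data
ROWS_PER_STORAGE_DOLLAR = 22.1e6  # rows held per $1 of annual storage
AUDIT_COST_USD = 180_000          # cost of one audit (company's estimate)
AUDITS_PER_YEAR = 4               # audit repeated every quarter

# Annual storage spend implied by the row count.
storage_spend = TOTAL_ROWS / ROWS_PER_STORAGE_DOLLAR

# Audit cost relative to each dollar spent on storage.
audit_cost_per_storage_dollar = AUDIT_COST_USD / storage_spend

# Annualized cost of quarterly audits.
annual_audit_cost = AUDIT_COST_USD * AUDITS_PER_YEAR

print(f"Implied annual storage spend: ${storage_spend:,.0f}")
print(f"Audit cost per $1 of storage: ${audit_cost_per_storage_dollar:,.2f}")
print(f"Annual audit cost: ${annual_audit_cost:,}")
```

The ratio comes out just under $100 per $1 of storage, in line with the rounded figure above.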
With the company’s enterprise data constantly expanding, the above audit process must be repeated periodically to adhere to compliance requirements, as determined by the company’s policy and local laws – so the costs and effort will increase over time. It is obvious that this process is cumbersome, time-consuming, and not scalable.
Companies are facing the growing challenge of having to carefully manage sensitive and personal data. It is evident from the above analysis that companies are sitting on a minefield of risk associated with their enterprise data. The specific customer example above illustrates the extent of that risk, both in terms of data privacy exposure and in terms of the time, effort, resources, and cost needed to manage privacy risks carefully.
Employing a team of engineers is simply not a scalable approach, since companies need to assess risks instantly rather than taking months to do a risk assessment and then repeating the process periodically. Protecto recognizes this pain and offers a solution that provides instantaneous and intelligent insights into where the risks lie. Quick discovery of risks enables companies to respond to them promptly and thereby avoid costly fines. As a bonus, companies can also save costs by eliminating the need to hire a team of data engineers to perform this tedious task. Moreover, the real-time insights will also accelerate compliance reporting.
Do you know the extent of your data privacy risks? Do you have a way to ascertain this quickly? If you are unable to answer these questions and are interested in a personalized risk assessment of your enterprise data, irrespective of which cloud data storage you use, contact us today for a free risk assessment. We will even do a demo to discuss how Protecto can help you uncover your data privacy risks. It would be prudent to act now, before being confronted by costly privacy fines. That is the last headache you would want.