Securing Data During a DevOps Lifecycle
By Graeme Forbes, Principal Consultant, Infinity Works
To me, an AWS account is three things:
1) A financial risk unit
2) A business risk unit
3) A technical risk unit
Why does this matter when dealing with data? Let’s go back to why that question was asked. It was asked because of a small piece of legislation called GDPR.
We had to collect all the data in a large organisation from source, categorise it, model it, and then make it available to the business in exploitable formats. Think reporting, analysis and data science. Some of this data would be PII (Personally Identifiable Information), some would be commercially sensitive or under a legal contract which defined its audience. The majority would be BAU data which the analysis community routinely used. So in terms of broad risk these are three separate levels. They also have three different governance requirements and, potentially, different lifecycles. In DevOps terms, managing different lifecycles, owners, use cases, and access requirements in one account quickly becomes nightmarish, especially if the engineers aren’t cleared to see parts of the data flowing through the account. There are also the built-in API limits of AWS services, so having a lot of ETLs (Extract, Transform, Load) running in one place could easily exhaust one, with CloudWatch as a prime example. So I had to come up with something that could satisfy the risks around who could see what, and that allowed a business owner to actually own their data and technically own their business area. It also had to be supportable.
Each account will also contain the necessary ETLs to push data into the central curated data account with the relevant DQ (Data Quality) rules, tags, and anonymisation processes. With an anonymisation process there is inevitably the requirement for a de-anonymisation service with a suitable RBAC (Role-Based Access Control) model. This sounds like the perfect candidate for the shared services account on the left side of the “break glass” wall.
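To make the anonymisation and de-anonymisation pairing concrete, here is a minimal sketch in Python. It is not the service described in the article: the keyed-hash (HMAC) tokens, the in-memory lookup table, the class name and the role name `pii-deanonymiser` are all assumptions made purely for illustration.

```python
import hashlib
import hmac
from typing import Dict, Set

# Hypothetical role name; the article does not define an RBAC vocabulary.
DEANONYMISE_ROLE = "pii-deanonymiser"


class PseudonymisationService:
    """Toy stand-in for an anonymisation / de-anonymisation pair."""

    def __init__(self, secret_key: bytes) -> None:
        self._key = secret_key
        # token -> original value. Assumed to live in the shared services
        # account in a real design; here it is just a dictionary.
        self._vault: Dict[str, str] = {}

    def anonymise(self, value: str) -> str:
        """Replace a value with a deterministic keyed-hash token."""
        token = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        self._vault[token] = value
        return token

    def deanonymise(self, token: str, caller_roles: Set[str]) -> str:
        """Return the original value, but only for callers holding the role."""
        if DEANONYMISE_ROLE not in caller_roles:
            raise PermissionError("caller lacks the de-anonymisation role")
        return self._vault[token]


if __name__ == "__main__":
    service = PseudonymisationService(b"demo-key-not-for-production")
    token = service.anonymise("jane@example.com")
    print(token != "jane@example.com")
    print(service.deanonymise(token, {DEANONYMISE_ROLE}))
```

Because the token is deterministic for a given key, the same person still joins up across datasets in the curated account, while only holders of the role can get back to the raw value.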
Haven’t mentioned that yet, have I? Let’s go back to that PII account, which has very strict governance around who and what can access the raw data. We’ll treat it as behind a “break glass” wall and only allow time-based access which is activated by a business process. It will feed all its logs and actions into a “break glass” audit account under the watchful eyes of governance. Commercially sensitive accounts (we can have many of these, based on their risk category) can choose to be behind “break glass” or not. As an example, you might not need “break glass” for the day-to-day of insurance sales data, but you will around that secret merger agreement. Once the merger is over you can archive the account (a unit of business risk) until its delete date (defined by law), when it can be closed.
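One way to picture time-based access is an IAM policy whose permissions only apply inside a window, using the `DateGreaterThan` and `DateLessThan` condition operators on the `aws:CurrentTime` key. The sketch below only builds such a policy document; the function name, the S3 action and the bucket ARN are illustrative assumptions, and the business process that would approve and attach the policy is out of scope.

```python
import json
from datetime import datetime, timedelta, timezone


def break_glass_policy(resource_arn: str, start: datetime, duration: timedelta) -> dict:
    """Build an IAM policy document that is only effective inside a time window.

    `start` must be timezone-aware; it is converted to UTC for the policy.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + duration
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action: read access to raw objects in S3.
                "Action": ["s3:GetObject"],
                "Resource": resource_arn,
                "Condition": {
                    "DateGreaterThan": {"aws:CurrentTime": start_utc.strftime(fmt)},
                    "DateLessThan": {"aws:CurrentTime": end_utc.strftime(fmt)},
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = break_glass_policy(
        "arn:aws:s3:::example-pii-raw/*",  # hypothetical bucket
        datetime.now(timezone.utc),
        timedelta(hours=2),
    )
    print(json.dumps(policy, indent=2))
```

Once the window closes the statement simply stops matching, so nobody has to remember to revoke the grant afterwards.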
Now we can give each business owner a production and a non-production account as a minimum and call that an allocation. This allows us to mitigate technical risk around deploying infrastructure and allows production et al to be handed over in the future. These accounts will be read-only trusted satellites of the curated data account, manage the ETL process(es) responsible for pushing data into a data warehouse, and publish their metadata into a data dictionary. As a DevOps engineer I’m not too worried about how, just that it’s possible. They will also be linked to a write-only audit account, with optional links to a shared services account. The business can have as many allocations as it needs and can account for them financially however it likes. Switching off an account because of overspend should only affect that business owner and the risk they have assigned to that allocation. If there is a security breach, the blast radius should be limited to the single account and whatever accepted risk that access grants.
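The write-only link to the audit account could be expressed in several ways. As one hedged illustration, assuming the audit trail lands in an S3 bucket, a bucket policy can grant the satellite accounts `s3:PutObject` and nothing else. The bucket name, account IDs and function name below are hypothetical.

```python
import json
from typing import List


def audit_write_only_policy(bucket: str, writer_account_ids: List[str]) -> dict:
    """Build an S3 bucket policy that lets the listed accounts write objects only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SatelliteAccountsMayOnlyWrite",
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{acct}:root" for acct in writer_account_ids]
                },
                # No read or delete actions are granted to the satellites.
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical production and non-production accounts of one allocation.
    policy = audit_write_only_policy("example-audit-logs", ["111111111111", "222222222222"])
    print(json.dumps(policy, indent=2))
```

Since the policy grants no read or delete permissions, a compromised satellite account can add audit records but gets no route from this policy to read or remove them, which keeps the blast radius in line with the single-account limit described above.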
So back to the real question at hand: SARs (Subject Access Requests) and the Right to be Forgotten. With this approach, all the information for dealing with that risk now exists in one account behind time-based access control, heavily audited and governed, and hopefully without an internet gateway. A service can be built to extract the data necessary to reduce the business and financial risk around GDPR, under the control of the business department accountable for that risk.