In May, the Swiss Bankers Association (SBA; German: Schweizer Bankiervereinigung SBVg) published a guide on “Handling data in day-to-day business”. The document outlines six specific use cases to illustrate the existing risks and responsibilities when creating value from data. In particular, finding the balance between “seizing opportunities” and “avoiding risks” is a significant challenge for companies. We are using this guide as an opportunity to shed light on the topic of “handling data.”
“Data, the new gold” – In the past several years, companies have been intensively investigating the opportunities that arise from the use of personal data. This opportunity raises the following questions for our customers:
- How may I use data? What precautions do I need to take?
- How should I use data? Which use cases are suitable for me?
- How can I use data? Are the organization and the technology ready?
In this article, we will look at the first question in detail.
How may I use data?
The SBA’s guide looks at this topic in detail and from the various perspectives of contract law, criminal law and data protection law. In summary, I conclude “yes, but”. It is essential to handle issues carefully and with appropriate consideration of risks. Two points, in particular, caught my attention.
Anonymization, pseudonymization, encryption
These three procedures were mentioned in Annex 3 of the FINMA Circular 08/21 as examples of measures for protecting customer data. Since then, I have encountered these points repeatedly in discussions with customers.
What is the difference between anonymization, pseudonymization and encryption? All three processes aim to protect personally identifiable information:
- Anonymization is the process of removing personally identifiable information entirely from data sets.
- Pseudonymization is the process of replacing personally identifiable information with artificial identifiers or codes.
- Encryption is the encoding of data so that it is not recognizable as plain text. Authorized recipients can decrypt the information only with the aid of a suitable key.
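To make the difference between the first two procedures concrete, here is a minimal Python sketch. It does not come from the SBA guide: the record layout, the field names and the functions `anonymize` and `pseudonymize` are assumptions chosen purely for illustration. Encryption is left out because it would require a third-party library.

```python
import hashlib
import hmac

# Hypothetical customer record, used for illustration only.
record = {
    "name": "Anna Muster",
    "customer_id": "C-1001",
    "age": 37,
    "city": "Zurich",
    "balance": 12500,
}

# Assumption: these fields identify a person directly.
DIRECT_IDENTIFIERS = ("name", "customer_id")


def anonymize(rec):
    """Remove the direct identifiers entirely; no code links back to the person."""
    return {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}


def pseudonymize(rec, secret_key):
    """Replace the direct identifiers with one artificial, repeatable code.

    The same person always gets the same code for a given key, so records
    can still be linked, but only the key holder can recompute the mapping.
    """
    identity = "|".join(str(rec[k]) for k in DIRECT_IDENTIFIERS)
    digest = hmac.new(secret_key, identity.encode("utf-8"), hashlib.sha256)
    result = anonymize(rec)
    result["pseudonym"] = digest.hexdigest()[:16]
    return result


anonymous = anonymize(record)
pseudonymous = pseudonymize(record, b"keep-this-key-secret")
```

Note that `anonymize` in this sketch only strips the direct identifiers. Attributes such as age and city remain, so whether the result is truly anonymous depends on the rest of the data set.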
The application of such methods is not trivial. On the one hand, re-identification should be ruled out. On the other hand, in many cases, data analysis only makes sense if personal data is also analyzed. Even seemingly harmless attributes can be revealing: in data sets with low resolution, for example, analyzing age in combination with place of residence would allow conclusions to be drawn about the person. Thus, pseudonymization and anonymization are not applicable in many analyses. When using encryption, a distinction must be made between the points at which data is encrypted and the points at which it is available in plain text:
- Data-in-motion are data during transmission from one point to another. This transmission can be on the company’s internal network or to an external location on the Internet. Data can be encrypted here without any problems within the application.
- Data-at-rest are stored data, for example, in a database or a file storage device. Encryption here is conceivable but may restrict the use of the data.
- Data-in-use are data that are actively used by an application, for example, during data evaluation by the analytics solution. In this step, the application using the data must have the data in plain text to extract/decode information from it.
The examples show that the methods of anonymization, pseudonymization and encryption are, at best, only partially applicable in the context of data analysis.
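The re-identification concern mentioned earlier (age combined with place of residence) can be checked with a simple count. The following Python sketch is an illustration under my own assumptions, not a method from the SBA guide: the function names and the sample rows are hypothetical. It counts how many records share each combination of attributes and reports the combinations that occur only once.

```python
from collections import Counter


def combination_counts(rows, attributes):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[a] for a in attributes) for row in rows)


def smallest_group_size(rows, attributes):
    """Return the size of the rarest combination (0 for an empty data set)."""
    counts = combination_counts(rows, attributes)
    return min(counts.values()) if counts else 0


def unique_combinations(rows, attributes):
    """Return the combinations that point to exactly one record."""
    counts = combination_counts(rows, attributes)
    return [combo for combo, n in counts.items() if n == 1]


# Hypothetical data set without names or customer numbers.
rows = [
    {"age": 37, "city": "Zurich"},
    {"age": 37, "city": "Zurich"},
    {"age": 82, "city": "Andermatt"},
]

at_risk = unique_combinations(rows, ("age", "city"))
```

In this sample, the combination of age 82 and Andermatt occurs only once, so that record could be traced back to a single person even though no name is stored. What group size counts as acceptable is a risk decision, not a technical one.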
In order to be able to use data, a transparent risk assessment must be carried out. I also think it’s important to include the data-related risks in the usual corporate risk process and regularly re-evaluate them. Ultimately, the company’s appetite for risk is the determining factor.
Use of artificial intelligence (AI)
I found it very interesting that the SBA includes the topic of artificial intelligence (AI). This is a branch of computer science that tries to teach computers intelligent behavior in an automated way. Many companies are daring to take their first steps with corresponding use cases, and some are already using AI-based solutions productively. When the focus is on the benefits, the risks are often forgotten, from the drawing board through to established solutions.
We have already been able to review some AI solutions and have developed a framework for this purpose. We pay special attention to the following:
- Data governance should be established, especially in the area of data quality. A human operator can filter out erroneous data using common sense; this is not (automatically) possible using AI.
- Regulatory requirements that can otherwise be met through organizational measures must be enforced technically in the AI environment.
- A transparent purpose of use is the prerequisite for the purpose limitation of the data, which is a key topic in data protection. At the latest when consent is revoked (the right to be forgotten), it is central to know for what purpose the data was processed. It is also foreseeable that regulations in the AI environment will require a directory of these purposes.
- Questions about controllability examine whether decisions are made directly by AI algorithms and to what extent these decisions are reversible. Ideally, the algorithms and models will be regularly checked for bias and possible discrimination.
- Transparency is important to make decisions comprehensible. The extent to which those affected (e.g., applicants or customers) are and must be informed about the use and background of AI should also always be discussed here.
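As an illustration of the regular bias check mentioned under controllability, the following Python sketch compares approval rates between groups. It is a deliberately simple assumption on my part, not our review framework and not a method from the SBA guide: the function names, the group labels and the sample decisions are hypothetical, and a difference in rates is only a signal for closer review, not proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of positive decisions per group.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def rate_gap(decisions):
    """Return the difference between the highest and lowest approval rate."""
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values()) if rates else 0.0


# Hypothetical decisions logged from an AI-based solution.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = approval_rates(decisions)
gap = rate_gap(decisions)
```

Here group_a is approved in three of four cases and group_b in one of four. Which gap triggers a review, and which groups are compared, would have to be defined as part of the governance around the solution.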
AI solutions require special supervision because they also involve extended risks. In addition to technical and regulatory issues, ethical aspects and reputational risk also need to be considered.
With the six use cases in its guide, the SBA provides some insight into what is currently possible. However, I often see a more straightforward starting point not in customer-facing topics but in the analysis of data from internal processes. This, too, raises several questions about risks, but they are easier to control.
In a future post, I will discuss how an organization can identify and prioritize possible use cases and what framework conditions should be met for data to be used.