5.1. Medical Data Categorization
To summarize medical findings or assign a certain observation to a body region, medical data categorization is needed. This can be helpful for a researcher or doctor to describe sets of data needed for the research or treatment instead of defining specific observations. For the patient medical data categorization offers, there is a way to hierarchically sort data in a more comprehensible way rather than a list of all observation. While many medical data categorizations offer ways to locate findings to body regions, it is still an expert system that requires basic medical knowledge. This may not be the optimal solution for every patient, so more research is needed in the field for an optimal patient interface to work with medical data. However, data categorization is needed for Dynamic Consent to provide the dynamic part of only allowing specific data access, and for the proactive part to enable proactive access to a whole category of data such as, for example, heart disease-related observation.
The two most common types of data categorization are International Statistical Classification of Diseases and Related Health Problems (ICD) by the World Health Organization (WHO) and SNOMED CT. The current version of ICD is ICD-10 (
https://icd.who.int/browse10/2019/en (accessed on 11 January 2022)), which was introduced in 1994. ICD is a classification system for medical diagnoses and health problems. ICD-10 is revised regularly and was recently extended with classifications related to the COVID-19 pandemic. It categorizes diseases and health problems in 22 chapters and contains around 14,000 different codes.
Figure 3 shows an excerpt of chapter 22 which is also used for the emergency listing of COVID-19. The finest division of an ICD-10 code is a combination of one capital letter and 3 digits (e.g., U07.1 for COVID-19). There are country-specific extensions of ICD and the WHO provides an application programming interface (API) for automated queries on the classification.
Another type of categorization is SNOMED CT which has the aim to represent clinical contents as unambiguously and as precisely as possible. It is the most comprehensive, multilingual clinical healthcare technology in the world, with more than 350,000 concepts. As for ICD, there are country- and language-specific versions of SNOMED CT. The terminology consists of three core components: concepts, descriptions and relationships. A concept is a numerical code identifying a clinical term organized in hierarchies. These codes contain textual descriptions of the specific term. There exist different types of relationships between concept codes such as “associated”, “due to”, “has focus” or “is a”.
Figure 4 presents a visualization of the hierarchical structure for COVID-19 in the SNOMED CT terminology. It remains to be noted that each hierarchy also has other elements (e.g., influenza under viral diseases). SNOMED CT also provides a rich API through the Expression Constraint Language (ECL) to define queries for SNOMED CT codes and classifications.
Table 4 presents a comparison of the advantages and disadvantages of SNOMED CT and ICD-10. ICD-10 provides a relatively small categorization with a low hierarchy depth. Therefore, it seems easier to understand and handle. Additionally, it is available in printed and electronic versions. On the downside, it is not complete and has a fixed granularity. The hierarchy does not follow any logical aspects and is mono-hierarchical. There are also no cross-references, e.g., there is no connection between a disease and symptoms. SNOMED CT seems large and complex. The hierarchies can be very deep, so it must be said that SNOMED is not human-readable without digital technology. On the positive side, it is the most comprehensive medical terminology available. It has a variable granularity and offers relationships and cross references between concepts. This is also offered in a multilevel structure. Furthermore, it also includes the ECL query language for a powerful search in SNOMED CT. Following this comparison and the potential advantages of SNOMED CT compared to ICD-10, SNOMED CT is used for the implementation of Dynamic Consent in this paper.
5.2. Dynamic Consent with XACML
As foundation for the Dynamic Consent XACML enforcement, a formal enforcement model is needed. To enable hierarchical requests to categories, functions are required to search through the SNOMED CT hierarchy. The functions descendantOf, ancestorOf and parentOf are based on actual ECL functions. With and , a list of all descendants or ancestors can be created for a SNOMED CT code. Additionally, the function lists the parent of a SNOMED CT code. It remains to be noted that a SNOMED CT code can have more than one parent.
Definition 18 (Hierarchical Functions). The hierarchical functions reflect the parent children relationship between different SNOMED CT codes.
,
where
The consent decision is made with the function , which returns or for a given SNOMED CT code. For the evaluation of the consent decisions, a root element must be set as SNOMED CT code (). This can be the highest SNOMED CT concept or any other code. In general, there are three cases.
- 1.
No policy exists for the given code, and it is the root element then is returned;
- 2.
A policy exists for the given code, the available policy is evaluated and the corresponding consent decision () is returned.
- 3.
Otherwise, will be called recursively for the parent of the given code.
The function checks whether a policy exists for the given code by checking the defined set () or the set ().
Definition 19 (Consent Decision). The consent decision returns the access decision for a given SNOMED-CT code.
The request function gets a tuple with the researcher and the requested categories and returns a set of different tuples. These tuples are the findings for which a patient has given consent to sharing.
Definition 20 (Request). The request function models a researcher request for data.
The combination of those functions results in the Dynamic Consent enforcement.
Definition 21 (Dynamic Consent Enforcement).
Let .
The first case describes when no finding exists under the requested category yet. Therefore, the function
is only executed for the requested category. Otherwise, the function
is executed for all findings of the given category. To summarize the cases where permission gets granted, the following is defined:
Permission is granted, either for the case that there exist no findings under the requested category, and the first policy of a predecessor of the requested category is a policy, or the category itself has a policy (Case 1). Otherwise, the first policies of a predecessor of all findings are all policies or they have a policy themselves (Case 2).
With this formal consent enforcement model XACML policies can be created. XACML is an attribute access control language where access decision is made according to the given attributes of the requested resource and the requesting subject [
29]. For more details on XACML, this paper refers to the original sources. In addition, the here-presented policies are written in the XACML dialect Abbreviated Language For Authorization (ALFA), which is used to write shorter and more comprehensible XACML policies [
30]. Listing shows the policy structure.
Listing 1. Final policy structure in ALFA syntax. |
|
Since the considerations are made per patient, a patient is set as a root policyset. With target clause, the targets get checked. This includes the patient, researcher or the SNOMED CT code. According to , the permission is initially denied, so all rule- and policy-combining algorithms are set to denyUnlessPermit. Since a consent has separate sets for and per researcher, every researcher is set as their own policyset and the separate sets as a policy. A policy contains the SNOMED CT codes (the identifiers in Listing act as placeholders for SNOMED codes in a real policy) as targets, for which the respective consent applies. Finally, the consent decisions are evaluated through the rule which is set in the policy. The necessary targets are defined as attributes in an additional file.
5.3. Implementation
As with CPIQ, the Dynamic Consent system is implemented in the prototypical research interface. Through the usage of SNOMED CT and XACML, a more sophisticated architecture is needed, which is shown in
Figure 5.
There are two parties in the scenario: the patient and the researcher. All systems that store data or manage the access are located at a research data center which is shown by the dotted line. The prototypical research interface is the access point for the researcher where they can post requests and get the corresponding findings. As data server, a HAPI FHIR (
https://hapifhir.io/hapi-fhir/ (accessed on 11 January 2022)) server is used which stores the patient data. HAPI FHIR is a state-of-the-art server implementation for the medical data format FHIR [
31]. As a tool, the patient used the previously mentioned “Patient App” which is extended for use of Dynamic Consent. With this app, the data on the FHIR Server can be viewed and consent policies for research usage can be created. Those policies can be stored on the AuthzForce (
https://authzforce.ow2.org/ (accessed on 11 January 2022)) server which is an open-source implementation of a XACML enforcement system. To add a new consent decision, medical data or categories as SNOMED, CT codes can be selected in the “Patients App” by explicitly setting the decision to
permit or
deny for the given code. To obtain the medical categories and hierarchical information for the findings of a patient the SNOMED CT terminology server Snowstorm (
https://github.com/IHTSDO/snowstorm (accessed on 11 January 2022)) is used, which can be used with ECL (
https://confluence.ihtsdotools.org/display/DOCECL (accessed on 11 January 2022)) queries. The corresponding consent decision will then be stored as XACML policy on the AuthzForce server. A researcher can now make research requests via the prototypical research interface, to request medical data from patients with the specification of a SNOMED CT code. The research interface now uses the FHIR server, AuthzForce and Snowstorm to obtain data which the patient has consented to share. The researcher receives a list of the patients and their corresponding findings as result.
Figure 6 shows the implementation in the “Patient App”. The screen in
Figure 6a displays the start view of the dynamic consent process. Here, the user can select the researcher for the consent and can open the category view to make their consent decision. This category view is shown in
Figure 6b where the user sees their findings in a hierarchical tree. Certain findings or whole categories can be chosen to be shared or explicitly denied from sharing. When the user has made their decision, the consent must be submitted, as shown in
Figure 6c. Finally,
Figure 6d shows the consent details of the consent for a specific researcher or research project. The numbers are SNOMED CT codes. For a real-world application, those codes should be replaced with the human-readable description of the code.
5.4. Evaluation
To evaluate the here-presented Dynamic Consent, it is evaluated against requirements that are derived from the properties of Dynamic Consent, as mentioned in
Section 3.1. Additionally, the GDPR sets requirements for informed consent which are also examined.
Table 5 shows the properties of Dynamic Consent and the status of implementation in the presented technology. A checkmark (√) indicates that this property is working properly. A dot (◯) shows that this property is not implemented.
Dynamic Consent requires that it be possible to grant access to a broad sample of data (
DC.1). This can be allowed using the category selection in the category view, which is shown in
Figure 6b.
DC.2 requires that a consent can be modified dynamically. The “Patient App” allows users to modify their consent at any time. This also satisfies
DC.10 because the consent modification occurs in real-time. The changes also come into effect immediately (
DC.11) because every request goes through the AuthzForce server, which has always up-to-date polices. In addition, a Dynamic Consent requires that a consent can be given (
DC.3) and can be taken back (
DC.4) for every single finding individually. The category view also enables this functionality since a single finding can be permitted or denied for a consent. Furthermore, it should be possible to receive an overview of all shared data for certain authorized parties (
DC.5). As
Figure 6d shows, this is also possible with the “Patient App”. Other requirements are that the purpose of the research or data usage should be always clear (
DC.6), that there should be a transaction log on data usage (
DC.7), that participants can choose the level of information about the data usage (
DC.8), and that a researcher has ways to inform patients (
DC.9). While those are requirements for Dynamic Consent, these are not necessary enforcement specific requirements. Therefore, they were considered out-of-scope. However, with the technology of the “Patient App”, most of these properties can be implemented. For example, a dedicated feedback function can fulfill
DC.8 and
DC.9. Alternatively, a general data usage log can give the functionality required for
DC.6 and
DC.7. Finally, the “Patient App” and the corresponding technology architecture can be considered as modern technologies so
DC.12 is fulfilled.
Table 6 shows an overview of the GDPR requirements for informed consent.
Article 4 No. 11 GDPR requires that a consent be the freely given wish of the subject’s data sharing preference (Reg.1). As a patient can only add consent themself, the consent is freely given. Furthermore, with the explicit indication of permit and deny, the patient signalizes whether the consent is allowed or not. Reg.2 and Reg.3 set requirements regarding the withdrawal. It needs to be possible at any time and as easy as giving consent. Through the consent management system in the “Patient App”, those requirements can be considered fulfilled. Withdrawal at any time is possible either completely or on a fine granular level. Furthermore, it uses the same system as giving consent, so it should not be more complicated. In Article 5(1)(b) GDPR, it is required that the purpose of a declaration of consent is limited (Reg.4). While this can be favored by the implementation this is not enforced on a technological level. The last GDPR requirement derived from Article 9(2)(a) requires a consent to be explicit for one or more specific purposes (Reg.5). As with the previous requirement, this can be favored by our implementation by giving each consent a choice of purpose or a static one, but it is not enforced technologically yet.