A Comparative Assessment of Threat Identification Methods in EHR Systems

This paper describes the threat identification process for electronic health records systems. Identifying threats is the first step in developing more secure health care applications, and it is especially useful when done in the initial phase of application development. The recent increase in the number and severity of attacks on health care applications shows how important it is to focus on cybersecurity. The process of threat identification is explained with three different tools: the attack tree diagram, the data flow diagram and the process flow diagram. Then the three most common threat models are explained in relation to electronic health records systems: STRIDE, CIA and LINDDUN. The comparison of the three threat identification tools is demonstrated in practice on a use case scenario of an electronic health records system which is currently being piloted in Ghana and Indonesia. For the data flow diagram, Microsoft Threat Modeling Tool is selected to automatically generate threats using the STRIDE model. The generated threats are then compared to the threats identified manually with the attack tree diagram. The process flow diagram is used to visualize users' interactions with the electronic health records system. Based on the identified threats, countermeasures are suggested to limit the vulnerabilities of similar electronic health records systems.


INTRODUCTION
In recent years, the healthcare industry has witnessed exponential growth in the use of information technology. Hospitals and healthcare facilities have implemented electronic health record (EHR) systems, patient portals, and other digital technologies to improve patient care and streamline administrative processes. However, the increased reliance on digital technology brings an increased risk of cyber threats [15], as shown by recent attacks such as the NHS ransomware attacks in 2017 and 2022 [12] and the Barcelona hospital attack in 2023 [14], which resulted in the cancellation of 150 operations and 3000 patient checkups. Cybersecurity in hospitals has become an increasingly important issue, as these facilities store sensitive patient data that is vulnerable to attack.
Hospitals are particularly vulnerable to cyberattacks due to the high value of patient data, the complex interconnectedness of medical devices, and the need for 24/7 access to critical systems. The consequences of a cyberattack on a hospital can be devastating, including compromised patient safety, financial losses, and reputational damage. Furthermore, a cyberattack on a hospital can have wider implications for public health, as it can disrupt the availability of essential healthcare services and can lead to deaths, as in the first reported death caused by a ransomware attack, at Düsseldorf University Clinic in 2020 [2].
Despite the risks, many hospitals have been slow to adopt comprehensive cybersecurity measures. This is partly due to the high cost of cybersecurity solutions, but also because of a lack of awareness and understanding of the risks involved. In addition, the healthcare industry has been known to lag behind other industries in terms of cybersecurity practices.
The purpose of this paper is to compare different threat identification methods for EHRs, and to identify best practices for selecting and implementing these methods in a healthcare setting. By doing so, the paper aims to contribute to the ongoing efforts to enhance cybersecurity in healthcare and protect patient data from cyber threats.

RELATED WORK
A systematic review of security and privacy in EHR systems [4], conducted on 126 papers, identified a lack of research addressing security and privacy at the same time. A similar review of 55 studies was conducted in [18]. That paper identified 13 essential security features for EHR systems and provided recommendations for EHR systems, mostly in the areas of privacy, security, scalability, interoperability, and availability.
An attack tree methodology for identifying threats in EHR systems was first explored in [7]. The paper presented a general EHR architecture and proposed three main attack vectors: client, server and network. The attack tree identified 42 attacks. The paper also proposed a method for prioritizing attacks using attack cost and probability of detection as quantitative metrics, but these were not applied to the attack tree. Finally, 9 countermeasures were suggested.
A STRIDE-based threat model for telehealth systems was proposed in [5]. The paper carried out the entire threat modeling process: asset identification, trust level definition, data flow diagram, threat identification and, finally, a mitigation plan. In total, 25 threats were identified and 11 countermeasures were proposed.
Security techniques for the EHR were summarized in [11]. The paper analyzed 25 journal articles and reviews and defined three main security safeguard themes: physical, technical, and administrative. The paper emphasized the importance of constantly improving the security of EHR systems to cope with emerging threats.
Finally, a feature comparison of the two most common threat modeling tools, Microsoft Threat Modeling Tool and OWASP Threat Dragon, was carried out in [9].
While the current research explores general security and privacy aspects of EHR systems, our paper is unique in showing threat identification methods on a practical use case that is in real-world use.

Health Care Cybersecurity Regulations and Standards
There are several different regulations regarding health care cybersecurity [8]. The most important ones, which should be considered for every new EHR system, are:
• GDPR (General Data Protection Regulation) - a regulation covering any personal (identifying) data, including the following three types of health data: medical records, genetic data and biometric data [6].
• ISO/TS 14441:2013 - defines security and privacy requirements for electronic patient records, together with guidelines and best practices for conformity assessment [1].
• The HIPAA Privacy Rule - defines standards to protect individuals' medical records through three safeguards: technical, physical and administrative [3].

THREAT IDENTIFICATION METHODS IN EHR
Threat identification methods are especially important in the healthcare domain, as they make it possible to prioritize security efforts and allocate resources more effectively within traditionally very limited budgets.
The health sector is also subject to strict security, data protection and privacy regulations, as shown in section 2.1, and compliance with them might be required. This section describes threat identification methods from the perspective of an EHR system.

Process
Threat identification is an important part of many security frameworks [22]. For example, in OWASP's Software Assurance Maturity Model version 2 (SAMM) [17], the step "Threat Assessment" includes threat modeling.
Threat modeling is a structured approach for identification, quantification and prioritization of security risks in a system [10]. The goal of threat modeling is to proactively identify and address potential security risks before they can be exploited by malicious actors.
According to [20], threat modeling steps usually include:
(1) Application decomposition - the first step involves gathering information about the system and presenting it in the form of a data flow diagram (DFD).
(2) Threat identification and quantification - threats are identified and classified based on one of the threat models (STRIDE, PASTA, ...). The risk is then quantified using a risk assessment model such as DREAD or a risk formula.
(3) Countermeasures and mitigation - involves identification of countermeasures and classification of the current mitigation state (non-mitigated, partially mitigated, fully mitigated).
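As an illustration of the quantification step, the following is a minimal Python sketch of a DREAD-style score, computed here as the mean of the five DREAD factors (damage, reproducibility, exploitability, affected users, discoverability). The 1-10 rating scale, the class name DreadScore and the example ratings are assumptions made for illustration; they are not taken from the use case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DreadScore:
    """One rating per DREAD factor, each on an assumed 1-10 scale."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def risk(self) -> float:
        """Return the mean of the five factor ratings."""
        factors = (self.damage, self.reproducibility, self.exploitability,
                   self.affected_users, self.discoverability)
        for value in factors:
            if not 1 <= value <= 10:
                raise ValueError("each DREAD factor must be between 1 and 10")
        return sum(factors) / len(factors)


# Hypothetical ratings for a phishing threat; illustrative values only.
phishing = DreadScore(damage=8, reproducibility=6, exploitability=7,
                      affected_users=8, discoverability=6)
print(phishing.risk())  # 7.0
```

Sorting threats by such a score gives one possible ordering for the mitigation step; other scales or weightings would serve equally well.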

Tools
There are three main tools for threat identification as described below.

Attack tree diagram.
An attack tree diagram is a graphical representation used to identify and analyze the potential attack vectors that an attacker might take to compromise the security of a system. It is based on the idea that an attacker must complete a sequence of steps or conditions to achieve a goal, and each step or condition can be represented as a node in the tree. The root node represents the ultimate goal of the attacker (e.g., gaining unauthorized access or stealing sensitive data). The tree branches out into sub-trees, each representing a different attack vector that an attacker might take to achieve the goal. Each node in the tree represents a specific condition or step in the attack, and the nodes are connected by logical operators such as "AND" or "OR" to show how the conditions are interrelated.
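The AND/OR structure described above can be sketched as a small recursive data structure. This is a minimal Python illustration, not the tree from the use case: the Node class, the feasible flag on leaves and the example steps are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node in an attack tree; leaves are concrete attacker steps."""
    label: str
    gate: str = "LEAF"        # "AND", "OR", or "LEAF"
    feasible: bool = False    # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)

    def achievable(self) -> bool:
        """Evaluate whether the attacker can reach this node."""
        if self.gate == "LEAF":
            return self.feasible
        results = [child.achievable() for child in self.children]
        if self.gate == "AND":
            return all(results)   # every sub-step is required
        if self.gate == "OR":
            return any(results)   # any one alternative is enough
        raise ValueError(f"unknown gate: {self.gate}")


# Hypothetical example tree.
unattended = Node("Use an unattended logged-in client", feasible=True)
credentials = Node("Use stolen credentials", gate="AND", children=[
    Node("Phish a user's password", feasible=True),
    Node("Bypass the second factor", feasible=False),
])
root = Node("Access patient records", gate="OR",
            children=[credentials, unattended])

print(root.achievable())  # True, via the unattended client branch
```

Marking a leaf as infeasible (for example after a countermeasure is applied) and re-evaluating the root shows which branches still lead to the goal.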

Data Flow Diagrams.
Data flow diagrams show a high-level visualization of the system operation. The main focus is on data movement in the system, as opposed to how users interact with the system. Data flow diagrams describe the system from a general perspective and do not consider the attacker's viewpoint. They therefore lack the ability to analyze possible attack vectors and entry and exfiltration points [19]. Using data flow diagrams for threat identification requires security expertise and extensive discussions between the system developers and operators.

Process Flow Diagrams.
Process flow diagrams map the system from the perspective of users' interactions, similarly to use case diagrams. They describe each user's actions in reaching the required goal. Because process flow diagrams capture the interactions of each user, they represent potential attack vectors into the system from the attacker's perspective [19].

Threat models
Threat models define the methodology of threat identification and classification. This section evaluates the three most commonly used ones.

STRIDE.
The STRIDE model defines 6 threat categories, each violating a specific cybersecurity property, as follows [10]:
• Spoofing - a false identity is used, violating the authentication property. This can be misuse of a doctor or admin account to access sensitive patient data.
• Tampering - unauthorized modification of data, violating the integrity property. This can endanger a patient's life if the medical records are not correct.
• Repudiation - a malicious action cannot be associated with the attacker, violating the non-repudiation property. This can hide the attacker's actions and make attack detection and recovery procedures more difficult.
• Information disclosure - exposure of sensitive information, violating the confidentiality property. This can include sensitive patient data.
• Denial of service - making the information unavailable to legitimate users, violating the availability property. This can lead to postponed medical treatment, endangering a patient's life.
• Elevation of privilege - obtaining more privileges than are associated with the account, violating the authorization property. This can lead to more serious attacks.
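The category-to-property mapping listed above can be captured in a small lookup structure, which is convenient when tagging identified threats. The following Python sketch is illustrative only; the names Stride and categories_for are hypothetical.

```python
from enum import Enum


class Stride(Enum):
    """STRIDE categories; each value is the security property it violates."""
    SPOOFING = "authentication"
    TAMPERING = "integrity"
    REPUDIATION = "non-repudiation"
    INFORMATION_DISCLOSURE = "confidentiality"
    DENIAL_OF_SERVICE = "availability"
    ELEVATION_OF_PRIVILEGE = "authorization"


def categories_for(security_property: str):
    """Return the STRIDE categories that violate the given property."""
    return [c for c in Stride if c.value == security_property]


print(Stride.TAMPERING.value)          # integrity
print(categories_for("availability"))  # [<Stride.DENIAL_OF_SERVICE: 'availability'>]
```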

CIA.
The CIA model is the basic model used in information security, and it describes the following three security properties:
• Confidentiality - keeping sensitive and private information safe from unauthorized users who might try to exploit weak system configuration or user behavior (phishing). Confidentiality is most critical for sensitive patient information. It is achieved through encryption, strong authentication and training against phishing.
• Integrity - protecting the information against unauthorized modification. Such a modification of patient data could lead to serious injury or death. Unauthorized modification can happen either when data is stored or during transit (man-in-the-middle attacks). Protection includes storage encryption and the use of encrypted communication protocols.
• Availability - keeping the data continuously available to authorized users. Making patient data unavailable can lead to postponed operations and treatment, jeopardizing patients' health, as was the case in the recent ransomware attack in Barcelona [14]. Unavailability can be caused not only by an attack (denial of service or ransomware), but also by configuration mistakes, hardware faults (including power supply), or natural disasters. Protection involves redundant hardware (virtualization), a proper backup strategy (3-2-1), use of cloud computing and effective monitoring tools.

LINDDUN.
LINDDUN [13] is a framework with three main steps: model the system, elicit threats and manage threats. The framework divides threats into the following categories:
• Linkability - data objects can be linked because they share the same property. This can lead to a privacy violation, especially when a person can be identified (similar disease, etc.).
• Identifiability - data items can be linked to the identity of the data subject. This increases the chance of sensitive data leaks (data about patients).
• Non-repudiation - a security goal preventing a subject from denying an action. This requires thorough implementation of a logging and monitoring system which keeps track of all actions performed in the EHR.
• Detectability - the ability to determine whether data exists without needing to access it. This can cause privacy issues, for example when somebody infers a patient's diagnosis based only on an entry in a specific hospital department.
• Disclosure of information - getting access to personal information. In an EHR, this concerns patient data.
• Unawareness - lack of transparency when dealing with personal data. This includes processing the data of patients who are not aware of it.
• Non-compliance - not complying with data protection principles. This might be a violation of patient privacy, or the collection and processing of unnecessary data.

USE CASE: EHR SYSTEM IN GHANA AND INDONESIA

The EHR system introduction
We have selected an electronic health records (EHR) system which is currently being piloted in Ghana and Indonesia. The system is web-based, with backend APIs developed in the Python Django framework, as shown in Figure 1. The front end is developed with React.js. The EHR system can be deployed as software as a service. This means that several hospitals can use it while sharing the same database, without being able to interfere with each other's data. The software also provides access to patients, who can log in to view reports and use other functions that they are permitted to access. The modules in this EHR include patient registration, vital signs, consultation and diagnosis. Other modules, including pharmacy, laboratory, radiology, procedure and admission modules, have also been incorporated. Most of the modules are being used by hospitals that have adopted the system in Ghana. In Indonesia, the system was customised for use by private practice facilities. The customization was done using user-centred design and co-creation to enhance both usability and security. The selection of these health facilities was done through convenience sampling. A maternity module which enables the operation of both antenatal and postnatal functions is being implemented. OWASP's Top 10 list of security risks was considered throughout the development life cycle of this system.

Threat identification process
For the threat identification, we used the three tools described in Section 3.2. We used them in the following order: attack tree diagram, data flow diagram and process flow diagram. During the threat identification process, the system developers were consulted for technical details and possible weak points to ensure that the maximum number of relevant threats was found.

The goal
The goal of this use case threat identification process was to demonstrate the differences between the described tools rather than to provide comprehensive, transferable results, as each use case is significantly different. The results are also highly dependent on the discussion with the system developers.

RESULTS
This section describes the results of the three threat identification tools.

Attack Tree Diagram
The main goal of the attack tree diagram was to find ways to attack the EHR system. In order to create an attack tree diagram, the system vulnerabilities had to be identified first. We used the OWASP Top 10 Web Application Security Risks [16] for the web server part of the system.
The complete attack tree diagram of the EHR system is shown in Figure 2. In total, six main threat vectors were identified: malware infection, use of a logged-in client, shoulder surfing, attacks through web vulnerabilities, attacks through obtaining login credentials, and theft of an electronic gadget such as an employee's laptop, flash disk, or smartphone. Three of these attack vectors (malware attacks, attacks through web vulnerabilities, and login credentials misuse) were then extended with several specific threats; the web vulnerabilities were extended using the OWASP Top 10 Web Application Security Risks [16]. In total, 24 threats were identified.

Data Flow Diagram
The main goal of the data flow diagram is to identify all threats from the perspective of data moving through the system. Microsoft Threat Modeling Tool was used in our use case, as opposed to OWASP Threat Dragon, mainly due to its support for more elements and its ability to automatically generate threats. This significantly speeds up the process and reduces the required qualifications of the people performing it.
Upon drawing the diagram of the EHR system as shown in Figure 3, Microsoft Threat Modeling Tool generated 90 threats. Table 1 shows the distribution of those threats in the STRIDE model categories.
The automatically generated threats then had to be inspected and marked as one of the following options:
• Not started - the default state for all automatically generated threats.
• Needs investigation - a relevant threat, which needs protective measures to be applied.
• Not applicable - not a relevant threat in this scenario.
When compared to the attack tree diagram, Microsoft Threat Modeling Tool revealed a new threat which was not previously detected: denial of service for various resources (database, web server, network, cloud storage). It includes purposeful attacks as well as server or application crashes.
The list of generated threats also revealed the use of different terminology. Specifically, this includes:
• Spoofing threats - correspond to "broken authentication" in the attack tree diagram.
• Logging-related threats - correspond to "security misconfiguration".
• Elevation of privilege - corresponds to "broken access control": various attacks, mostly on the web server and the devices' web browsers, aimed at gaining additional privileges.
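Comparing the two threat lists therefore requires normalizing the terminology first. The following Python sketch shows one way this could be done, using the three correspondences reported above; the function names and the abbreviated example lists are hypothetical and do not reproduce the full set of threats from the use case.

```python
# Terminology correspondences reported in the text:
# tool wording -> attack tree wording.
TOOL_TO_TREE = {
    "spoofing": "broken authentication",
    "logging": "security misconfiguration",
    "elevation of privilege": "broken access control",
}


def normalize(tool_category: str) -> str:
    """Translate a tool category into attack tree wording where known."""
    key = tool_category.lower()
    return TOOL_TO_TREE.get(key, key)


def only_in_tool(tool_categories, tree_threats):
    """Return normalized tool threats that the attack tree does not contain."""
    tree = {t.lower() for t in tree_threats}
    return sorted({normalize(c) for c in tool_categories} - tree)


# Abbreviated, illustrative lists (not the complete results).
tree_threats = ["Broken authentication", "Security misconfiguration",
                "Broken access control", "Shoulder surfing"]
tool_categories = ["Spoofing", "Logging", "Elevation of privilege",
                   "Denial of service"]

print(only_in_tool(tool_categories, tree_threats))  # ['denial of service']
```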
Finally, the Microsoft Threat Modeling Tool can generate an HTML report that structures the threats by interaction, where an interaction is a one-way data exchange between two elements. Threats are listed for each interaction and sorted by STRIDE category. An example from the generated report is shown in Figure 4. This example shows an interaction between the web server and the cloud storage. For this interaction, 8 threats were generated, with an additional 5 threats for the interaction in the opposite direction.

Process Flow Diagram
The main goal of the process flow diagram is to map the system from the perspective of users' interactions. Figure 5 shows an example of a simplified process flow diagram with abstracted functionality. The full diagram is not presented for confidentiality reasons, as it includes technical details about the EHR system functionality.
The diagram did not reveal any threats beyond those found with the previous diagrams, but its graphical representation was useful for visually presenting the various user roles and their privileges.

DISCUSSION AND RECOMMENDATIONS

Discussion
The results in section 5 show clear differences between the tools in terms of the threats found.
The attack tree diagram identified unique threats related to users (unattended clients, shoulder surfing, social engineering, etc.), physical situations (device theft, keylogging), and different ways of getting a malware infection.
Use of Microsoft Threat Modeling Tool for the data flow diagram allowed us to automatically generate 90 threats and to identify an additional important threat not found in the attack tree diagram: denial of service.
The process flow diagram did not reveal new threats but acted as an important tool for the visualization of users' interactions with the system.
The main difference between the threats identified by the attack tree and those generated from the data flow diagram by the Microsoft Threat Modeling Tool is their scope. While the attack tree diagram identifies threats for the entire system, the Microsoft Threat Modeling Tool distinguishes each component and generates threats specific to each pair of interacting elements. This provides more technical information about where exactly the threat can occur in the system, but also tends to generate many redundant threats (which can be mitigated with one solution for the entire system).
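One way to handle such redundancy is to group the per-interaction threats by category and look for categories that recur across several interactions, since these are candidates for a single system-wide mitigation. The Python sketch below illustrates the idea; the function names and the example tuples are hypothetical and are not taken from the generated report.

```python
from collections import defaultdict

# Hypothetical generated threats: (source element, target element, category).
generated = [
    ("web server", "cloud storage", "Denial of service"),
    ("web server", "database", "Denial of service"),
    ("browser", "web server", "Denial of service"),
    ("browser", "web server", "Spoofing"),
]


def group_by_category(threats):
    """Map each category to the list of interactions it was reported on."""
    grouped = defaultdict(list)
    for source, target, category in threats:
        grouped[category].append((source, target))
    return dict(grouped)


def shared_mitigation_candidates(threats, min_count=2):
    """Return categories reported on at least min_count interactions."""
    grouped = group_by_category(threats)
    return sorted(c for c, pairs in grouped.items() if len(pairs) >= min_count)


print(shared_mitigation_candidates(generated))  # ['Denial of service']
```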
Threat identification is the first part of the process. The next step is to classify the threats by assigning each a priority that determines how urgently it needs to be mitigated. The last step is to determine countermeasures and mitigations.

Recommendations
Table 2 shows the threats identified with the attack tree and data flow diagrams and presents recommended countermeasures to limit or eliminate their severity. For instance, to mitigate malware delivered via email, user training, email configuration and anti-malware software need to be adopted [21]. This mode of attack (phishing) has been one of the most common attack vectors, accounting for over 80% of global security breaches attributed to the human aspect of security practice [23].

CONCLUSION
The paper described the threat identification process in electronic health records (EHR) systems and demonstrated it on a use case EHR application which is currently piloted in Ghana and Indonesia.
The use case showed differences in threat identification between the methods: the attack tree diagram, the data flow diagram and the process flow diagram. Each method excelled in slightly different areas. The attack tree method was most useful for finding general threats for the entire EHR. The disadvantage of the method was the relatively high level of analyst knowledge and research needed to find all the relevant threats.
The data flow diagram drawn using the Microsoft Threat Modeling Tool was most useful for automatically generating specific threats for each pair of interacting elements in the diagram. The disadvantage is the need to filter the generated threats, as a large number of them might not be relevant. After comparing the threats with the attack tree diagram, one new important threat was identified: denial of service.
The process flow diagram was used mostly for visualization at a high level of abstraction, as it would otherwise reveal sensitive information about the system. The main purpose of the diagram was to visually display the activities performed by different user roles.
With stricter rules and regulations and an increasing number of attacks on EHR systems, it is important to make the systems as secure as possible. Incorporating threat identification methods into the initial stages of system development reduces the risk of an attack and can save the money and time otherwise needed to fix the issues later or to deal with an attack. From the presented results, we can recommend the attack tree diagram and data flow diagram methods for conducting such an identification. While both methods show very similar results, if resources allow, it is advisable to use both approaches in order to lower the chance of a missed threat and to gain the maximum insight into the system's vulnerabilities.

Figure 2: Attack tree diagram of the EHR system

Figure 3: Data flow diagram of the EHR system created in the MS Threat Modeling Tool

Figure 4: MS Threat Modeling Tool Report

Figure 5: Process flow diagram of the EHR system

Table 2: Security recommendations
Threat | Recommended countermeasures
... | Least privilege principle, access control mechanisms, monitor and log, fail safe
Insecure deserialization | Integrity checks, encryption, log
Cross-site scripting (XSS) | Safe frameworks, escape untrusted characters, context-sensitive encoding, CSP (content security policy), input filters
Insufficient logging and monitoring | Enable for all high-value actions, effective monitoring, alerts, incident response plan
Security misconfiguration | Only required features, prioritizing system for updates and patches, strong application architecture
Denial of service | Redundant architecture, cloud, firewall configuration