Completing and Debugging Ontologies: state of the art and challenges

As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important although difficult task in ontology engineering. A key step is ontology debugging and completion. In general, there are two steps: detecting defects and repairing defects. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing the repairing step as an abduction problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and show opportunities for further work and advancing the field.


Introduction
Ontologies (e.g., [1]) aim to define the basic terms and relations of a domain of interest, as well as the rules for combining these terms and relations. They standardize terminology in a domain and are a basis for semantically enriching data, semantic search, integration of data from different data sources, and reasoning over the data. Using ontologies can alleviate the variety (data sources are heterogeneous regarding the type and nature of data they store), variability (data can be inconsistent) and veracity (not all data can be trusted) problems. Furthermore, they are proposed as an enabler making data FAIR, i.e., findable, accessible, interoperable, and reusable, with the purpose of enabling machines to automatically find and use the data, and individuals to easily reuse the data [2]. Ontologies are also a key technology for the semantic web.
In recent years many ontologies have been developed (see [3] for a survey on ontology libraries). Further, ontologies have been connected to each other into ontology networks and there are some portals that store these (e.g., BioPortal (http://bioportal.bioontology.org/), Unified Medical Language System (http://www.nlm.nih.gov/research/umls/about_umls.html)). However, developing ontologies and ontology networks is not an easy task, and there may be issues related to the quality of the ontologies and networks [4]. Two such issues are incorrectness (does the ontology contain wrong information?) and incompleteness (is information lacking?). Ontologies that contain wrong information or lack information, although often useful, also lead to problems when used in semantically-enabled applications: wrong conclusions may be derived or valid conclusions may be missed. As an example, in [5] it was shown that semantically-enabled querying of PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) using MeSH (Medical Subject Headings, http://www.nlm.nih.gov/mesh/) with one piece of information missing (i.e., that scleritis is a scleral disease) would lead to missing 55% of the results that are obtained with this piece of information. Therefore, it is essential to complete and debug ontologies and their networks.
arXiv:1908.03171v2 [cs.AI] 2 Nov 2020
Defects in ontologies can take different forms (e.g., [6]). Syntactic defects are usually easy to find and to resolve. Defects regarding style include such things as unintended redundancy. More interesting and severe defects are the modeling defects and the semantic defects. Modeling defects relate to the domain that is being modeled and include such things as missing concepts and relations, or statements that are not correct according to the domain. For instance, in [7] it was shown that for the two ontologies used in the Anatomy track in the Ontology Alignment Evaluation Initiative (OAEI, yearly event for evaluation of ontology alignment systems), at least 121 and 83, respectively, is-a relations (defined below) that are correct in the domain are missing in these ontologies. Further, BioPortal contains results for the OAEI Anatomy track with ca 20% incorrect statements and ca 20% of the connections between ontologies are missing. Semantic defects relate to logical defects such as defining concepts that are logically equivalent to the empty set (called unsatisfiable concepts, formally defined below) or ontologies that contain contradictions. As an example, in [6] it was shown that the TAMBIS ontology contained 144 unsatisfiable concepts.
In this paper we review approaches for raising the quality of ontologies by repairing them. In general, completing and debugging requires two steps. In the detection step possible defects are detected. In the repairing step the detected wrong information is removed and missing information is added. In this paper we focus on the repairing step which can be formalized as an abduction problem and for which there are still many open research problems.
The remainder of the paper is organized as follows. In Sect. 2 we discuss preliminaries introducing notions related to ontologies (Sect. 2.1), description logics which are used as a basis for the formalization of the repairing problem (Sect. 2.2), and a short review of methods for the detection step (which is not the focus of the paper) and its relation to the repairing step (Sect. 2.3). In Sect. 3 we relate this survey to earlier surveys. Then, in Sect. 4 we formalize ontology repair (completion and debugging) as an abductive reasoning problem and, as there may be different ways to repair ontologies, we introduce different preference relations between solutions that are relevant to this problem. In Sect. 5 and 6 we discuss the state of the art, based on our formalization, for ontologies and ontology networks, respectively, regarding debugging, completion and the combination of these. Further, we give some open problems related to theory and algorithms as well as regarding user involvement in Sect. 7. The paper concludes in Sect. 8.

Ontologies and ontology networks
From a knowledge representation point of view, ontologies may contain four components: (i) concepts that represent sets or classes of entities in a domain (e.g., Fracture in the example in the appendix, representing all fractures), (ii) instances that represent the actual entities (e.g., an actual fracture), (iii) relations (e.g., hasAssociatedProcess in the example in the appendix), and (iv) axioms that represent facts that are always true in the topic area of the ontology (e.g., a fracture is a pathological phenomenon). Concepts and relations are often organized in hierarchies using the is-a (or subsumption) relation, denoted by ⊑. When P ⊑ Q, then P is a sub-concept of Q and all entities belonging to P also belong to Q. Axioms can represent such things as domain restrictions, cardinality restrictions, or disjointness restrictions. Many ontologies do not contain instances and represent knowledge on the concept (or 'schema') level. In this paper we do not deal with instances.
Ontologies can be represented in different ways, but one of the more popular ways nowadays is to use (variants of) the OWL language (e.g., https://www.w3.org/TR/2012/REC-owl2-primer-20121211/). OWL is based on description logics, which are presented in Sect. 2.2. In description logics concepts, roles, individuals and axioms are used, which relate to concepts, binary relations, instances and axioms in ontology terminology, respectively.
An ontology network is a collection of ontologies and pairwise alignments between these ontologies. An alignment is a set of mappings (also called correspondences) between entities from the different ontologies. The most common kinds of mappings are equivalence mappings as well as mappings using is-a and its inverse.

Description logics
In this paper we assume that ontologies are represented using a description logic TBox. We introduce notions in the field of description logics that are relevant for this paper.
Description logics [8] are knowledge representation languages. In description logics, concept descriptions are constructed inductively from a set N_C of atomic concepts and a set N_R of atomic roles and (possibly) a set N_I of individual names. Different description logics allow for different constructors for defining complex concepts and roles. In current work on completing and debugging different logics are used, such as EL (which uses the top concept ⊤, and the concept constructors conjunction and existential restriction) and ALC (top concept ⊤, bottom concept ⊥ and concept constructors conjunction, disjunction, negation, and existential and universal restrictions). We mention later also SHOIN, which in addition to the ALC constructors allows constructors for transitive roles, role hierarchies, nominals, inverse roles and number restrictions. In Table 1 we show the syntax of the ALC constructors and refer to, e.g., [9] for information on other description logics.
An interpretation I consists of a non-empty set Δ^I and an interpretation function ·^I which assigns to each atomic concept P_a ∈ N_C a subset P_a^I ⊆ Δ^I, to each atomic role r_a ∈ N_R a relation r_a^I ⊆ Δ^I × Δ^I, and to each individual name i ∈ N_I an element i^I ∈ Δ^I. The interpretation function is straightforwardly extended to complex concepts (see Table 1). A TBox is a finite set of general concept inclusions (GCIs) (in Table 1) and role inclusions (RIs). An interpretation I is a model of a TBox T if for each GCI and RI in T, the semantic conditions are satisfied. We say that a TBox T is inconsistent if there is no model for T. Further, we say that a TBox is incoherent if it contains an unsatisfiable concept, where a concept P in a TBox T is unsatisfiable if for all models I of T: P^I = ∅.
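The semantic conditions can be made concrete for a finite interpretation. The following Python sketch is an illustration only: the tuple encoding of concepts, the function names `extension` and `is_model`, and the toy interpretation are assumptions of this example and not part of the formalization. It evaluates ALC concepts with the standard set semantics and checks whether every GCI of a TBox is satisfied.

```python
def extension(concept, domain, concepts, roles):
    """Return the extension of an ALC concept in a finite interpretation.

    Encoding (an assumption of this sketch): "TOP", "BOT", an atomic
    concept name, or a tuple ("not", C), ("and", C, D), ("or", C, D),
    ("some", r, C), ("all", r, C).
    """
    def ev(c):
        return extension(c, domain, concepts, roles)

    if concept == "TOP":
        return set(domain)
    if concept == "BOT":
        return set()
    if isinstance(concept, str):
        return set(concepts.get(concept, ()))
    op = concept[0]
    if op == "not":
        return set(domain) - ev(concept[1])
    if op == "and":
        return ev(concept[1]) & ev(concept[2])
    if op == "or":
        return ev(concept[1]) | ev(concept[2])
    if op in ("some", "all"):
        pairs = roles.get(concept[1], ())
        filler = ev(concept[2])
        result = set()
        for x in domain:
            successors = {y for (a, y) in pairs if a == x}
            if op == "some" and successors & filler:
                result.add(x)
            if op == "all" and successors <= filler:
                result.add(x)
        return result
    raise ValueError(f"unknown constructor: {op!r}")


def is_model(tbox, domain, concepts, roles):
    """True if every GCI (left, right) in tbox holds in the interpretation."""
    return all(
        extension(left, domain, concepts, roles)
        <= extension(right, domain, concepts, roles)
        for left, right in tbox
    )


# Toy TBox with two GCIs: P5 ⊑ ∀s.P8 and P6 ⊑ ∃s.¬P8.
tbox = [("P5", ("all", "s", "P8")), ("P6", ("some", "s", ("not", "P8")))]
domain = {1, 2}
roles = {"s": {(1, 2)}}
print(is_model(tbox, domain, {"P5": {1}, "P6": set(), "P8": {2}}, roles))
print(is_model(tbox, domain, {"P5": {1}, "P6": {1}, "P8": {2}}, roles))
```

In the second interpretation, element 1 belongs to P6 but has no s-successor outside P8, so the second GCI fails and the interpretation is not a model.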
One of the main reasoning tasks for description logics is subsumption checking, in which the problem is to decide for a TBox T and concepts P and Q whether T ⊨ P ⊑ Q, i.e., whether P^I ⊆ Q^I for every model I of TBox T. We then also say that Q subsumes P or that P is-a Q.
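For the restricted case where a TBox contains only inclusions between atomic concepts, subsumption checking reduces to reachability in the is-a graph. The sketch below covers only this case and is not a substitute for a description logic reasoner; the function name `subsumes` and the concept name Phenomenon are assumptions made for illustration.

```python
from collections import defaultdict


def subsumes(tbox, sub, sup):
    """Decide sub ⊑ sup for a TBox of atomic inclusions (child, parent)."""
    parents = defaultdict(set)
    for child, parent in tbox:
        parents[child].add(parent)
    seen, stack = {sub}, [sub]
    while stack:
        node = stack.pop()
        if node == sup:
            return True
        for parent in parents[node] - seen:
            seen.add(parent)
            stack.append(parent)
    return False


# Hypothetical is-a axioms for illustration.
tbox = {
    ("Fracture", "PathologicalPhenomenon"),
    ("PathologicalPhenomenon", "Phenomenon"),
}
print(subsumes(tbox, "Fracture", "Phenomenon"))
print(subsumes(tbox, "Phenomenon", "Fracture"))
```

The first query holds by transitivity of is-a, while the second does not, since nothing in the TBox places Phenomenon below Fracture.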

Completing and debugging workflow
In this paper we review approaches for raising the quality of ontologies by repairing them. In general, completing and debugging requires two steps. In the detection step defects are found using different approaches. In the repairing step, on which this paper focuses, the detected wrong information is removed and missing information is added. Although we focus on the repairing step and most work on repairing assumes that the detection step is done, we briefly discuss detection and the connection to repairing.

Detection
There are many kinds of approaches to detect defects in ontologies and many of these are complementary. A detection method that works for all kinds of defects is inspection of the ontologies. This requires ontology development environments that provide search, reasoning and explanation facilities to aid the domain expert in the inspection.
Most detection methods for semantic defects are logic-based and focus on wrong information in the ontologies. A common strategy is to detect unsatisfiable concepts or inconsistencies in ontologies or ontology networks using standard reasoning techniques. One problem that is reported in [10] is that, as an unsatisfiable concept that is used in the definition of other concepts may make these unsatisfiable as well, description logic reasoners could give large lists of unsatisfiable concepts. One way to alleviate this problem is as in [6] where these 'root' concepts are identified (although in [6] they do this during the repairing step).
There are many approaches to find missing information. There is much work on finding candidate relationships between terms in the ontology learning area [11]. In this setting, new ontology elements are derived from text using knowledge acquisition techniques. Another paradigm is based on machine learning and statistical methods, such as the k-nearest neighbors approach [12], association rules [13], bottom-up hierarchical clustering techniques [14], supervised classification [15] and formal concept analysis [16].
A much used approach is to use patterns. The pioneering research conducted in this line is in [17], where the focus was on finding missing is-a relations. The work defines a set of lexico-syntactical patterns indicating is-a relationships between words in the text. However, depending on the chosen corpora, these patterns may occur rarely. Thus, although the approach has a reasonable precision, its recall is very low. Many variants have been proposed. Lexico-syntactic patterns as well as logic patterns have been used to find wrong as well as missing information in ontologies (e.g., [18,19,20,21]) and ontology networks (e.g., [22,23]). The OOPS! system implements a variety of these [21].
There are also approaches that use knowledge that is intrinsic in an ontology network to detect defects. For instance, in [24,25] a partial alignment between ontologies is used to detect missing is-a relations. These are found by looking at pairs of equivalence mappings. If there is an is-a relation between the terms in the mappings belonging to one ontology, but there is no is-a relation between the corresponding terms in the other ontology, then it is concluded that there is a candidate missing is-a relation in the second ontology. A similar approach is used in [26].
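The mapping-based detection idea can be sketched as follows. This is a simplified illustration rather than the implementation used in [24,25]: ontologies are reduced to sets of is-a pairs between named concepts, the partial alignment is a dictionary of equivalence mappings, and all concept names are hypothetical.

```python
def ancestors(isa_edges, start):
    """Concepts reachable from start via is-a edges, start included."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for child, parent in isa_edges:
            if child == node and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def candidate_missing_isa(isa_known, isa_checked, mapping):
    """Candidate missing is-a relations in the checked ontology.

    mapping: dict from concepts of the first ontology to their
    equivalent concepts in the checked ontology.
    """
    candidates = set()
    for p in mapping:
        for q in ancestors(isa_known, p):
            if q == p or q not in mapping:
                continue
            p2, q2 = mapping[p], mapping[q]
            if q2 not in ancestors(isa_checked, p2):
                candidates.add((p2, q2))
    return candidates


# Hypothetical ontologies and equivalence mappings.
onto1 = {("o1:wrist_joint", "o1:joint"), ("o1:joint", "o1:body_part")}
onto2 = {("o2:joint", "o2:body_part")}
mapping = {
    "o1:wrist_joint": "o2:wrist_joint",
    "o1:joint": "o2:joint",
    "o1:body_part": "o2:body_part",
}
print(sorted(candidate_missing_isa(onto1, onto2, mapping)))
```

Calling the function a second time with the ontologies swapped and the mapping inverted covers the other direction. The returned pairs are only candidates and still need validation by a domain expert.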
The detection of missing mappings is a research area on its own, i.e., ontology alignment [27], and we discuss this further in Sect. 6.
We note that, for missing information, these detection approaches usually do not detect all defects and they do not guarantee that the detected information is really missing. The found defects are therefore candidate defects which need to be validated by a domain expert.

Workflow
In much of the current work we find systems or methods that detect defects or repair defects, but usually not both.
A workflow for a system for completing and debugging ontologies and ontology networks contains two main steps: detection and repair. It is well-known that for high-quality results a domain expert needs to be involved in both steps. As the defects found by detection systems usually are candidate defects, a domain expert needs to validate the candidate defects as wrong input to the repairing step would lead to wrong repairs of the ontologies. Furthermore, in the repairing step domain experts are needed to validate repairs for modeling problems. Also for the semantic defects a domain expert is needed, as systems that are purely logic based may prefer logically correct solutions that are not correct according to the domain over solutions that are (e.g., [28]). The two steps do not need to be completely separated. For instance, when repairing the ontology new information is added or wrong information is deleted and this may be used to detect further defects.
As an example, the RepOSE system [24,25] is a system for debugging and completing the is-a structure and mappings in ontology networks, where only the named concepts and the is-a relations in the ontologies are considered. RepOSE has a detection step that uses knowledge intrinsic in the network to detect candidate missing is-a relations and mappings. These candidate defects are then validated by a domain expert. If a candidate missing is-a relation or mapping really is missing, then we need to complete; otherwise a wrong is-a relation or mapping is derivable from the network and thus we need to debug. The validated and classified defects (missing or wrong) are then used as input for the repairing step. Different algorithms are used for repairing different kinds of defects, but the sub-steps for each kind are the generation of repairing actions (what to add or delete), the ranking of repairs (a proposed order in which to deal with the defects), the recommendation of repairing actions (using external knowledge) and finally, the execution of the repairing actions chosen by the domain expert (with computation of the consequences of the action). The consequences can include such things as other defects also being repaired, possible repairing actions for other defects changing, or new candidate defects being found. Furthermore, at any time during the process, the user can switch between different ontologies, start earlier phases, or switch between the repairing of different kinds of defects. The process ends when there are no more defects to deal with.

Related work
There are early surveys from 2007 [29,30] where debugging approaches are reviewed. In this paper we introduce a framework with preference relations that in addition to debugging also includes completion, and that allows us to compare the different approaches in a uniform way. The early surveys discuss approaches where the ontologies can include instances, which we do not. Further, [29] introduces some criteria for debugging approaches. Regarding the criterion application the authors distinguish between different tasks. In this paper we focus on the repair task. The granularity of the repairs is on the axiom level in this paper. Regarding TBox and ABox support, we focus on TBox support. Some algorithms focus on consistency and some on incoherence. For some of the methods, an implementation is available. Regarding support of ontology networks, we have a dedicated section to this topic (Sect. 6). User involvement is discussed in Sect. 4.1.2 and Sect. 7.2. We mention the criteria preservation of structure, complexity and exploitation of context in relevant places in this paper.

Ontology Repair
In this section we focus on repairing ontologies represented in description logics, where we have already detected wrong and missing information. We only discuss ontologies at concept level, and thus do not deal with instances (or individuals in description logic terminology). We define this problem as an abductive reasoning problem, and as a repairing problem can have many solutions, we discuss preference relations between these. Further, we use an abstract example to exemplify the notions. However, in the appendix we give an example inspired by the Galen ontology.

Repair
Definition 1 (Repair) Let T be a TBox and C be the set of all atomic concepts in T. Let M and W be finite sets of TBox axioms. Let Or be an oracle that given a TBox axiom returns true or false. A repair for Complete-Debug-Problem CDP(T, C, Or, M, W) is any pair of finite sets of TBox axioms (A, D) such that
(i) ∀ ψ_a ∈ A: Or(ψ_a) = true;
(ii) ∀ ψ_d ∈ D: Or(ψ_d) = false;
(iii) (T ∪ A) \ D is consistent;
(iv) ∀ ψ_m ∈ M: (T ∪ A) \ D ⊨ ψ_m;
(v) ∀ ψ_w ∈ W: (T ∪ A) \ D ⊭ ψ_w.

Def. 1 formalizes the repair of an ontology for which missing and wrong information is given. An ontology is represented by a TBox T with its set of atomic concepts C. The identified missing and wrong information is represented by a set M of missing axioms, and a set W of wrong axioms. To repair the TBox, a set A of axioms that are correct according to the oracle should be added to the TBox and a set D of axioms that are not correct according to the oracle should be removed from the TBox such that the new TBox is consistent, the missing axioms are derivable from the new TBox and the wrong axioms are not derivable from the new TBox. As an example, consider the CDP in Fig. 1 and visualized in Fig. 2. Then R1, R2, R3, R4 and R5 (visualized in Fig. 3) are all repairs of the CDP.

Fig. 1 (example CDP):
T: {ax1: P1 ⊑ P2, ax2: P1 ⊑ P3, ax3: P1 ⊑ ¬P4, ax4: P2 ⊑ P4, ax5: P2 ⊑ P5, ax6: P3 ⊑ P5, ax7: P3 ⊑ P6, ax8: P4 ⊑ P7, ax9: P5 ⊑ ∀s.P8, ax10: P6 ⊑ ∃s.¬P8}
C: {P1, P2, P3, P4, P5, P6, P7, P8}
Or(X) = true for X = P1 ⊑ P3 (ax2), P1 ⊑ ¬P4 (ax3), P1 ⊑ P6, P2 ⊑ P3, P2 ⊑ P4 (ax4), P2 ⊑ P5 (ax5), P2 ⊑ P6, P2 ⊑ P7, P2 ⊑ ∀s.P8, P3 ⊑ P6 (ax7), P4 ⊑ P3, P4 ⊑ P5, P4 ⊑ P6, P4 ⊑ P7 (ax8), P4 ⊑ ∀s.P8, P5 ⊑ ∀s.P8 (ax9), P7 ⊑ P3, P7 ⊑ P6, and axioms derivable from this list (e.g., if P ⊑ Q, then also P ⊓ O ⊑ Q).
Or(X) = false if X is not in the list above (for true) or cannot be derived from the axioms in the list above.
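The requirements on a repair can be checked mechanically once an entailment test is available. The sketch below is a simplification under stated assumptions: the TBox contains only is-a axioms between atomic concepts, so entailment is graph reachability and the repaired TBox is always consistent; the oracle is a plain Python predicate; and the names `entails` and `is_repair` as well as the example axioms are hypothetical.

```python
def entails(tbox, axiom):
    """Atomic is-a entailment by reachability (stand-in for a reasoner)."""
    sub, sup = axiom
    seen, stack = {sub}, [sub]
    while stack:
        node = stack.pop()
        for child, parent in tbox:
            if child == node and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sup in seen


def is_repair(tbox, added, deleted, missing, wrong, oracle):
    """Check whether (added, deleted) repairs the given problem."""
    repaired = (set(tbox) | set(added)) - set(deleted)
    return (
        # every added axiom is correct according to the oracle
        all(oracle(ax) for ax in added)
        # every deleted axiom is wrong according to the oracle
        and not any(oracle(ax) for ax in deleted)
        # consistency always holds for atomic is-a axioms, so no check here
        # every missing axiom is derivable from the repaired TBox
        and all(entails(repaired, ax) for ax in missing)
        # no wrong axiom is derivable from the repaired TBox
        and not any(entails(repaired, ax) for ax in wrong)
    )


# Hypothetical problem: B ⊑ C is wrong, A ⊑ C is missing.
tbox = {("A", "B"), ("B", "C")}
correct = {("A", "B"), ("A", "C")}
oracle = lambda ax: ax in correct
missing, wrong = {("A", "C")}, {("B", "C")}

print(is_repair(tbox, {("A", "C")}, {("B", "C")}, missing, wrong, oracle))
print(is_repair(tbox, set(), {("B", "C")}, missing, wrong, oracle))
```

The second call fails because deleting B ⊑ C alone also removes the only derivation of the missing axiom A ⊑ C, which illustrates why debugging and completion interact.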
In general, the set of all axioms that are correct according to the domain and the set of all axioms that are not correct according to the domain are not known beforehand. Indeed, if these sets were given then we would only have to add the axioms of the first set to the TBox and remove the axioms in the second set from the TBox. The common case, however, is that we do not have these sets, but instead, we can rely on a domain expert who can decide whether an axiom is correct or not according to the domain. Therefore, in the formalization we introduce an oracle Or that represents the domain expert and that, when given an axiom, returns true or false.
For Or we identified the following interesting cases. The first case is the all-knowing oracle. In this case the oracle's answer is always correct. This is the ideal case, but may not always be achievable. Most current work considers this kind of oracle. In the second case, the limited all-knowing oracle, if Or answers, then the answer is correct, but it may not know the answer to all questions. This case represents a domain expert who knows a part of the domain well. An approximation of this case is when there are several domain experts who may have different opinions and we use a skeptical approach: only if all domain experts give the same answer regarding the correctness of an axiom do we consider the answer. In the third case Or can make mistakes regarding the validation of axioms. Axioms that are not correct according to the domain may be validated as correct and vice versa. This is the most common case. Although most current work assumes an all-knowing oracle, recent work used, in addition to an all-knowing oracle, also oracles with specific error rates in the evaluation of ontology alignment systems [31]. A lesson learned was that oracles with error rates up to 30% were still beneficial for the systems. The fourth case represents situations where no domain expert is available and there is no validation of axioms, such as in fully automated systems.
As noted, most current work considers an all-knowing oracle. With an all-knowing oracle we can check that ∀ ψ_m ∈ M: Or(ψ_m) = true, and ∀ ψ_w ∈ W: Or(ψ_w) = false and if this is not the case, we can remove the falsely identified defects. Therefore, we can, without loss of generality, assume that the axioms in M really are missing, and the axioms in W really are false. Furthermore, regarding repairs, when using an all-knowing oracle, we know that all added axioms in A are correct according to the domain and all removed axioms in D are false according to the domain. Furthermore, for an all-knowing oracle, we know that . When using other oracles, we cannot be sure that the given missing and wrong axioms really are missing and wrong, respectively. Therefore, oracles that make mistakes or do not know the correctness of all axioms may start with wrong input. Also, wrong axioms may be added and correct axioms may be removed during the repairing. These issues may have a negative effect on the quality of the repaired ontology.
In practice, when using domain experts, it is not possible to know which kind of domain expert is used. When only one domain expert is available it is reasonable for the systems to assume that an all-knowing expert is used, although we should be aware that mistakes can occur. When more domain experts are available, a skeptical approach or a voting approach may be used for raising the quality of the ontology.
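The skeptical and voting approaches for combining several domain experts can be sketched as oracle combinators. This is an illustration under assumptions: each expert is modeled as a Python function that returns True or False for an axiom, None stands for "no answer", and the names `skeptical` and `majority` as well as the example experts are hypothetical.

```python
def skeptical(experts):
    """Answer only when all experts agree, otherwise give no answer."""
    def oracle(axiom):
        answers = {expert(axiom) for expert in experts}
        return answers.pop() if len(answers) == 1 else None
    return oracle


def majority(experts):
    """Answer with the majority vote, and give no answer on a tie."""
    def oracle(axiom):
        votes = [expert(axiom) for expert in experts]
        yes = sum(votes)
        if 2 * yes == len(votes):
            return None
        return 2 * yes > len(votes)
    return oracle


# Three hypothetical experts who accept different sets of axioms.
experts = [
    lambda ax: ax in {"ax_a", "ax_b"},
    lambda ax: ax in {"ax_a"},
    lambda ax: ax in {"ax_a", "ax_b"},
]
careful, voted = skeptical(experts), majority(experts)
for axiom in ("ax_a", "ax_b", "ax_c"):
    print(axiom, careful(axiom), voted(axiom))
```

The skeptical combination behaves like a limited oracle that stays silent on ax_b, whereas voting always commits when there is a majority and can therefore still make mistakes.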

Preference relations
As there may exist many possible repairs for a given CDP, and not all are equally interesting, it is necessary to define preference relations between repairs.

Basic preferences
From the completion perspective of a complete-debug-problem it is important to find repairs that add as much correct information as possible to the ontology, while from the correctness perspective wrong information should be removed as much as possible. Def. 2 and 3 formalize these intuitions respectively.
Def. 2 states that a repair R is more complete than another repair R' if all correct statements that can be derived from the ontology repaired by R' also can be derived from the ontology repaired by R and there is a correct statement that can be derived from the ontology repaired by R, but not from the ontology repaired by R'. Therefore, if R is more complete than another repair R' then the ontology repaired by R contains more correct statements than the ontology repaired by R'.
Further, when the same correct statements can be derived from the ontology repaired by R and the ontology repaired by R', then R and R' are equally complete.
Definition 2 (more complete) Let R = (A, D) and R' = (A', D') be two repairs for CDP(T, C, Or, M, W). R is more complete than R' (or R is preferred to R' with respect to 'more complete') iff (∀ψ: ((T ∪ A') \ D' ⊨ ψ ∧ Or(ψ) = true) → (T ∪ A) \ D ⊨ ψ) ∧ (∃ψ: Or(ψ) = true ∧ (T ∪ A) \ D ⊨ ψ ∧ (T ∪ A') \ D' ⊭ ψ).
Def. 3 states that a repair R is less incorrect than another repair R' if all wrong statements that can be derived from the ontology repaired by R also can be derived from the ontology repaired by R' and there is a wrong statement that can be derived from the ontology repaired by R', but not from the ontology repaired by R. Therefore, if R is less incorrect than another repair R' then the ontology repaired by R contains fewer wrong statements than the ontology repaired by R'.
Further, when the same wrong statements can be derived from the ontology repaired by R and the ontology repaired by R', then R and R' are equally incorrect.
Def. 4 defines a classical preference relation for abduction problems related to removing redundancy using the subset relation. It compares the add and delete sets of two repairs.
As examples, for the CDP in Fig. 1 we have that R3 ⊂ R2 ⊂ R1 and R4 ⊂ R5 ⊂ R1. Further, R1 is less incorrect than R2, R3, R4 and R5. R2 and R3 are equally incorrect, and R4 and R5 are equally incorrect. We also have that R1, R2, R3 and R5 are equally complete and they are more complete than R4.
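The 'more complete' and 'less incorrect' relations amount to proper-subset tests on the correct and wrong statements derivable from each repaired ontology. The sketch below assumes that these derivable statements have already been computed over a finite set of candidate axioms (in general the set of derivable axioms is not finite) and that the oracle is a Python predicate; all statement names are hypothetical.

```python
def more_complete(derived_r, derived_r2, oracle):
    """True if R derives a proper superset of the correct statements of R'."""
    correct_r = {ax for ax in derived_r if oracle(ax)}
    correct_r2 = {ax for ax in derived_r2 if oracle(ax)}
    return correct_r2 < correct_r


def less_incorrect(derived_r, derived_r2, oracle):
    """True if R derives a proper subset of the wrong statements of R'."""
    wrong_r = {ax for ax in derived_r if not oracle(ax)}
    wrong_r2 = {ax for ax in derived_r2 if not oracle(ax)}
    return wrong_r < wrong_r2


# Hypothetical derivable statements of two repaired ontologies.
oracle = lambda ax: ax in {"c1", "c2", "c3"}
derived_r = {"c1", "c2", "w1"}
derived_r2 = {"c1", "w1", "w2"}

print(more_complete(derived_r, derived_r2, oracle))
print(less_incorrect(derived_r, derived_r2, oracle))
print(more_complete(derived_r2, derived_r, oracle))
```

Both relations are strict partial orders, so two repairs can also be incomparable, for instance when each derives a correct statement that the other does not.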

Preferred repairs with respect to a basic preference
Based on these preference relations we can define repairs that are preferred with respect to one particular preference relation (Def. 5-7).
Definition 5 (maximally complete) A repair R = (A, D) for CDP(T, C, Or, M, W) is said to be maximally complete (or preferred with respect to 'more complete') iff there is no repair R' which is more complete than R.
Definition 6 (minimally incorrect) A repair R = (A, D) for CDP(T, C, Or, M, W) is said to be minimally incorrect (or preferred with respect to 'less incorrect') iff there is no repair R' which is less incorrect than R.
As examples, for the CDP in Fig. 1 we have that R3 and R4 are subset minimal, R1 is minimally incorrect, and R1, R2, R3 and R5 are maximally complete.

Combining preferences
The criteria regarding completeness and correctness are desirable as completeness leads to more correct information and correctness leads to less incorrect information. In most cases also the reduction of redundancy is desirable (but see below for cases where this is not the case). Therefore, we define different ways to combine these criteria. First, we need to define when a repair dominates another repair with respect to preference relations (Def. 8). A repair R dominates another repair R' if R is at least equally preferred to R' for each of a selected set of preference criteria and more preferred for at least one of those.
Definition 8 (dominate) Let R = (A, D) and R' = (A', D') be two repairs for CDP(T, C, Or, M, W). R dominates R' with respect to a set of preference relations P ⊆ {more complete, less incorrect, ⊂} if R is more or equally preferred to R' for all preference relations in P ∧ R is more preferred to R' for at least one of the preference relations in P.
Using the definition of dominate, we can now define a preference relation that combines the basic preference relations, but which prioritizes one of those. We prefer repairs that are preferred with respect to a prioritized basic preference relation, and that are not dominated by other such repairs. Def. 9 formalizes this.
Definition 9 (combining with priority to one of the preference relations) Let X ∈ {more complete, less incorrect, ⊂}. Let P ⊆ {more complete, less incorrect, ⊂} \ {X}. A repair R for CDP(T, C, Or, M, W) is said to be X-optimal with respect to P iff R is preferred with respect to X and there is no other repair that is preferred with respect to X and dominates R with respect to P.
We can also define a preference relation that combines basic preference relations, but where the basic preference relations have equal priority. In this case we prefer repairs that are not dominated by other repairs according to the selected basic preferences. Def. 10 formalizes this.
Definition 10 (combining with equal priority) A repair R for CDP(T, C, Or, M, W ) is said to be skyline-optimal with respect to P iff there is no other repair that dominates R with respect to P.
We note that if a repair is X-optimal with respect to P, then it is skyline-optimal with respect to P ∪ {X}.
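Dominance and skyline-optimality can be illustrated with a small computation for P = {more complete, less incorrect}. The sketch assumes that each repair has already been summarized by the finite sets of correct and wrong statements derivable from the ontology it produces; the repair names and statements are hypothetical and unrelated to the repairs of Fig. 1.

```python
def dominates(r, r2):
    """R dominates R' for P = {more complete, less incorrect}.

    r and r2 are (correct, wrong) pairs of frozensets of statements
    derivable from the respective repaired ontologies.
    """
    correct_r, wrong_r = r
    correct_r2, wrong_r2 = r2
    at_least_equal = correct_r >= correct_r2 and wrong_r <= wrong_r2
    strictly_better = correct_r > correct_r2 or wrong_r < wrong_r2
    return at_least_equal and strictly_better


def skyline(repairs):
    """Names of the repairs that no other repair dominates."""
    return {
        name
        for name, summary in repairs.items()
        if not any(
            dominates(other, summary)
            for other_name, other in repairs.items()
            if other_name != name
        )
    }


# Hypothetical summaries: (derivable correct, derivable wrong statements).
repairs = {
    "Ra": (frozenset({"c1", "c2"}), frozenset({"w1"})),
    "Rb": (frozenset({"c1"}), frozenset({"w1"})),
    "Rc": (frozenset({"c1"}), frozenset()),
}
print(sorted(skyline(repairs)))
```

Here Rb is dominated by both Ra and Rc, while Ra and Rc are incomparable: Ra is more complete and Rc is less incorrect, so both are skyline-optimal.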
As examples, for the CDP in Fig. 1 we have that R4 is ⊂-optimal with respect to {less incorrect} and R3 is ⊂-optimal with respect to {more complete}. Further, R1 is less-incorrect-optimal with respect to {more complete} and more-complete-optimal with respect to {less incorrect}.
The advantage of maximally complete and more-complete-optimal repairs is that a maximal body of correct information is added to the ontology and for the latter without redundancy and/or with removing as much wrong information as possible. The advantage of minimally incorrect and less-incorrect-optimal repairs is that a maximal body of wrong information is removed from the ontology and for the latter without redundancy and/or with adding as much correct information as possible. Although these are the most attractive repairs, in practice it is not clear how to generate such repairs, apart from a usually infeasible brute-force procedure that checks the correctness of all axioms with the oracle. (Although a strategy can be devised to check all without asking the oracle for each axiom, the number of requests will still be large.)
Repairs prioritizing subset minimality ensure that there is no redundancy. The advantage of removing redundant axioms is the reduction of computation time as well as the reduction of unnecessary user interaction. However, in some cases redundancy may be interesting. For instance, developers may want to have explicitly stated axioms in the ontologies even though they are redundant. This can happen, for instance, for efficiency reasons in applications or because domain experts have validated asserted axioms, which may then be considered more trusted than derived axioms. Furthermore, focusing on redundancy may lead to less complete or more incorrect repairs.
Skyline-optimal is a relaxed criterion. When, for instance, P = {more complete, less incorrect}, then a skyline-optimal repair with respect to P is a preferred repair with respect to correctness for a certain level of completeness, or a preferred repair with respect to completeness for a certain level of correctness. In practice, as it is not clear how to generate more-complete-optimal and less-incorrect-optimal repairs, a skyline-optimal repair may be the next best thing, and in some cases (e.g., Sect. 5.2) it is easy to generate a skyline-optimal repair. However, in general, the difficulty lies in reaching as high levels of completeness and as low levels of incorrectness as possible.

State of the art - ontologies
Most of the current work has focused on the correctness or the completeness of ontologies, but very little work has dealt with both. However, a naive combination of a completion step and a debugging step does not necessarily lead to repairs for the combined problem. In this section we discuss current work.

Correctness
When only dealing with repairing the inconsistency or incoherence of Tboxes (semantic defects), only wrong information is dealt with. Therefore, in Def. 1, M = ∅ and A = ∅. In most current approaches the domain expert is not included. This means that choices are made solely based on the logic and that correct axioms may be removed from the ontologies. Therefore, not all solutions may actually be repairs as defined in Def. 1 as requirement (ii) may not be satisfied.
There is much work on repairing semantic defects. Most approaches are based on finding explanations or justifications for the defects using a glass-box or blackbox approach [6]. A glass-box approach is based on the internals of the reasoning algorithm of a description logic reasoner. A black-box approach uses a description logic reasoner as an oracle to determine answers to standard description logic reasoning tasks such as checking concept satisfiability or subsumption with respect to an ontology.
A general approach for repairing incoherent ontologies is the following (adapted from [32]). (For inconsistent ontologies we can use a similar approach.) For a given set of unsatisfiable concepts for an ontology, compute the minimal explanations for the defects, i.e., the minimal reasons for the unsatisfiability of concepts. These minimal reasons for the unsatisfiability of a concept are sets of axioms and are called minimal unsatisfiability-preserving sub-TBoxes (MUPS) or justifications for the unsatisfiability. We need to compute these MUPS or justifications for all unsatisfiable concepts. From these we can compute the minimal incoherence-preserving sub-TBoxes (MIPS), which are the smallest sets of axioms in the original TBox that cause that TBox to be incoherent. To repair the incoherent TBox, we need to remove at least one axiom from each MIPS. We now define the notions in this general repairing approach formally.
The definition of MUPS is given in Def. 11. A MUPS in a consistent TBox can be seen as a justification (Def. 12) for an unsatisfiable concept. Indeed, if we instantiate ψ in Def. 12 with P ⊑ ⊥ we obtain the MUPS for P. The definition of MIPS is given in Def. 13.
Definition 11 (MUPS) [32] Let T be a TBox and P be an unsatisfiable concept in T. A set of axioms T′ ⊆ T is a minimal unsatisfiability-preserving sub-TBox (MUPS) if P is unsatisfiable in T′ and P is satisfiable in every sub-TBox T″ ⊂ T′.
Definition 12 (Justification) (similar to [33]) Let T be a consistent TBox and T ⊨ ψ. A set of axioms T′ ⊆ T is a justification for ψ in T if T′ ⊨ ψ and ∀T″ ⊂ T′ : T″ ⊭ ψ.
Definition 13 (MIPS) [32] Let T be an incoherent TBox. A TBox T′ ⊆ T is a minimal incoherence-preserving sub-TBox (MIPS) if T′ is incoherent and every sub-TBox T″ ⊂ T′ is coherent.
As mentioned, to repair the incoherent TBox, we need to remove at least one axiom from each MIPS. Essentially, this means we should compute a hitting set (Def. 14) of the set of MIPS and remove the hitting set from the TBox. In [32] these hitting sets are called pinpoints. Complexity results regarding this problem are given in [34,35,36].
Definition 14 (hitting set) ([37]) Let T be a collection of sets. A hitting set for T is a set H ⊆ ⋃_{S∈T} S such that ∀S ∈ T : H ∩ S ≠ ∅.
In general, there may be several hitting sets for the set of MIPS. Different approaches use different heuristics for ranking the possible repairs.
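To make Def. 14 concrete, the following is a minimal brute-force sketch that enumerates all subset-minimal hitting sets of a small collection of MIPS. It is an illustration only and not the algorithm of any cited system: the function name `minimal_hitting_sets` and the axiom labels `ax1` to `ax4` are hypothetical, and the exhaustive enumeration is exponential in the number of axioms.

```python
from itertools import combinations

def minimal_hitting_sets(mips):
    """Return all subset-minimal hitting sets of a collection of axiom sets."""
    mips = [frozenset(m) for m in mips]
    universe = sorted(set().union(*mips)) if mips else []
    found = []
    # Candidates are enumerated by increasing size, so a candidate that
    # contains an already found hitting set cannot be minimal and is skipped.
    for size in range(len(universe) + 1):
        for cand in combinations(universe, size):
            h = frozenset(cand)
            if any(f <= h for f in found):
                continue
            # Def. 14: H must share at least one axiom with every set.
            if all(h & m for m in mips):
                found.append(h)
    return found

# Hypothetical MIPS over four axiom labels.
mips = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax4"}]
hitting_sets = minimal_hitting_sets(mips)
for h in hitting_sets:
    print(sorted(h))
```

Removing any one of the returned sets from the TBox breaks every MIPS in this toy input; which of them to prefer is exactly where the heuristics discussed in this section come in.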
The first tableau-based algorithm for debugging of an ontology was proposed in [10,32]. (For an overview of how a tableau-based reasoner works, see, e.g., [38].) The work was motivated by the development of the DICE (Diagnoses for Intensive Care Evaluation) terminology. A glass-box approach was used for an ALC reasoner. The branches in the tableau-based reasoner were used to compute MUPS. The MIPS were computed by taking a subset-reduction of the union of all MUPSs, where the subset-reduction of a set S of TBoxes is the smallest subset of S such that for all TBoxes T in S there is a TBox T′ in the subset-reduction that is a subset of T [10]. Computing MUPS and MIPS for an unfoldable ALC TBox was shown to be in PSPACE, where an unfoldable TBox is a TBox where the left-hand sides of the GCIs are atomic concepts and the right-hand sides contain no reference (direct or indirect) to the defined atomic concept.
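The subset-reduction used to obtain MIPS from MUPS can be sketched in a few lines. This is only an illustration of the set-theoretic definition quoted above, not the implementation in [10]; the function name `subset_reduction` and the axiom labels are assumptions made for the example.

```python
def subset_reduction(tboxes):
    """Keep only those axiom sets that have no proper subset in the collection."""
    sets = {frozenset(t) for t in tboxes}
    # `o < t` is Python's proper-subset test on frozensets.
    return {t for t in sets if not any(o < t for o in sets)}

# Hypothetical union of the MUPSs of several unsatisfiable concepts.
all_mups = [{"a", "b"}, {"a", "b", "c"}, {"d"}, {"a", "b"}]
mips = subset_reduction(all_mups)
print(sorted(sorted(m) for m in mips))
```

Every input set has some member of the result as a subset, and no smaller collection has that property, which is the condition stated in the definition.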
Computing hitting sets takes linear time for the non-minimal case while the problem is NP-complete for the minimal case [32]. This approach was implemented for unfoldable ALC TBoxes in the system MUPSter [39]. The tableau algorithm in [40] can be seen as an extension of this work. It computes maximally satisfiable sub-TBoxes and does not require individual steps for computing MUPS and applying the hitting set algorithm. Also, the approach in [40] finds maximally coherent sub-TBoxes and presents an EXPTIME algorithm for unfoldable ALC TBoxes. The DION system [39] uses a bottom-up algorithm to compute MUPS. For an unsatisfiable concept P it computes two sets of axioms Σ and S such that P is satisfiable in S, but not in Σ ∪ S. Then subsets S′ of Σ are computed such that P is unsatisfiable in S ∪ S′. By removing redundancy from these sets we obtain MUPS. For efficiency reasons not all sets of axioms are checked, but the search is guided by a relevance function, e.g., by using only axioms that are in some way relevant to the unsatisfiable concept. In [6] a glass-box technique is used to compute MUPS (called set of support in [6]) for OWL ontologies (SHOIN). In [33] a method was proposed to calculate all justifications of an unsatisfiable concept. Both a glass-box and a black-box technique are presented for computing a single justification. The glass-box technique is an extension of [6], while the black-box technique is based on an expansion stage where axioms are added to an initially empty set until a concept becomes unsatisfiable and a shrinking stage where extraneous axioms are removed. Then, given an initial justification, a black-box method computes all justifications using a variation of the hitting set tree algorithm [37]. We note that computing all justifications in inconsistent ontologies is more difficult than computing all justifications in consistent ontologies (which are the ones referred to in Def. 12), a possible reason being that defects can be dealt with one at a time in consistent ontologies, but for inconsistent ontologies the only information that we have is that the ontology is inconsistent [41]. The BEACON system [42] implements an algorithm for EL+ ontologies based on a translation of the normalized TBox (i.e., the TBox is first rewritten into a specific format) into Horn clauses and computing minimal correction subsets of the clauses, which in turn refer to repairs in the normalized TBox.
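The expansion and shrinking stages of the black-box technique can be sketched as follows. This is a simplified illustration under strong assumptions, not the procedure of [33]: the reasoner is replaced by a toy `entails` function that only handles is-a axioms between atomic concepts (given as pairs), and the entailment of a subsumption stands in for the unsatisfiability of a concept. All names are hypothetical.

```python
def entails(axioms, goal):
    """Toy reasoner: do the atomic is-a axioms (sub, sup) entail goal = (sub, sup)?"""
    sub, sup = goal
    seen, stack = {sub}, [sub]
    while stack:
        c = stack.pop()
        if c == sup:
            return True
        for a, b in axioms:
            if a == c and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def single_justification(tbox, goal, entails=entails):
    """Compute one justification for goal using the reasoner only as a black box."""
    # Expansion stage: add axioms until the goal is entailed.
    current = []
    for ax in tbox:
        current.append(ax)
        if entails(current, goal):
            break
    else:
        return None  # the goal is not entailed by the whole TBox
    # Shrinking stage: drop every axiom whose removal keeps the entailment.
    for ax in list(current):
        rest = [x for x in current if x != ax]
        if entails(rest, goal):
            current = rest
    return set(current)

tbox = [("A", "B"), ("X", "Y"), ("B", "C"), ("A", "C"), ("C", "D")]
justification = single_justification(tbox, ("A", "D"))
print(sorted(justification))
```

In this sketch the expansion stage adds axioms in list order; practical systems choose what to add more carefully so that the shrinking stage has less to remove. The result depends on that order: a different order could return the other justification for the same goal.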
As there may be different ways to repair the ontologies and as computing justifications can be expensive, different heuristics and optimization approaches have been proposed (e.g., [43]). In [32] a heuristic is used stating that axioms appearing in more MIPSs are likely to be more erroneous. Therefore, axioms appearing in the most MIPSs are removed first (or, in other words, are first added to the hitting set). In [44] an arity-based heuristic is used which is similar to the heuristic in [32]. Further, [44] introduces heuristics based on the impact on the ontology when an axiom is removed and based on test cases applied by a user, e.g., by specifying desired and undesired entailments, which may be seen as an oracle that has validated certain entailments a priori. They also propose to use provenance information about the axioms as well as information about how often the elements in the axioms are used in the other axioms in the ontology to rank the axioms. Reiter's hitting set algorithm is modified to take into account the axiom rankings. The approach is implemented in a prototype for a plug-in to SWOOP. In [6] root concepts are repaired first. A root concept is an unsatisfiable concept for which a contradiction in its definition does not depend on the unsatisfiability of another concept. The other unsatisfiable concepts are then derived concepts. Repairing root concepts may automatically repair derived concepts. The idea of roots is used in [45] where the debugging is not restricted to unsatisfiable concepts, but to axioms that are unwanted. A variant of justification, called root justification, for a set of axioms U is defined as a set of axioms RJ that is a justification of at least one of the axioms in U and there is no justification of an axiom in U that is a proper subset of RJ. The authors show via experiments that the number of root justifications is usually lower than the number of justifications. 
The idea of roots is also used in the ORE system [46] that implements a sound (all found dependencies are correct), but incomplete (not all dependencies are found) algorithm. In [47] a notion of relevance between axioms is defined and used to guide the computation of justifications. Patterns explaining unsatisfiability are used in [48] to optimize finding MUPS.
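The heuristic of removing first the axioms that occur in the most MIPSs can be illustrated with a greedy sketch. It is an assumption-laden simplification rather than the procedure of [32] or [44]: it yields one hitting set that is not guaranteed to be minimal, ties are broken alphabetically only to make the output deterministic, and the name `greedy_hitting_set` is hypothetical.

```python
from collections import Counter

def greedy_hitting_set(mips):
    """Pick axioms by how many not-yet-hit MIPSs they occur in."""
    remaining = [set(m) for m in mips]
    if any(not m for m in remaining):
        raise ValueError("an empty MIPS cannot be hit")
    chosen = []
    while remaining:
        counts = Counter(ax for m in remaining for ax in m)
        # Most frequent axiom first; ties are broken by name.
        best = min(counts, key=lambda ax: (-counts[ax], ax))
        chosen.append(best)
        remaining = [m for m in remaining if best not in m]
    return chosen

mips = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax2", "ax4"}, {"ax5"}]
removal_order = greedy_hitting_set(mips)
print(removal_order)
```

Other rankings mentioned above, such as provenance or impact on the ontology, could replace the frequency count in the `key` function.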
One of the few interactive approaches in debugging of ontologies is test-driven ontology debugging. In this approach queries are generated that are classified by an oracle as true (positive) or false (negative). Repairs that do not conform to the answers are discarded. Essentially, this is a strategy to guide a domain expert through the space of possible repairs to choose the repair that eventually is executed. An important issue in this approach is how to generate the queries to the oracle (e.g., [49]). As shown in [50], there are many strategies, but none performs best for all cases. The right choice of strategy, however, is important as in some experiments the overhead for the oracle effort for the worst strategy with respect to the best strategy was over 250%. A system that implements test-driven ontology debugging is OntoDebug [51] which implements a number of earlier described methods for computing repairs, and guides the user using queries to find a final repair.
An approach based on the idea of truth maintenance systems is proposed in [52] where a set of rules is defined that are used to compute consequences from the axioms in the ontology and explanations for unsatisfiable concepts and properties. The relatively light-weight description logic that is used should be sufficient for the representation of many learned (in contrast to manually developed) ontologies.
An approach that has not been proposed earlier, but that follows naturally from the definitions and preferences of Sect. 4, is to use the oracle for the axioms in the MIPSs. For every MIPS, remove the axioms ψ such that Or(ψ) = true. If at least one of the MIPS becomes the empty set, then there is no repair unless we are willing to remove correct information. Assuming we have non-empty MIPSs after the removal of correct axioms, a hitting set would result in a repair. When redundancy is removed, we obtain a subset minimal repair. Another possibility, as we have checked the correctness using the oracle, is to use all remaining axioms in all MIPSs (as for these axioms ψ we have that Or(ψ) = false). This repair is less incorrect than the repairs obtained using hitting sets.
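The oracle-based procedure described above can be sketched as follows. The oracle is modeled as a Python function returning True for correct axioms, and the axiom labels and function names are hypothetical; the sketch only filters the MIPSs and builds the delete set that contains all remaining (wrong) axioms.

```python
def oracle_filtered_mips(mips, oracle):
    """Remove axioms the oracle considers correct from every MIPS.

    Returns None if some MIPS becomes empty, i.e., no repair exists
    unless correct information is removed.
    """
    filtered = [{ax for ax in m if not oracle(ax)} for m in mips]
    if any(not m for m in filtered):
        return None
    return filtered

def delete_all_wrong(filtered):
    """Delete set that contains every wrong axiom occurring in some MIPS."""
    return set().union(*filtered)

correct = {"ax1", "ax3"}                      # hypothetical oracle knowledge
oracle = lambda ax: ax in correct
mips = [{"ax1", "ax2"}, {"ax2", "ax3", "ax4"}]

filtered = oracle_filtered_mips(mips, oracle)
delete_set = delete_all_wrong(filtered)
print(filtered, delete_set)
```

Here a hitting set of the filtered MIPSs would be {ax2}, whereas the delete set that uses all remaining axioms is {ax2, ax4}; the latter removes more wrong information at the cost of redundancy.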
There are also approaches that map the debugging problem into a revision problem (e.g., [53,54,55]). The approach in [54] is an interactive method where questions are asked to an oracle to decide whether an axiom is correct or not, and then consequences are computed and revision states are updated iteratively. The decision on which questions to ask is based on the computation of an axiom impact measure. In [55] a MIPS approach is used in the definition of the revision operator. In [53] the authors show how ontology debugging relates to theoretical aspects in revision and show, for instance, that axiom pinpointing is related to the problem of finding kernels in revision.

Completeness
Most of the work on completing ontologies has dealt with completing the is-a structure of ontologies. An all-knowing oracle is often assumed. Therefore, in Def. 1, ∀ψ ∈ M : Or(ψ) = true, W = ∅ and D = ∅.
There is not much work on the repairing of missing is-a structure. Most approaches just add the detected missing is-a relations. This conforms to the solution where A = M. When T ∪ M is consistent and ∀p ∈ M : Or(p) = true, we are guaranteed that M is a solution. In the case that all missing is-a relations were detected in the detection phase, this is essentially all that can be done (except for removing redundancy, if so desired). If not all missing is-a relations were detected (and this is the common case), there are different ways to repair the ontology which are not all equally interesting, and we can use the earlier defined preference relations.
As these approaches do not deal with correctness, Def. 3 and 6 are not used, and ⊂ D should be removed in Def. 4. In Def. 8 and 10, P = {more complete, ⊂}. In Def. 9, 'less incorrect' should be removed. In this case, the semantically maximal solutions in [56] are a special case of the maximally complete repairs where only subsumption axioms between atomic concepts are used. Further, the X-optimal and skyline-optimal repairs combine only completeness and subset minimality.
Interactive solutions to this completion problem have been proposed for taxonomies [7,57,58], for EL TBoxes [56,58] and for ALC TBoxes [59]. All algorithms compute logically correct solutions which then need to be validated for correctness according to the domain by a domain expert. It is assumed that the axioms in M and A represent subsumption between atomic concepts in the ontologies. The algorithms for taxonomies and (normalized) EL TBoxes (unified notation in [58]) require that ∀m ∈ M : Or(m) = true, and thus M is a repair. The algorithms start with a first step that computes skyline-optimal repairs with respect to {more complete, ⊂} for each missing is-a relation. This step is different for different representation languages of the TBox. For taxonomies the algorithm tries to find ways to repair a missing is-a relation P1 ⊑ P2 by adding axioms of the form P1′ ⊑ P2′ where P1 ⊑ P1′ and P2′ ⊑ P2. For EL, additionally, is-a relations of the form ∃r.P1 ⊑ ∃r.P2 are repaired by repairing P1 ⊑ P2. For EL++ also role hierarchies and role inclusions need to be taken into account. Then the algorithms combine and modify these repairs into a single skyline-optimal repair for the whole set of missing is-a relations. Further, the algorithms repeat this process iteratively by solving new completion problems where the new M is set to the added axioms in A in the previous iteration. The union of the sets of added axioms of all iterations (with optional removal of redundancy) is the final repair. It is shown that the skyline-optimal repairs (including the final repair if redundancy is removed) found during the iterations of the new completion problems are skyline-optimal repairs for the original completion problem that are equally or more complete than the repairs found in the first iteration. Complexity results for the existence problem (does a repair exist?), relevance problem (does a repair containing a given axiom exist?) and necessity problem (do all repairs contain a given axiom?) in general and with respect to different preferences are given for EL and EL++ in [56,58]. In [59] an approach is proposed for ALC TBoxes by modifying a tableau-based reasoner. Repairs are found by closing leaf nodes in the completion graphs generated by trying to disprove missing is-a relations using the tableau reasoner. Open leaf nodes are closed by finding pairs of statements of the form x : P and x : ¬N and then asserting that P ⊑ N. Additionally, the same technique as for taxonomies is applied.
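The first step for taxonomies can be illustrated by generating the candidate axioms for one missing is-a relation P1 ⊑ P2, i.e., all pairs (P1′, P2′) with P1 ⊑ P1′ and P2′ ⊑ P2. This is a sketch under simplifying assumptions, not the cited algorithms: the taxonomy is a set of (sub, super) pairs, the relation is assumed to be genuinely missing, and the validation of each candidate by the oracle as well as the selection of skyline-optimal repairs are left out. All function and concept names are hypothetical.

```python
def reachable(edges, start):
    """Concepts reachable from start along the given directed edges (including start)."""
    seen, stack = {start}, [start]
    while stack:
        c = stack.pop()
        for a, b in edges:
            if a == c and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def candidate_axioms(taxonomy, missing):
    """Candidates P1' ⊑ P2' for a missing P1 ⊑ P2, with P1 ⊑ P1' and P2' ⊑ P2."""
    p1, p2 = missing
    supers_of_p1 = reachable(taxonomy, p1)
    reversed_edges = {(b, a) for a, b in taxonomy}
    subs_of_p2 = reachable(reversed_edges, p2)
    return {(x, y) for x in supers_of_p1 for y in subs_of_p2}

taxonomy = {("A", "B"), ("C", "D")}
candidates = candidate_axioms(taxonomy, ("A", "D"))
print(sorted(candidates))
```

Adding any one candidate makes A ⊑ D derivable. Candidates such as B ⊑ C entail more than A ⊑ D itself, which is why they can lead to more complete repairs if the oracle confirms them.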
A non-interactive solution, i.e., without validation of an oracle, that is independent of the constructors of the description logic (e.g., tested with ontologies with expressivity up to SHOIN(D)) is proposed in [60]. In contrast to the previous approaches where the repairs only contain subsumption axioms between existing concepts, this approach introduces justification patterns that can be instantiated with existing concepts or 'fresh' concepts. Further, the notion of justification pattern-based repairs is introduced, which are a kind of repairs that are subset-minimal. Methods for computing all justification patterns as well as justification-based repairs are given.

Completeness and correctness
There is very little work on dealing with both completeness and correctness. In [24,25] two versions of the RepOSE system are presented that support debugging and completing the is-a structure of ontologies (and mappings between ontologies) in an iterative and interleaving way. Wrong information is removed by calculating justifications and allowing a user to mark wrong is-a relations. Missing information is added using the interactive techniques in Sect. 5.2. As the system always warns the user of influences of new additions or deletions on previous changes, the system can guarantee a repair if such exists, but it does not always guarantee a skyline-optimal solution.

State of the art - ontology networks
Completing and debugging of ontology networks has received more and more attention. Similar to single ontologies, also for networks the quality is dependent on the availability of domain experts, and completely automatic systems may reduce the quality [28].
Our definitions in Sect. 4 and 5 can be used for ontology networks by creating a TBox from the network (i.e., it includes all axioms of all TBoxes from the ontologies and treats all mappings in all alignments in the network as axioms) and using this TBox in the definitions. It also follows that the techniques for single ontologies can be used for ontology networks. However, in much of the current research the axioms in the ontologies in the network and the axioms in the alignments are distinguished and treated differently.
The field of ontology alignment [27] deals with completeness of alignments (and thus only completion of the alignments, not of the ontologies in the networks). Many ontology alignment systems have been developed and overviews can be found in, e.g., [61,62,63,27,64,65,66] and at the ontology matching web site (http://www.ontologymatching.org). Usually ontology alignment systems take as input two source ontologies and output an alignment. Systems can contain a pre-processing component that, e.g., partitions the ontologies into mappable parts thereby reducing the search space for finding candidate mappings. Further, a matching component uses matchers that calculate similarities between the entities from the different source ontologies or mappable parts of the ontologies. They often implement strategies based on linguistic matching, structure-based strategies, constraint-based approaches, instance-based strategies, strategies that use auxiliary information or a combination of these. Each matcher utilizes knowledge from one or multiple sources. Candidate mappings are then determined by combining and filtering the results generated by one or more matchers. Common combination strategies are the weighted-sum and the maximum-based strategies. The most common filtering strategy is the threshold filtering. Many systems output the found candidate mappings as an alignment. This approach is mainly a detection approach and the actual repairing is to add the alignment into the network. However, it is well-known that to improve the quality user validation is necessary and several systems allow for user interaction in the different steps of the alignment including validation (see, e.g., overview in [66]). Some systems also allow the addition of partial results to influence the computation of new results and thus a repair can lead to a new detection phase. Some systems introduce other components such as recommendation for the settings for the components in the system. 
A system that integrates all of these is discussed in [67].
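The combination and filtering strategies mentioned above can be sketched as follows. This is a generic illustration and not the implementation of any particular alignment system: matcher results are assumed to be dictionaries from candidate mappings (pairs of entity names) to similarity values in [0, 1], a mapping not scored by a matcher counts as 0, and all matcher names, entity names, weights and the threshold are made up.

```python
def weighted_sum(matcher_scores, weights):
    """Combine matcher similarities by a normalized weighted sum."""
    total = sum(weights.values())
    pairs = set().union(*(s.keys() for s in matcher_scores.values()))
    return {
        p: sum(weights[m] * matcher_scores[m].get(p, 0.0) for m in weights) / total
        for p in pairs
    }

def maximum(matcher_scores):
    """Combine matcher similarities by taking the maximum per mapping."""
    pairs = set().union(*(s.keys() for s in matcher_scores.values()))
    return {p: max(s.get(p, 0.0) for s in matcher_scores.values()) for p in pairs}

def threshold_filter(combined, threshold):
    """Keep the candidate mappings whose combined similarity reaches the threshold."""
    return {p for p, score in combined.items() if score >= threshold}

matcher_scores = {
    "linguistic": {("a1", "b1"): 0.9, ("a2", "b2"): 0.4},
    "structural": {("a1", "b1"): 0.5, ("a2", "b3"): 0.8},
}
weights = {"linguistic": 1.0, "structural": 1.0}

by_sum = threshold_filter(weighted_sum(matcher_scores, weights), 0.6)
by_max = threshold_filter(maximum(matcher_scores), 0.6)
print(by_sum, by_max)
```

With these made-up values the maximum-based strategy keeps a mapping that the weighted sum filters out, which illustrates why the choice of combination strategy and threshold affects which candidates reach the validation step.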
Regarding correctness, most approaches deal with mapping repair where mappings rendering the network incoherent or inconsistent are removed. Usually, the axioms in the ontologies are considered more trustworthy than the mappings and thus mappings are removed, rather than axioms in the ontologies. Although detection of defects can be different for different existing systems, justification-based techniques are often used for the repairing as in [68], and the Radon [69], ALCOMO [70], LogMap [71,72] and AgreementMakerLight [73] systems. Other heuristics than the ones in Sect. 5 could be used. For instance, the conservativity principle [74] states that the integrated ontology should not induce any change in the concept hierarchies of the input ontologies. In [70,75,73] the confidence values of the mappings are taken into account and in [68] a semantic similarity measure between concepts in the mappings is used.
Similar to the case of ontologies, some approaches for ontology networks use a revision approach, e.g., [76,77,78]. Usually, the ontologies remain the same, but the set of mappings is revised.
An approach that distinguishes between axioms in the ontologies and in the alignments, but gives equal priority to them using approaches in Sect. 5 is discussed in [24,25].

Opportunities
In this paper we have defined a framework for completing and debugging ontologies and shown the state of the art in the field. It is clear that many research opportunities still exist.

Theory and algorithms
There are still challenges regarding the development of algorithms. Many approaches have been proposed regarding correctness, but finding (preferred) repairs in an acceptable time is still an issue. The current heuristics and optimization are almost all related to logical properties. However, this does not fit non-semantic defects. Furthermore, for semantic defects solutions may be proposed that remove correct statements while there could exist repairs that only remove wrong statements. In this case involving a domain expert in the generation and validation of repairs seems to be a way forward. There is relatively little work on dealing with completeness. There is still a need for new approaches and interactive systems. The current system that allows for user interaction deals with light-weight ontologies, while the work that allows for higher expressivity is non-interactive. Even less work deals with completeness and correctness. We need work on algorithms guaranteeing different kinds of preferred repairs. The current system dealing with both completion and debugging does help a user to find a repair, but it cannot even guarantee to find skyline-optimal repairs.
From the theoretical point of view, there is quite some work on complexity results for debugging and some for completion. However, there is a need for results regarding completion and the combination of debugging and completion as well as results for all cases regarding preferred repairs. In addition to results for finding repairs, there are also the questions related to the checking of the existence of repairs, the relevance of an axiom (is there a repair containing a given axiom?) and the necessity of an axiom (do all repairs add/delete a given axiom?).
The current formalization of repair uses an oracle that replies true or false for an axiom. This means that we assume that an oracle always answers, although the answer may be correct or wrong. However, it is also possible that the oracle does not know an answer. In this case we may want to extend the formalization to allow the oracle to answer unknown. A prudent approach may not allow unknown axioms in the add set of a repair, but they could be allowed in the delete set. A credulous approach may allow unknown axioms in the add set, but not in the delete set. Further, preferences may be defined related to the use of unknown axioms.
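The prudent and credulous treatment of unknown answers can be made concrete with a small sketch. It only checks which add and delete sets would be admissible under each policy and ignores all other requirements on repairs; the enum `Answer`, the function `admissible`, the policy names as strings and the example axioms are hypothetical.

```python
from enum import Enum

class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"

def admissible(add_set, delete_set, oracle, policy):
    """Check the oracle-related conditions on a candidate repair.

    prudent:   unknown axioms may be deleted but not added.
    credulous: unknown axioms may be added but not deleted.
    """
    if policy == "prudent":
        add_ok, del_ok = {Answer.TRUE}, {Answer.FALSE, Answer.UNKNOWN}
    elif policy == "credulous":
        add_ok, del_ok = {Answer.TRUE, Answer.UNKNOWN}, {Answer.FALSE}
    else:
        raise ValueError(f"unknown policy: {policy}")
    return (all(oracle(ax) in add_ok for ax in add_set)
            and all(oracle(ax) in del_ok for ax in delete_set))

# Hypothetical oracle: anything not listed is answered with UNKNOWN.
answers = {"A ⊑ B": Answer.TRUE, "C ⊑ D": Answer.FALSE}
oracle = lambda ax: answers.get(ax, Answer.UNKNOWN)

prudent_add_unknown = admissible({"E ⊑ F"}, set(), oracle, "prudent")
credulous_add_unknown = admissible({"E ⊑ F"}, set(), oracle, "credulous")
prudent_del_unknown = admissible(set(), {"E ⊑ F"}, oracle, "prudent")
credulous_del_unknown = admissible(set(), {"E ⊑ F"}, oracle, "credulous")
print(prudent_add_unknown, credulous_add_unknown,
      prudent_del_unknown, credulous_del_unknown)
```

A preference relation on the use of unknown axioms could then, for example, compare repairs by the number of unknown axioms in their add or delete sets.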
In some cases, for instance, when there is a consensus about some concepts and their relations to each other, we may want to state that certain parts of the ontology are correct and should not be changed. This would then restrict repairing of defects to not include these parts. This can be handled by extending the formalization of the complete-debug problem by using the notion of background knowledge which represents information about the relevance or importance of parts of the ontology [30]. In [49] background knowledge is used to represent parts of the ontology that are asserted to be correct and therefore should not be changed in debugging. This is an example of the requirement called exploitation of context in [30].
In this survey a repair consists of a set of axioms that are added and a set of axioms that are deleted from the ontology (Def. 1). However, by removing axioms sometimes correct inferences are lost. Therefore, instead of removing complete axioms we may want to weaken an axiom or rewrite an axiom such that only the parts that caused a defect are removed. Ways to formalize what a part of an axiom is in this sense, together with debugging algorithms, are presented in [79,80]. A method that rewrites the axioms in the ontology into simpler axioms and debugs these is shown in [44]. In [81] parts of axioms responsible for unsatisfiability of concepts are traced. Further, lost inferences are calculated for atomic concepts. Possible axiom changes can be tagged as harmful when they do not solve unsatisfiability or introduce new unsatisfiability. The authors also introduce the notion of helpful changes where part of an axiom is removed, but other axioms are added to make up for lost inferences. An approach for dealing with defects in incoherent or inconsistent ontologies by changing axioms is axiom weakening where an axiom is replaced by another axiom that has fewer consequences subset-wise. Different ways to compute such weaker axioms are presented in [82,83]. While the current work deals with debugging, there may be cases for completion as well where we essentially would like to modify existing axioms. In principle, all these cases are already covered by the current framework as a change in an axiom can be represented by the removal of the original axiom together with the addition of the changed axiom. However, this would require solutions for the full problem in Def. 1 (which, as we have seen, are not available yet) and we may want to introduce new preference relations based on additions and deletions that reflect axiom changes.
In this survey we have focused on ontologies that do not contain instances. When instances are available these could be used in detection or repairing [84,46,85,86]. For instance, using instances is one of the main ways to detect inconsistent ontologies.

User support
From a practical point of view it is clear that to obtain the best results of the completion and debugging, domain experts need to be involved. However, the support of the current tools regarding user involvement is still lacking. For instance, one of the outcomes in a study with users on ontology authoring is that debugging is difficult [87]. Although ontology development systems may have explanation facilities, debugging is still cumbersome and ontology developers use their own strategies, such as running a reasoner frequently when adding new axioms to the ontology to detect possible defects. The authors also state that, although good work has been done in ontology debugging, not that much has been integrated in ontology development tools. According to the study SWOOP had good debugging support, Protégé had explanation facilities, WebProtégé and the free edition of TopBraid did not have such support.
There are thus several opportunities for further work on user involvement in completion and debugging systems. One way forward is to consult the guidelines for ontology alignment systems that are also valid for the more general completion and debugging systems. Recommendations for user support for ontology alignment systems regarding user interfaces (partly based on [88]) as well as infrastructure and algorithms are given in [89]. The former include support for manipulation, inspection and explanation of mappings, while the latter include, among others, support for sessions, reduction of user interventions, collaboration, recommendations by the system, system configuration, debugging, trial execution and temporary decisions. In [90] the authors focus more specifically on user validation in ontology alignment and discuss issues related to the profile of the user, the services of the alignment system, and its user interface. Recommendations for tool support for each aspect are given and the support in current systems is presented. For the user profile the authors discuss user expertise in terms of domain, technology and alignment systems. For services they discuss stage of involvement, feedback demand and feedback propagation. Finally, for the user interface issues regarding visualization and interaction are discussed. The paper reports that while there have been significant advances on the part of alignment systems in these areas, there are still key challenges to overcome such as reducing user workload, balancing informativeness with cognitive load and balancing user workload with user errors. More generally, the field of visualization and interaction for ontologies and Linked Data has received more and more attention during the recent years and in [91] the issues of cognitive support, user profiles and visual exploratory analysis are briefly discussed.
There are also some specific problems that have been noted in the debugging and completion area.
An important issue is that domain experts make mistakes and thus the oracle makes mistakes (see third case in Sect. 4.1.2). This has been reported often, and for instance, the study in [92] states that questions to the oracle about statements that are true receive more reliable answers, that domain experts are sometimes overconfident, and that they consider themselves imperfect. Some research has discussed approaches to deal with this issue. For instance, in [93] an approach (for ontologies with instances) is presented where the domain expert can request the history of the given answers, correct wrongly given answers and continue, while in [92] a prediction model is developed for predicting oracle errors. The impact of oracle errors on the effectiveness and efficiency of ontology alignment systems is shown in [90] as assessed in the Interactive Anatomy track of the Ontology Alignment Evaluation Initiative 2015-2018.
Another issue that has been mentioned is that some algorithms work on normalized versions of a TBox and therefore the results may not be that intuitive in terms of the original ontology. This requirement is called preservation of structure in [30].
Further, in [93] it was reported that domain experts would want to be able to sometimes postpone their answers as an oracle, e.g., to have the time to check up some information or reflect more deeply.
More generally, completion and debugging should be integrated in every ontology development methodology, such that developers can detect and repair defects as soon as possible. One of the few that has different steps regarding completion and debugging within the general framework is an extension of the eXtreme Design Methodology [94].

Conclusion
As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important, although difficult, task in ontology engineering. A key step for guaranteeing a certain level of correctness and completeness is ontology debugging and completion.
In this survey paper we have reviewed the state of the art in ontology debugging and completion, focusing on the repairing step. We have done this by introducing a formalization for the completion and debugging problem, which allowed us to review and discuss the state of the art in this field in a uniform way. Using this formalization we have shown that, traditionally, debugging and completion have been tackled separately, compared different approaches, gained insights into the field, and pointed to new opportunities for further research to advance the field.

Note that for an oracle that does not make mistakes, if Or(P ⊑ Q) = true, then also Or(∃r.P ⊑ ∃r.Q) = true, and if Or(P ⊑ Q) = true, then also Or(P ⊓ O ⊑ Q) = true. For other axioms P ⊑ Q with P, Q ∈ C, Or(P ⊑ Q) = false.

... GranulomaProcess } and this is not available anymore after the repair as PathologicalProcess ⊑ InflammationProcess is removed. The newly added axioms do not give rise to new justifications for PathologicalProcess ⊑ GranulomaProcess.

Thus, R ax = R2 ax = R3 ax ⊂ R1 ax ⊂ R7 ax ⊂ R8 ax and R ax = R2 ax = R3 ax ⊂ R4 ax ⊂ R5 ax ⊂ R6 ax ⊂ R8 ax. From this we conclude that R8 is more complete than R7, which is more complete than R1, which is more complete than R2 and R3. Further, R8 is more complete than R6, which is more complete than R5, which is more complete than R4, which is more complete than R2 and R3.

Subset
For the add and delete sets of the repairs R1-R8 we obtain the following: A1 = A2 = A3, A7 ⊂ A8, D1 ⊂ D3 = D4 = D5 = D6 = D7 = D8, and D2 ⊂ D3 = D4 = D5 = D6 = D7 = D8. Therefore, R1 ⊂ R3, R2 ⊂ R3 and R7 ⊂ R8. R3 deletes more wrong information than R1 and R2, respectively, but for repairing this is redundant (although R3 is less incorrect than R1 and R2). R8 adds additional correct axioms compared to R7, but this is redundant for the repairing (although R8 is more complete than R7).
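The subset comparison can be checked mechanically. The following is a minimal sketch in Python, assuming that a repair is represented by its add set and its delete set, and that R ⊂ R' holds when both sets of R are contained in those of R' and at least one containment is strict. The axiom names (a1, b1, b2, w1, w2) are hypothetical placeholders chosen only to reproduce the stated relations; they are not the axioms of the running example.

```python
from typing import FrozenSet, NamedTuple


class Repair(NamedTuple):
    add: FrozenSet[str]     # axioms added by the repair
    delete: FrozenSet[str]  # axioms deleted by the repair


def strictly_smaller(r: Repair, s: Repair) -> bool:
    """Assumed reading of r ⊂ s: componentwise containment, strict in at least one set."""
    contained = r.add <= s.add and r.delete <= s.delete
    strict = r.add < s.add or r.delete < s.delete
    return contained and strict


# Hypothetical placeholders: w1 and w2 stand for the two wrong axioms.
R1 = Repair(frozenset({"a1"}), frozenset({"w1"}))
R2 = Repair(frozenset({"a1"}), frozenset({"w2"}))
R3 = Repair(frozenset({"a1"}), frozenset({"w1", "w2"}))
R7 = Repair(frozenset({"b1"}), frozenset({"w1", "w2"}))
R8 = Repair(frozenset({"b1", "b2"}), frozenset({"w1", "w2"}))

if __name__ == "__main__":
    for name, (r, s) in {"R1 ⊂ R3": (R1, R3), "R2 ⊂ R3": (R2, R3),
                         "R7 ⊂ R8": (R7, R8), "R1 ⊂ R2": (R1, R2)}.items():
        print(name, strictly_smaller(r, s))
```

Under these placeholder sets, R1 and R2 are incomparable because their delete sets differ without one containing the other, which is why neither is preferred over the other with respect to ⊂.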

Preferred repairs
R8 is maximally complete as the ontology repaired by R8 contains all correct information.
R3, R4, R5, R6, R7 and R8 are minimally incorrect as the ontologies repaired by any of these repairs do not contain incorrect axioms.
R1 and R2 are subset minimal: if we remove an axiom from the add or delete set, the result is no longer a repair. None of the other repairs is subset minimal, as there are always variants in which one of the axioms in the delete set is removed and the result is still a repair.
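The one-step argument used for subset minimality (removing any single axiom from the add or delete set must break the repair) can be sketched as follows. This is only an illustration: `toy_is_repair` is a hypothetical stand-in for a reasoner-backed repair check, and the axiom names a1, w1 and w2 are placeholders rather than axioms from the example ontology.

```python
from typing import Callable, FrozenSet

AxiomSet = FrozenSet[str]
RepairCheck = Callable[[AxiomSet, AxiomSet], bool]


def one_step_subset_minimal(add: AxiomSet, delete: AxiomSet,
                            is_repair: RepairCheck) -> bool:
    """True if (add, delete) is a repair and dropping any single axiom breaks it."""
    if not is_repair(add, delete):
        return False
    # All variants obtained by removing exactly one axiom from one of the sets.
    variants = [(add - {ax}, delete) for ax in add]
    variants += [(add, delete - {ax}) for ax in delete]
    return not any(is_repair(a, d) for a, d in variants)


def toy_is_repair(add: AxiomSet, delete: AxiomSet) -> bool:
    # Hypothetical condition: the defect is fixed iff a1 is added
    # and at least one of the wrong axioms w1, w2 is deleted.
    return "a1" in add and bool(delete & {"w1", "w2"})


if __name__ == "__main__":
    print(one_step_subset_minimal(frozenset({"a1"}), frozenset({"w1"}), toy_is_repair))
    print(one_step_subset_minimal(frozenset({"a1"}), frozenset({"w1", "w2"}), toy_is_repair))
```

In this toy setting a repair that deletes both wrong axioms is not subset minimal, since deleting only one of them already yields a repair.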

Combined preferences
According to the definition, only preferred repairs with respect to a preference X can be X-optimal.
Regarding more complete-optimal, the only candidate among our example repairs is R8. R8 is more complete-optimal with respect to { less incorrect }, as the ontology repaired by R8 does not contain wrong information. It is not more complete-optimal with respect to { ⊂ }, as there are repairs that are also preferred with respect to more complete but that remove one fewer wrong axiom. However, R8 is more complete-optimal with respect to { less incorrect, ⊂ }. If there were a preferred repair with respect to more complete that dominates R8 with respect to { less incorrect, ⊂ }, then it would have to be preferred over R8 with respect to ⊂, as it cannot be preferred over R8 with respect to less incorrect. However, removing an added axiom from R8 would make the repair not preferred with respect to more complete, and deleting fewer axioms would make the repair less preferred than R8 with respect to less incorrect.
Regarding less incorrect-optimal, the candidates among our repairs are R3, R4, R5, R6, R7 and R8. R8 is less incorrect-optimal with respect to { more complete }. Similar reasoning as above shows that R8 is less incorrect-optimal with respect to { more complete, ⊂ }. As R8 dominates R3, R4, R5, R6 and R7 with respect to more complete, R3, R4, R5, R6 and R7 cannot be less incorrect-optimal with respect to { more complete }. R7 dominates R8 with respect to { ⊂ }, so R8 cannot be less incorrect-optimal with respect to { ⊂ }. A repair that is preferred with respect to less incorrect and dominates R3, R4, R5, R6 or R7 with respect to { ⊂ } would need to remove the two wrong axioms (to be preferred with respect to less incorrect) and would therefore need to add fewer (subset-wise) axioms. However, removing one of the added axioms in R3, R4, R5, R6 and R7 would lead to sets of axioms that are not a repair. Therefore, R3, R4, R5, R6 and R7 are less incorrect-optimal with respect to { ⊂ }.
Regarding ⊂-optimal, the candidates are R1 and R2. R1 is more complete than R2, so R2 cannot be ⊂-optimal with respect to { more complete }. R1 is also not ⊂-optimal with respect to { more complete }, as there is another subset minimal solution that dominates R1 with respect to { more complete } (e.g., the same delete set as R1, but with Carditis ⊑ CardioVascularDisease in the add set instead of Endocarditis ⊑ PathologicalPhenomenon). For similar reasons R1 and R2 are not ⊂-optimal with respect to { more complete, less incorrect }. They are, however, ⊂-optimal with respect to { less incorrect }. A less incorrect repair than R1 or R2 would need to remove both wrong axioms, but would then not be subset minimal.
We note that all preferred repairs are skyline-optimal. Further, if a repair is X-optimal with respect to P, then it is skyline-optimal with respect to P ∪ { X }. Thus, R8 is skyline-optimal with respect to { more complete, less incorrect } and { more complete, less incorrect, ⊂ }. R1, R2, R3, R4, R5, R6 and R7 are skyline-optimal with respect to { less incorrect, ⊂ }.
In addition, there are skyline-optimal repairs that are not X-optimal. For instance, R1 is neither more complete-optimal with respect to { ⊂ } nor ⊂-optimal with respect to { more complete }. However, R1 is skyline-optimal with respect to { more complete, ⊂ }. If there were a repair that dominates R1 with respect to { more complete, ⊂ }, then there would be two possibilities. The first possibility is that the other repair is more preferred with respect to ⊂, which would mean taking away an axiom from the add set or the delete set, but then we would not have a repair. The second possibility is that the other repair is more preferred with respect to more complete and equally preferred with respect to ⊂. The latter condition would only be satisfied if the add and delete sets are the same, but then we have the same repair.
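The dominance arguments above can be illustrated with a small sketch in Python. It assumes that one repair dominates another with respect to a set of preferences when it is at least as preferred for every preference in the set and strictly more preferred for at least one, and it computes the skyline only relative to a finite list of candidate repairs. The three repairs, their names, and the axiom labels (c1, c2, w1, w2, "+a1", "-w1", ...) are hypothetical and do not correspond to R1-R8.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence


@dataclass(frozen=True)
class Repair:
    name: str
    correct: FrozenSet[str]  # correct axioms derivable after the repair (assumed given)
    wrong: FrozenSet[str]    # wrong axioms still present after the repair (assumed given)
    changes: FrozenSet[str]  # tagged changes: "+ax" for added, "-ax" for deleted axioms


# Each preference answers: is r at least as preferred as s?
Preference = Callable[[Repair, Repair], bool]


def more_complete(r: Repair, s: Repair) -> bool:
    return r.correct >= s.correct


def less_incorrect(r: Repair, s: Repair) -> bool:
    return r.wrong <= s.wrong


def subset(r: Repair, s: Repair) -> bool:
    return r.changes <= s.changes


def dominates(r: Repair, s: Repair, prefs: Sequence[Preference]) -> bool:
    at_least = all(p(r, s) for p in prefs)
    strictly = any(p(r, s) and not p(s, r) for p in prefs)
    return at_least and strictly


def skyline(repairs: Sequence[Repair], prefs: Sequence[Preference]) -> List[str]:
    """Names of the candidates that no other candidate dominates."""
    return [r.name for r in repairs
            if not any(dominates(s, r, prefs) for s in repairs if s is not r)]


# Hypothetical candidates: 'small' leaves one wrong axiom, 'clean' removes both,
# 'full' removes both and adds one more correct axiom.
small = Repair("small", frozenset({"c1"}), frozenset({"w2"}), frozenset({"+a1", "-w1"}))
clean = Repair("clean", frozenset({"c1"}), frozenset(), frozenset({"+a1", "-w1", "-w2"}))
full = Repair("full", frozenset({"c1", "c2"}), frozenset(),
              frozenset({"+a1", "+a2", "-w1", "-w2"}))
CANDIDATES = [small, clean, full]

if __name__ == "__main__":
    print(skyline(CANDIDATES, [more_complete, less_incorrect]))
    print(skyline(CANDIDATES, [less_incorrect, subset]))
    print(skyline(CANDIDATES, [more_complete, less_incorrect, subset]))
```

In this toy setting the most complete candidate is the only skyline element for { more complete, less incorrect }, drops out for { less incorrect, ⊂ } because a smaller equally correct repair dominates it, and is again undominated once all three preferences are combined. This mirrors the pattern described for R8.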