A Human Information Processing Theory of the Interpretation of Visualizations: Demonstrating Its Utility

Providing an approach to model the memory structures that humans build as they use visualizations could be useful for researchers, designers and educators in the field of information visualization. Cheng and colleagues formulated Representational Interpretive Structure Theory (RIST) for that purpose. RIST adopts a human information processing perspective in order to address the immediate, short-timescale cognitive load likely to be experienced by visualization users. RIST is operationalized in a graphical modeling notation and a browser-based editor. This paper demonstrates the utility of RIST by showing that (a) RIST models are compatible with established empirical and computational cognitive findings about differences in human performance on alternative representations; (b) they can encompass existing explanations from the literature; and (c) they provide new explanations about causes of those performance differences.


INTRODUCTION
It is well acknowledged that understanding users' mental models is important for interface design (e.g., [4], [38], [15], [3], [31]). This is also surely true for information visualization. However, studies of users' interpretation of visualizations, and graphical representations more generally, and methods to study such interpretations have been limited compared to other fields. In the cognitive science of representations, investigations of mental representations of diagrams and notations have featured prominently; see [17] for examples and [20] for a review. To motivate the potential importance of studying the nature of interpretation to information visualization, consider the preeminent field that is concerned with understanding how particular representations work, specifically linguistics [14,21,43]. Studies in linguistics address how the syntax, semantics and pragmatic components of text and talk impact the elementary communication of information but also how they influence, often reciprocally, sophisticated functions such as narrative, argumentation, metaphor, and style. Explanations grounded in these components and functions have led linguists to insights about how the use of natural language can both succeed and fail.
Although interpretation has not been a distinct topic in information visualization, much work on visualization and representation design has been conducted. For catalogues of visualizations see [19,22,40,41,42]. Many sets of guidelines for visualization design have been proposed [2,5,9,11,23,24,41,44]. Cognitive, human information processing accounts of the nature of visualizations have been proposed, including [20,25,26,28,35,45,46]. These have included computational cognitive models. Larkin and Simon's [26] models explained how diagrams can confer search and recognition benefits in problem solving (see Section 4). Peebles and Cheng [29] built models that simulated the different eye movement patterns that underpin the strategic differences that arise from alternative line graphs and produce significant variations in performance. With regard to interpretation, some accounts have focused specifically on the interpretation of graphs and charts [6,16,30,36]. The latter is notable as it draws inspiration from cognitive linguistic ideas about comprehension [21]. However, research on users' interpretations of visualizations is yet to mature.
This paper focuses on the theory of interpretation proposed by Cheng and colleagues [10,13,37]: Representational Interpretive Structure Theory, RIST. The theory is a cognitive account, from a human information processing perspective, of the mental structures that users construct as they interact with visualizations. The theory is accompanied by a graphical notation, RIS-Notation (RISN) [10,13], in which specific interpretations of a representation used for a particular task can be modeled. The browser-based tool RIS-Editor, RISE, has also been constructed for the easy construction of models in RISN that conform to the theoretical constraints of RIST [13,37]. Diverse models of visualizations and other notations have been built, including bar charts, simple and complex line graphs, algebraic formulas, and multi-level pie charts. All this suggests that the approach has some general utility.
The aim of this paper is to rigorously demonstrate the utility of the approach by showing how designers and researchers can obtain predictions of the likely cognitive demands of informationally equivalent [26] alternative representations. As the purpose of RISN is to model the memory structure of an instance of a user's visualization, that is, an interpretation, it does not provide direct absolute measures of task performance, such as error rates or completion times. However, the comparison of the complexity of the RISN models for competing representations does provide a systematic basis for judging the relative cognitive processing demands of the representations. Thus, our approach here focuses on comparing RIST models with prior empirical and computational studies that use alternative visualizations, or representations, on the same task. Such studies show that substantial differences in user performance can be produced by alternative representations.
We pose and answer three questions to demonstrate the utility of RIST.
(1) Does the comparison of the likely information processing impact of the alternative interpretative structures in RIST for each representation match the performance differences in the results of the empirical tests?
(2) How well do RIST's explanations of the relative efficacy of the alternative visualizations conform to the original explanations provided by the authors of those studies?
(3) Does RIST provide additional insights into the reasons for the difference in performance between the alternative representations?
Positive answers to the questions will provide some evidence for the utility of RIST.
We will conduct four such case studies to evaluate RIST. These are presented in Sections 3, 4 and 5 (two in detail and two in outline). First, RIST, RISN and RISE are introduced in the next section.

REPRESENTATIONAL INTERPRETIVE STRUCTURE THEORY, NOTATION AND EDITOR
Representational Interpretive Structure Theory, RIST, was originally developed by Cheng [10] for the study of static representations, such as diagrams and notations, including information visualizations. Cheng and colleagues [12,37] provide a two-part operationalization of RIST: (a) RIS-Notation is a graphical language for RIST; (b) RIS-Editor is a web-browser editing tool for building models in RISN. A key goal in the development of RISN and RISE was to directly embody the core cognitive claims of RIST, so that models of representations produced by researchers, designers and educators would be cognitively plausible. Thus, the expectation is that analysts will be able to gauge the cognitive properties [11] of an interpretation of a representation by examining the conciseness and coherence of its models.

RIST and RISN
This section introduces the RIST approach by exemplifying an interpretation of a display of date and time as found on some digital devices. Figure 1A shows the display (black symbols), with blue annotations of the graphic objects that will be referenced by the RISN model. Figure 1B is a RISN model of that and similar displays. RIST defines the nature of users' mental memory structures that constitute their interpretation of representations. Compared to previous accounts of the nature of representation use, RIST gives equal status to (i) the objects in the external display that graphically represent concepts and (ii) the internal memory structures that mentally encode concepts. Previous accounts have tended to emphasize one or the other. For example, [19], [44], and chapter 1 of [5] focus on how to structure external visualization to encode the given data. [28], [24] and [30] primarily focus upon the nature of the internal memory structures. Although Zhang & Norman [45,46] take a distributed cognition stance on representations, they nevertheless conceptualize concepts hierarchically with connections to external graphic objects (symbols) as leaves. In contrast, RIST claims that the process of interpreting a representation creates connections between the internal mental representations and the graphical contents of the external display at any level of abstraction and generality relevant to the target domain. The user does not merely associate atomic symbols in the external display with elementary domain concepts. Higher-order concepts are not just conjunctions of elements. Rather, RIST asserts that direct associations exist between higher-order concepts and configurations of graphical objects in the external display. Thus, the interpretive structure of external displays cannot simply be understood as conjunctions of atomic graphical symbols but must be modeled as rich multi-level structures that are mapped to higher-order concepts. For example, the annotations of graphic objects of the date and time display in Figure 1A include elementary symbols (blue annotations: MN, DD, HH, MT, "|") and higher-order structures (DAT and TIM are obvious, and SCL denotes a scale of times). RIST hypothesizes four core theoretical notions to encapsulate the core claim that graphic objects and memory structures are tightly associated at multiple levels of granularity.
(1) A set of elementary (atomic) memory components, schemas, encode the information associated with an interpretation. Schema theory is well-established in cognitive science [1,27,33,34,39]. In modern schema theory, a schema (or frame) is a structure that encodes a general concept (context) by providing slots (variables) for the properties of the concept. Fillers are property values held in the slots. A particular combination of fillers defines a specific instance of the general concept. RIST encodes the tight association between mental concepts and graphic objects by providing slots in all of RIST's schemas that (i) store information about the domain concepts being represented, and (ii) hold codes (annotations) for the graphic object that represents the concept. In RIST a graphic object is any graphic entity, including basic visuospatial properties and relations, icons and glyphs, but also complex configurations of such things.
(2) RIST posits four core types of schemas: Representation, R-scheme, R-dimension and R-symbol. RISN is the graphical language for building models in RIST [10,13]. To introduce the schemas and RISN, Figure 1B is a RISN model for the date and time display shown in Figure 1A. Each of the shapes represents a particular type of schema, with (i) names of the domain concepts at the top (e.g., Timescale in the schema labelled R-dimension) and (ii) pointers to the graphic objects in the display at the bottom (e.g., SCL in the same schema). The Representation schema (capsule shape) defines a domain of interest, Date-time display, and identifies examples of the display. R-symbols (rounded rectangles) encode fixed value concepts and the graphic objects representing them, such as the current time, Now, or the turn of a New year. Class R-symbols (dashed rounded rectangles) are used to represent multiple R-symbols that are not explicitly enumerated in a model. For example, the Others R-symbol acknowledges the existence of other dates and times and that they are represented by alphanumeric strings. R-dimensions (trapeziums) encode concepts that are variables or classes plus the graphic structures that allow values of the concept to be depicted. Our example includes R-dimensions for Day, Month, Hour and Minute, plus three others. The graphic objects corresponding to the concepts are identified by letter codes from the blue annotations in Figure 1A, respectively, DD, MN, HH and MT. R-dimensions capture the quantity scales for both the concept and the graphic object, which are on the right of the R-dimension icons and denoted by letters for Nominal (N), Ordinal (O), Interval (I) and Ratio (R) scales. RISN makes the nature of the quantities explicit because the types of quantities determine the forms of reasoning that are permissible with the concepts and graphic objects of the domain. Month and Day are sub-R-dimensions of the superordinate Date R-dimension, which captures the notion that Month and Day are quantities of the same type relating to dates.
The fourth and final class of schemas are R-schemes (rectangles). They are meaningful conceptual and graphical structures that are built from the other components of RIST. Figure 1 has two R-schemes. The Time R-scheme defines time as a conjunction of the Hour and Minute R-dimensions, with the colon (":") R-symbol differentiating the two parts. Date is an R-dimension because it only has sub-R-dimensions as components. Time is an R-scheme because it mixes R-dimensions with an R-symbol. The Date and time R-scheme encodes the overarching structure of the display. It comprises a Divider ("|") R-symbol, the Date R-dimension, the Time R-scheme and a Timescale R-dimension. The Timescale R-dimension encodes the four orders of magnitude of times and dates that the horizontal ordering in this particular display preserves (Figure 1A, SCL arrow; Figure 1B, the numbering of the connectors). This coherent alignment of magnitudes with spatial organization is a feature of this design. The Timescale R-dimension would be omitted from RISN models for other designs.
(3) An interpretation of a representation is a network of these schemas, which are linked by two types of connectors: hierarchy and anchor. Hierarchy connectors associate two schemas with a parent-child or whole-part relation, where some notion of the higher concept is inherited by the lower. In the bottom row in Figure 1B, the Now, New Year and Others R-symbols are all combinations of values contributed by the Month, Day, Hour and Minute R-dimensions above. In turn, those four R-dimensions inherit concepts from the Date R-dimension or the Time R-scheme above. And recursively those schemas inherit notions about the overall location in the display from the Date and time R-scheme. No anchor connectors are present in the model. Anchor connectors capture associations between schemas in which the child schema establishes a context for new concepts that are not subsumed by the parent schema (examples will be seen below).
Certain combinations of connections between the types of schemas are valid according to RIST. Some examples include: the leaves of a network must be R-symbols or Representation schemas, because they are, or may be treated as, tokens; transitive subnetworks should be avoided, because they violate the assumption of hierarchical structure; an R-dimension can only be a child of an R-symbol if it is an anchor connection, which is a non-hierarchy connection that introduces a new interpretive sub-context.
(4) The last idea is Idioms [37]. Idioms are certain substructures of RISN models that are common across different domains and representations. The idioms serve specific representational functions, such as: filtering or specializing concepts through a sequence of R-dimensions; indexing by combining concepts in a combinatoric manner by selecting values from multiple R-dimensions. For example, Figure 1B has two implicit coordinate system idioms in which an R-scheme has two or more R-dimensions that form a space in which unique combinations of values are defined (e.g., a Date is a Month plus a Day; a Time is an Hour and a Minute). Idioms provide a useful intermediate level of detail between whole representations and the atomic schemas that allow comparisons to be made between models even when they encode different conceptual and graphical content.
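To make the four schema types, the two connector types and two of the connection rules concrete, the following is a minimal Python sketch. It is not part of RISN or RISE: the class names `Kind`, `Link` and `Schema`, the function `violations`, and the graphic code given to the Now R-symbol are hypothetical choices made for illustration. The fragment modeled is the Time R-scheme of Figure 1B as described in the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    REPRESENTATION = "Representation"
    R_SCHEME = "R-scheme"
    R_DIMENSION = "R-dimension"
    R_SYMBOL = "R-symbol"


class Link(Enum):
    HIERARCHY = "hierarchy"
    ANCHOR = "anchor"


@dataclass
class Schema:
    kind: Kind
    concept: str   # slot (i): name of the domain concept
    graphic: str   # slot (ii): code of the graphic object
    children: list = field(default_factory=list)  # (Link, Schema) pairs

    def add(self, child, link=Link.HIERARCHY):
        self.children.append((link, child))
        return child


def violations(root):
    """Collect breaches of two connection rules quoted in the text."""
    problems, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:  # a schema may have several parents
            continue
        seen.add(id(node))
        # Rule: leaves must be R-symbols or Representation schemas.
        if not node.children and node.kind not in (Kind.R_SYMBOL, Kind.REPRESENTATION):
            problems.append(f"leaf '{node.concept}' is a {node.kind.value}")
        for link, child in node.children:
            # Rule: an R-dimension under an R-symbol needs an anchor connector.
            if (node.kind is Kind.R_SYMBOL and child.kind is Kind.R_DIMENSION
                    and link is not Link.ANCHOR):
                problems.append(f"'{child.concept}' under '{node.concept}' needs an anchor")
            stack.append(child)
    return problems


# Fragment of Figure 1B: Time is an R-scheme mixing two R-dimensions and an R-symbol.
time = Schema(Kind.R_SCHEME, "Time", "TIM")
hour = time.add(Schema(Kind.R_DIMENSION, "Hour", "HH"))
minute = time.add(Schema(Kind.R_DIMENSION, "Minute", "MT"))
time.add(Schema(Kind.R_SYMBOL, "Colon", ":"))
now = Schema(Kind.R_SYMBOL, "Now", "HH:MT")  # graphic code is a placeholder
hour.add(now)
minute.add(now)
```

Calling `violations(time)` on this fragment returns an empty list, whereas attaching an R-dimension beneath an R-symbol with the default hierarchy connector is reported. The rule about transitive subnetworks is left out of the sketch.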
It should be noted that each RISN model is a single interpretation relative to the presumed context. Alternative interpretations can arise for multiple reasons [37], including: users have different levels of understanding of the target domain; users vary in their degrees of familiarity with, or visual literacy of, the target visualization; interpretations are a function of the goals of the task being tackled with the given visualization. When building the RISN model in Figure 1B we assumed a user with a full and correct interpretation of the display. A model for someone who erroneously thinks that "06:10" is six minutes past ten would have the minute in the HH graphic object schema, the hour in the MT schema, and the locations of those schemas swapped. The right-hand side of the model would be absent for a user who is merely looking up the date and not attending to the time. The methodological implications of this for the comparison of models for alternative visualizations are considered in Section 3.

RISE
Cheng & colleagues [37] have provided the RISE web browser-based editor for building models of interpretations in RISN that conform to the theoretical constraints of RIST (http://users.sussex.ac.uk/∼peterch/RIST/RISE/). RISE has functionality that aims to lessen the GUI design cognitive hiatus by monitoring the syntactic correctness of the model being built. For example, RISE gives alerts when the rules for connecting types of schemas are violated. RISE also has routines that continuously monitor the structure of models in order to identify some idioms.

Visualization Efficacy Assessment Criteria
RIST does not specify how RISN models might be compared with data on the effectiveness of representations. Thus, in this paper we propose a method for visualization efficacy assessment that focuses on the information processing demands of using a visualization. A RISN model of a user's interpretation of the visualization is a claim about the memory structures that the user constructs whilst using the interpretation, which will involve many heterogeneous perceptual and cognitive processes. Therefore, it is not currently feasible to derive predictions about performance from the structure of a model. However, as larger and more complex models clearly will require more processing to navigate or to construct, we can compare the extent and complexity of alternative models to judge which is likely to have the greater mental information processing demands. We contend that one visualization will be more cognitively demanding than another informationally equivalent visualization when at least one of these comparisons is true, all else being equal:
1. The model has a greater number of schemas.
2. The breadth of the network of schemas is greater.
3. The depth of the network, from the root to the leaves, is greater.
4. The complexity of links between schemas is greater.
5. The overall structure of the network is less homogeneous.
6. More R-dimensions have concept and graphic object quantity scales that are not compatible.
If a model for representation A is better on some selection of these assessment criteria than a model for representation B, and they are comparable on the other properties, then representation A should have lower information processing demands. A clear differentiation of efficacy will be obtained when one model is exclusively better on some subset of these criteria and both models are similar on the remaining criteria. In a case where the superiority of criteria is distributed across the visualizations, no overall claim about their relative efficacy can be made. This has implications for the selection of pairs of visualizations for comparison, as considered next.
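The comparison rule can be sketched in code for the three criteria that are straightforward to count (number of schemas, breadth and depth). This is an illustrative Python sketch under stated assumptions, not a procedure defined by RIST: the functions `metrics` and `compare` are hypothetical, a model is reduced to a dictionary from schema names to child names, and breadth is taken to be the widest layer when each schema is placed at its longest distance from the root. Criteria 4 to 6 are not quantified here.

```python
from collections import Counter


def metrics(children, root):
    """Count schemas, depth and breadth of a model given as a DAG.

    children: dict mapping a schema name to a list of child names.
    Depth is the number of nodes on the longest root-to-leaf path.
    """
    level = {root: 1}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for child in children.get(node, []):
            # Keep the longest distance from the root for each schema.
            if level.get(child, 0) < level[node] + 1:
                level[child] = level[node] + 1
                frontier.append(child)
    widths = Counter(level.values())
    return {
        "schemas": len(level),
        "depth": max(level.values()),
        "breadth": max(widths.values()),
    }


def compare(a, b):
    """Apply the rule: a claim is made only if one model is exclusively better."""
    a_better = [k for k in a if a[k] < b[k]]
    b_better = [k for k in a if b[k] < a[k]]
    if a_better and b_better:
        return "no claim"  # superiority is distributed across the models
    if a_better:
        return "A less demanding"
    if b_better:
        return "B less demanding"
    return "comparable"


# Two toy models (invented for illustration only).
model_a = {"root": ["d1", "d2"], "d1": ["s1"], "d2": ["s1"]}
model_b = {"root": ["d1"], "d1": ["s1", "s2", "s3"], "s1": ["x"]}
result = compare(metrics(model_a, "root"), metrics(model_b, "root"))
```

For these toy models `result` is "A less demanding", because model A is smaller, shallower and narrower. A pair in which one model is shallower but the other has fewer schemas would yield "no claim".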

METHOD
The purpose of this study is to demonstrate the utility of RIST by creating RISN models for alternative visualizations where prior studies have shown that the visualizations, used on the same task, differentially impact the levels of performance of users. Will the predictions and explanations derived from RISN models match the empirical findings? Two case studies are presented in detail: (a) Sankey diagrams versus Chord diagrams [18]; (b) Larkin and Simon's [26] pulley problem. Two others are outlined in Section 6. This section addresses two methodological questions. (1) What constitutes suitable prior studies of alternative representations? (2) How should our study be conducted to ensure fair, unbiased comparison between the RISN models?
The first question is an issue because the literature includes many claims about the superiority of one class of visualization compared to another. For example, it is often asserted that bar charts should be used for interpretations of distinct values of data whereas line graphs are preferable for trend interpretations. However, line graphs display distinct values (i.e., the datapoints), and a succession of bars in a bar chart conveys a trend, so it is not surprising that empirical studies yield effect sizes that are small, although statistically significant (e.g., [36]). For the present study we require prior studies with unambiguous and robust findings that show that one representation is clearly superior to another. Such superiority may take three forms: (a) the effect holds consistently across different task types, different complexities of the two classes, and several different dependent measures of user performance (the study of Sankey/Chord [18] has this form); (b) the effect size is substantial across the two representations, e.g., by a factor of two (the study of representations of the pulley problem [25] has this form); (c) the effect size is not substantial but additional evidence is provided, such as a computational cognitive model or a detailed task analysis (we give two cases in Section 6).
Of course, RISN models can be built for representations where the findings are less definitive, such as [36], with the expectation that the performance differences predicted by the models would be small. But such a null result from the comparison of the models and findings provides little basis for judging the value of the RIST approach, because the absence of a difference could logically be attributed equally well either to the failure of the approach or to its success. Hence, the focus here is on studies with clear-cut findings.
The critical aspect with respect to the fairness of comparison resides with the modeling. We must not unwittingly bias the construction of the models knowing which visualization is empirically superior. Three approaches are taken here to reduce this likelihood. In the first approach, we adapt the four-stage method for constructing models provided by [36] to constrain the process of modeling. In the first stage we specify a common neutral context of use for the visualizations, so there is no possibility that a task specifically envisaged for one visualization is used. In the second stage, a description of the contents of the interpretation is written to delimit the range of concepts to be modeled prior to initiating modeling, in order to reduce the chance of introducing concepts tailored to just one of the visualizations. The third stage is to annotate all the task-relevant and arbitrary graphical objects in each visualization, again to reduce the chance of tailoring them to just one of the visualizations. The fourth stage is to combine bottom-up and top-down analysis strategies, by introducing R-symbols for the concrete lower level in parallel with R-schemes and R-dimensions for the most abstract concepts, so as not to favor visualizations that are more concrete or abstract.
The second approach is to consider models of alternative interpretations of a visualization in order to guard against inadvertently focusing on just one unrepresentative interpretation. As noted above (at the end of Section 2.1), the goals of the task within which a representation is being used will affect what is salient and hence would be included in a RISN model. Therefore, a modeler should at minimum make explicit those task assumptions but, ideally, they ought to build alternative models for other representative task goals and conduct the analysis across those models.
The third approach is for the modeler to be quite explicit about the level of expertise of the target users for whom the models are being built. The assumed level of their experience in the domain being represented, and of their visual literacy of visualizations being compared, should be similar for the sake of fairness.
Here the modeling assumes typical competent users, who have a good knowledge of the topic and who have good literacy in the visualizations. The presumed task goals are those that were given to the participants in the adopted studies, and no special sources of cognitive load imposition or reduction are introduced.
We now present the two detailed case studies.

SANKEY VERSUS CHORD DIAGRAMS
For the first case study, consider a comparison of two visualizations published in the 2023 proceedings of the CHI conference.

Empirical evidence
Gutwin and colleagues [18] compared reasoning with Sankey diagrams and Chord diagrams. Figures 2 and 3 show examples of these diagrams for the same (imaginary) data concerning the transfer of resources between various companies, which are representative of the material that [18] used. Companies are both sources and recipients of resources from each other. The transfers, ribbons, are either small or large. In the Sankey diagram the companies on the left are sources of resources and on the right they are recipients. The study by Gutwin [18] compared Sankey and Chord diagrams for four sets of data of various complexity, collected from 51 useable participants. The participants used the diagrams to answer questions in five classes: existence, find element, compare magnitude, minimum / maximum, count links. The results were clear cut: the Sankey diagrams were superior. Participants took substantially longer to answer questions and made more errors with Chord diagrams than Sankey diagrams. Subjectively, participants also rated Chord as more effortful, and they preferred Sankey diagrams. Gutwin's explanations of the difference in performance are given in terms of the visual and spatial properties of the two visualization formats [18]. The left to right organization of the Sankey diagram is clearer than the "disorganized" structure of the Chord diagram. The circular organization of the Chord diagram may have required more checking. The links in the Chord diagram appear to be harder for participants to follow for various reasons, such as the arrow heads being indistinct and the thinning of the middle of ribbons possibly making them hard to trace.
Will RISN models of Sankey and Chord diagrams concur with these empirical findings and might they provide additional explanations of the differences in participants' performance?

RISN Sankey model
Figure 4 is a model of the Sankey diagram. At the top is the Representation schema for the whole visualization (Figure 4, coordinate A12). The companies, their roles as sources or recipients, the transfers, and the levels of transfer are all quite visually prominent. Thus, the model considers the visualization to be comprised of four overarching related classes of concepts that are each encoded as an R-dimension. (1) There is a set of Companies which is encoded as an R-dimension (D9) with nominal quantity scales for both the concept and the graphic object. The companies are encoded as R-symbols below (F5-10). (2) The Role R-dimension (D6), with an ordinal scale, uses position as its graphic objects. It has two R-symbols (F2, F12) for the source and recipient roles in the left and right of the Sankey diagram. (3) The amount of transfer is encoded by an R-dimension (D16) with an ordinal scale and R-symbols (H15, H17) referring to ribbon width. (4) The last class of concepts are the transfers themselves: an R-dimension (I18) with a nominal scale whose graphic objects are the ribbons. The association of the four R-dimensions is encoded by the R-scheme (B12) as an implicit coordinate scheme idiom. The coordinate scheme has two functions. First, it identifies companies serving particular roles, with R-symbols for sources (H1-6) and for recipients (H8-13). Second, it gives each transfer (P1-19) a unique identity and magnitude by inheriting the values from the R-symbols for companies-role (H1-13) and the transfer quantity (H15, H17). Although not a trivial network, it has a coherent overall structure.

RISN Chord model
The appearance of the Chord diagram, Figure 3, emphasizes companies and the transfers among them. The direction of transfers features less prominently as the bases and heads of arrows are mixed together within each segment, and they are not consistently grouped together. The widths of the arrow heads and tails are obscured by their difference in shape, and the narrowing of the middle of the ribbons is also a distraction. The RISN model of the Chord diagram, Figure 5, reflects these observations. The overall structure is an R-scheme (B11) comprising a Companies R-dimension (D6) and a Transfers R-scheme for complex concepts (D15). The Companies R-dimension has text labels, color and arc segments of the circle's circumference as graphic objects. As with the Sankey diagram, the companies are R-symbols (F4-10). The Transfers R-scheme has three R-dimensions. The Direction R-dimension (F13) encodes the direction of transfer using two R-symbols (H12, H13) for the start and end represented by arrow tail and head shapes. The Quantity R-dimension (F15) encodes the amount of transfer as ribbon widths. It has three R-symbols; two (K14, K16) are the small and large amounts, and the other is a class R-symbol (K16) which encodes all the other possible widths of ribbons. Each company/circle segment has its own local group of arrow heads and tails, so each company R-symbol has a local anchored Role R-dimension (H4-10). Each Role R-dimension, in turn, has sub-R-dimensions for Sources and Receipts (K3-11), whose R-symbols are the individual source or recipient roles of a company (O1-16). For the sake of clarity, a place holder symbol (O9) is used to represent some of these R-symbols. Transfers are composed of pairs of these R-symbols; one example is given for the transfer from Under Armour to Adidas (ribbon label 3). This transfer inherits the concept of a small quantity of transfer (K14) via the company-role R-symbols of those companies (O5, O15). The link from the R-symbol (K16) for different widths of ribbons runs directly to the transfer R-symbol (Q18), because it is not associated with a company role or the tail or head of a ribbon. Again, for the sake of clarity, a class R-symbol represents all of the other transfers/ribbons R-symbols (Q18).
The ditto hexagons (G6-8, K6-8) are place holders for parts of the RISN network that are not explicitly shown. For example, the hexagons at coordinates H6 and K6 represent the three R-dimensions like their immediate neighbors to the left. Without these place holders, and also the use of the class R-symbols, the network would be far larger and include many more crossing links.

Discussion
What are the implications of the RISN model network structures in Figures 4 and 5 for the usability of the Sankey and Chord diagrams in Figures 2 and 3? Consider the extent and form of the networks. Visualization efficacy criteria are listed in Section 2.3 above. Table 1 summarizes the network properties for the two visualizations that relate to the criteria, with bold indicating the superior visualization. The approximate size of the networks as drawn is comparable; the Chord model has 38 schemas and the Sankey has 40. However, the Chord model includes place holders and class R-symbols that represent 44 hidden R-dimensions or R-symbols, so it has over double the number of schemas for a user to deal with. This will likely impact the relative effort required to search each diagram for symbols irrespective of the specific format of the visualization and will impact the cognitive load of distinguishing concepts in, or retrieving concepts from, memory. Transfer/ribbon R-symbols are leaves in both models. The longest paths in the networks between the root and leaves also differ; the Sankey has a maximum of six nodes whereas the Chord has eight. This may impact the relative ease of finding a ribbon in the diagrams or focusing conceptually on a transfer for a particular source and recipient pair. The greatest width of the Sankey model is 15 nodes at the leaf level (Figure 4, P1-19). The Chord model also has 15 nodes at its leaf level, but the layer above includes 30 nodes for all the source/recipient roles or ribbon heads/tails. Again, this implies the Chord diagrams will be more perceptually and inferentially challenging than the Sankey diagram. In terms of the form of the networks, the Sankey model is more coherent, with relatively fewer links that jump more than one level. Overall, the Sankey model is one large implicit coordinate system with four R-dimensions whose individual scopes cover the whole representation. All of its R-dimensions contribute properties to the leaf transfers/ribbons R-symbol. The Chord model includes nested sub-R-dimensions and local R-dimensions that are anchored under specific R-symbols (each circle segment contains its own unique collection of sources and recipients). In schema theoretic terms, as each standard (non-anchored) link in a RISN model is a conceptual association, the concepts associated with an R-dimension are inherited directly by their R-symbols. This means that Sankey diagrams may be more compatible with our minds' apparent method of processing the hierarchies of (categorical) concepts [32], in which attributes of superordinate concepts are inherited by their subordinate sub-concepts. In other words, inferences with Sankey diagrams may be better able to exploit such inherent automatic cognitive processing, whereas Chord diagrams may require more effortful, deliberate, conscious cognitive processing. Overall, the extent and form of the network structures suggest that Sankey diagrams will be easier to interpret than Chord diagrams, which will impact task performance.
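As a rough illustration of the kinds of network properties compared above (size, longest root-to-leaf path, greatest width, and links that jump more than one level), the following Python sketch computes them for a small rooted network. The function name `network_properties`, the `(parent, child)` link format, and the toy network are assumptions made for illustration only; they are not the RISN models of Figures 4 and 5, nor the exact definitions of the criteria in Section 2.3.

```python
from collections import Counter, defaultdict


def network_properties(links, root):
    """Summarize a rooted, acyclic network given as (parent, child) links."""
    children = defaultdict(list)
    for parent, child in links:
        children[parent].append(child)

    # Level of a node = number of nodes on the longest path from the root to it.
    # Levels are relaxed upwards until stable; this terminates for acyclic networks.
    level = {root: 1}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if level.get(child, 0) < level[node] + 1:
                level[child] = level[node] + 1
                stack.append(child)

    width_per_level = Counter(level.values())
    # A link "jumps" when it skips at least one intermediate level.
    jumping = sum(1 for p, c in links if p in level and level[c] - level[p] > 1)
    return {
        "size": len(level),                       # nodes reachable from the root
        "depth": max(level.values()),             # nodes on the longest root-to-leaf path
        "max_width": max(width_per_level.values()),
        "jumping_links": jumping,
    }


# Hypothetical toy network: two dimensions, three symbols, one shared leaf,
# plus one link from the root straight to the leaf.
toy_links = [
    ("root", "dimA"), ("root", "dimB"),
    ("dimA", "a1"), ("dimA", "a2"), ("dimB", "b1"),
    ("a1", "leaf"), ("b1", "leaf"),
    ("root", "leaf"),
]
toy = network_properties(toy_links, "root")
print(toy)
```

Assigning each node to the level of its longest path from the root is one possible convention; a different convention for levels would change the width and jump counts.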
The claim that Sankey diagrams are superior to Chord diagrams is consistent with Gutwin's overall finding [18]. However, the explanations provided by RIST and Gutwin [18] have different characters. The explanations in [18] are largely based on differences between the visual features of the Sankey and Chord diagrams and the complexity of the information encoding. As the schemas proposed by RIST contain information about graphic objects, RISN models can also be used to make explanations on that level. However, the particular value of RISN models is to elucidate the mental conceptual structures that constitute an interpretation, so the comparisons between the two representations were made in terms of the extent and form of their networks, from which performance implications could be drawn.
The modeling of a visualization depends on the analyst's conception of the user's interpretation, so Figure 5 is not the only feasible model of a correct interpretation of the Chord diagram. (See Section 7 for a discussion of interpretive correctness.) A plausible alternative is to consider the heads and tails of the ribbons as leaf R-symbols, rather than ribbons themselves: imagine an instructor describes heads and tails as properties of ribbons rather than a ribbon being composed of a head and a tail. The network of the model for such an interpretation is just as broad as Figure 5, but shallower (5 links). However, it includes 20 anchored R-dimensions rather than the seven in Figure 5, because each transfer is now its own context for identifying source and recipient. This alternative interpretation is also likely to be poorer than the one for the Sankey diagram that has no anchored relations between any of its schemas.

DIAGRAMMATIC VERSUS SENTENTIAL REPRESENTATIONS
For the second case study, consider the seminal work on the cognitive science of representational systems by Larkin & Simon [26] who explain why diagrams are (sometimes) superior to sentential notations.

Computational and empirical evidence
The study by [26] built cognitive models to examine information processing differences between diagrammatic representations and sentential representations of the same data. They analyzed two examples. We will consider their first case, which concerned a mechanics problem with a pulley system. Figure 6 shows a pulley problem system that is functionally equivalent to theirs. The question is: given that weight W1 is one unit, what is weight W2? Figure 7 is a sentential representation that encodes all of the information that is in Figure 6. In Larkin & Simon's [26] terms they are informationally equivalent representations because all the inferences that can be made in one can also be made in the other. They built production system (rule-based) models of the problem-solving process using the two representations. The models have inference rules, for instance: if a rope hangs over or under a pulley and the tension on one side is <X>, the tension on the other side is also <X>; for example, in Figure 6, if tension in rope p is 1, then tension in rope q is 1 also. The inference rules for both the diagrammatic and sentential models are equivalent. The models only differ in the indexing schemes they use to encode the information in their respective representations. The sentential model stores the sentences in a list structure, like Figure 7. The diagrammatic model indexes the information as a network of nodes that specify the associations of objects and ropes; for example, ropes p, q and x are associated with pulley A.
During problem solving both models search their stored information to match against the rules. When a matching rule is found its execution adds new information to the store. The matching process then begins again for another relevant rule, resuming from the sentence, in the sentential model, or the node, in the diagrammatic model, where the search last ended. This is where the critical difference between the two representations resides. For the sentential representation with its arbitrarily ordered list of sentences, the search laboriously examines sentences in order, matching each one against the rules. Thus, in the worst case, it may take a cycle through the whole list to find a complete set of matching sentences. In contrast, the diagrammatic model traverses the network of nodes by searching nodes that are adjacent to the last examined node. This typically yields matching rules, because pieces of information that are needed for inferences are often spatially co-located within diagrams. Thus, Larkin and Simon [26] conclude that one of the benefits of diagrams is such spatial indexing of information. Cheng [8] followed up the computational study with an experiment in which human participants solved pulley problems like Figure 6 and found that the task was solved approximately five times faster with the diagram compared to the sentential representation. Larkin and Simon's other example is in geometry problem solving and that model demonstrated the advantage of diagrams in relation to the recognition of information to match with inference rules [26]. These explanations of the cognitive differences in search and recognition between sentential and diagrammatic representations are a theoretical cornerstone in cognitive science studies of representations [20]. Will RISN models of the diagrammatic and sentential representations be consistent with the computational and empirical findings? Will they provide additional explanations of the impact of the representations on task performance?
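The indexing contrast described above can be illustrated with a minimal Python sketch. It is not Larkin and Simon's production system: the facts, the object names, and the functions `sentential_lookup`, `build_index` and `diagrammatic_lookup` are hypothetical, and the sentential search is simplified to a full scan of the list for each lookup rather than a search that resumes where the last one ended. The sketch only shows that both stores return the same information while examining very different numbers of items.

```python
from collections import defaultdict

# Hypothetical facts in the spirit of a pulley problem (not the content of Figure 7).
# Each fact is (predicate, object, ...), stored in an arbitrary order.
FACTS = [
    ("pulley", "A"),
    ("rope", "q"),
    ("hangs", "W1", "p"),
    ("weight", "W1"),
    ("over", "p", "A"),
    ("rope", "p"),
    ("over", "q", "A"),
    ("hangs", "W2", "q"),
]


def sentential_lookup(facts, obj):
    """List indexing: every sentence is examined to find those mentioning obj."""
    examined = 0
    found = []
    for fact in facts:
        examined += 1
        if obj in fact[1:]:
            found.append(fact)
    return found, examined


def build_index(facts):
    """Location-style indexing: each object points directly at its own facts."""
    index = defaultdict(list)
    for fact in facts:
        for obj in fact[1:]:
            index[obj].append(fact)
    return index


def diagrammatic_lookup(index, obj):
    """Only the facts attached to obj are examined."""
    found = index.get(obj, [])
    return found, len(found)


# An assumed chain of attention from the known weight to the unknown weight.
TRACE = ["W1", "p", "A", "q", "W2"]
index = build_index(FACTS)
sentential_cost = sum(sentential_lookup(FACTS, obj)[1] for obj in TRACE)
diagram_cost = sum(diagrammatic_lookup(index, obj)[1] for obj in TRACE)
print(sentential_cost, diagram_cost)
```

With these toy facts the list-based store examines all eight sentences on each of the five lookups, whereas the indexed store examines only the facts attached to the current object.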

Diagrammatic representation
Figure 8 is the RISN model for the interpretation of the diagrammatic representation of the pulley system problem. The overall structure of the model is an R-dimension (C10) for various classes of physical objects. For most objects there are nominal scale sub-R-dimensions: weights (E3), ropes (E10), and pulleys (E17). Additionally, the ceiling is simply an R-symbol (E15) as it is solitary. Each of these classificatory R-dimensions has sets of R-symbols for the individual objects: weights (G2-4); ropes (G6-14); pulleys (G16-19). The weight R-dimension also has an anchored R-dimension for mass values, which is a ratio scale quantity. Most of the leaves of the model are R-symbols for systems that associate ropes with bodies (I4-15). There are also R-symbols as leaves for the values of the weights (I2, I4). Overall, the network of schemas is relatively straightforward.

Sentential representation
Although not a visualization, a RISN model of the sentences in Figure 7 can be built by treating component names, values, predicates and sentences as graphic objects. Clearly, they all have associated concepts, so are R-symbols. Figure 9 is such a model that presumes that sentences are composed of sequences of predicates and names. The names, values and predicates, as graphic objects, are referred to directly. Sentences are referred to by their number. The overall form of the model is straightforward with an overarching R-scheme that has four R-dimensions. The predicate R-dimension (C1) is a nominal classification of the predicates, which are R-symbols (M1-G5). The labels R-dimension (C11) is a nominal classification of physical objects, which are R-symbols (F6-N16). The values R-dimension (C17) is a ratio scale for the mass of the weights, which are R-symbols (N17, O18). The list R-dimension is an ordinal scale that differentiates the sentences, which are R-symbols (P1-Z18). The octagonal place holder (D4) under the List R-dimension stands for all 26 links that should be drawn to the sentence R-symbols, but are omitted for clarity. The links in the model encode how each sentence is composed of selected predicates, labels and values. Some effort has been made to arrange the icons in Figure 9 to minimize the crossing of links, but clearly the model for the sentential representation has a tangled mass of links compared to the model for the diagrammatic representation in Figure 8.

Discussion
Models for the diagrammatic and sentential representations in Figures 8 and 9 capture the conceptual and graphical structure of Figures 6 and 7, respectively. Table 1 summarizes the network properties of the two models. The diagrammatic and sentential RISN models are comparable in their number of higher order schemas (not tabulated) and the depth between the root and leaves. However, in all other respects the properties of the diagrammatic RISN model indicate that the diagram is superior. The size of the difference in complexity between the two models matches our intuitive sense that comprehending the diagrammatic representation should be easier than the sentential one and, critically for our present argument, it is consistent with Larkin and Simon's [26] claim and the experimental study [8].
The purpose of a RISN model is to capture the interpretive structure that users develop in memory as they perform a task with a target representation. The links between schemas are cognitive associations. So, the models can be used to examine the order in which information is attended to during problem solving, something that was not considered in [10,13,37]. In the pulley problem, we examine paths along the links between the schemas, starting from the known weight and following through to the unknown weight. This traces the problem solver's search for information to match inference rules. In the RISN diagrammatic model the problem solver would start at the W1=1 R-symbol (I1) and attempt to reach the unknown weight (I3), applying inference rules along the way. The first steps of the path would pass through weight W1 (G2), weight system W2 (G4), rope p (G6), and pulley-system A. As there are two other links to rope R-schemas departing from pulley system A, both paths would need to be explored. The problem solving with the sentential model uses the labels and values to traverse through successive sentences. The problem solving would start at sentence 1 (N1), which is the given value, and follow the links through to sentence 13 (Y8), weight W1 (M15), sentence 3 (R2), rope p (E7), then sentence 16 (X11) that is a pulley system, and so forth. The full problem-solving paths through Figure 8 and Figure 9 mimic Larkin and Simon's [26] production system diagrammatic model and highlight the substantial difference in the number of associations between concepts that must be handled with the sentential representation.
The most notable structural difference between the two models is the neatness of the coordination of the information about the physical objects in the diagrammatic model compared to the mass of links in the sentential model. The general forms of Figures 8 and 9 stand as iconic images of the benefits of locational indexing in diagrams versus the distributed indexing of information across sentential representations.

OTHER CASE STUDIES
Two other case studies have been conducted. The overall patterns of analysis and outcomes are similar to the detailed comparisons above, so they are just summarized.

Function versus phase plot line graphs
This case study provides a greater challenge to RIST than the previous two, because it compares two alternative designs of the same class of visualization, line graphs, rather than different classes of visualizations (Sankey vs. Chord) or wholly different classes of representations (diagrams vs. sentences). In a study that combined computational modeling of eye movements and conventional task performance measures, Peebles [29] found phase plot line graphs to be superior to function line graphs for time series data for two variables, each with ten values. In the function line graph the x-axis was time and the two variables were plotted on the y-axis. In the phase plot the variables were each plotted on one axis, with datapoints labelled with timestamps. The function graph has two curves and the phase plot one. Across three types of value lookup questions, user accuracy on the two visualizations was comparable. The response times for the phase plot were reliably faster, statistically, but the differences were not substantial. RISN models were created for both visualizations. Both naturally have coordinate schemes which index the data, but they differ in their encoding of two variables, which means that the function graph has double the number of schemas for variable values. The properties of the RISN models are listed in Table 1. Consistent with the findings of [29], the contrast between the models suggests that the phase plot should be superior. However, the advantage is more marginal across the properties, compared to the previous two cases, and similar for the link complexity and homogeneity. This more marginal difference reflects the relatively weaker effects found in [29].

Conventional electric circuit representations versus AVOW diagrams
This case study is interesting because it involves a visualization that was specifically designed to be superior for problem solving and learning to the conventional visualization and notation of the target domain. The domain is electrical circuit analysis. Students typically learn to analyze circuits by drawing a circuit diagram and manipulating equations for the laws of electricity. A novel visualization for electric circuits was designed that uses nested rectangles to encode the properties and laws governing the individual components and networks [7,12]. The visualization is called AVOW diagrams (Amps, Volts, Ohms, Watts). The properties of a rectangle represent electrical properties (i.e., height-voltage (V), width-current (I), gradient of the diagonal-resistance (r), area-power (P)). The geometric relations of rectangles encode Ohm's law (height/width = gradient ⇔ V/I = r) and the Power law (area = height * width ⇔ P = V*I). A model of a circuit has one AVOW rectangle for each resistor (or subnetwork) in the circuit and the rectangles are assembled as a stack for series resistors and side-by-side for parallel resistors, in a recursive fashion. A correct model must have no gaps or overlap of AVOW rectangles, which encodes Kirchhoff's laws for the conservation of current and distribution of voltage. Empirical studies in the lab [7] and in schools [12] showed that students could learn to solve electric circuit problems with 120 minutes of instruction with AVOW diagrams and performed substantially better than students given the same time learning with the conventional representations. Notably, students using AVOW diagrams could solve challenging transfer problems that students under the conventional approach were barely able to start. RISN models were built for the two representations, AVOW diagrams versus circuit diagram plus equations, for a circuit with three resistors: two in parallel that are together in series with the third. The AVOW diagram RISN model consists of a coordinate system for the electrical properties and a coordinate system for the configuration of rectangles that identify the properties of individual and composite AVOW rectangles. The overall structure is not particularly complex and highlights how each component in an AVOW diagram simultaneously encodes component identity, properties, and relations among subcomponents. The RISN model for the other approach has two sub-representations, for the circuit diagram and the equations. The sub-representation for the circuit diagram is rather like the RISN model of the pulley system problem (Figure 8), and the sub-representation for the equations is like the sentential representation of the same problem (Figure 9). The two sub-representations share variables, so the model has many links running from both sub-representations to the R-schemas for the variables. This model is more substantial and complex than the AVOW diagram model: their RISN model properties are noted in Table 1. The interpretation of the RISN models is in accord with the empirical studies in [7,12].
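The AVOW encoding described above can be made concrete with a short Python sketch for the three-resistor circuit (two resistors in parallel, together in series with a third). The class `AvowBox`, the functions `series` and `parallel`, and the component values are hypothetical choices for illustration and are not taken from [7,12]. Height stands for voltage and width for current, so resistance is the gradient and power is the area; stacking requires equal widths and side-by-side placement requires equal heights, which is the "no gaps or overlap" constraint.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AvowBox:
    voltage: float  # height of the rectangle (V)
    current: float  # width of the rectangle (I)

    @property
    def resistance(self) -> float:
        # Ohm's law: gradient of the diagonal = height / width
        return self.voltage / self.current

    @property
    def power(self) -> float:
        # Power law: area = height * width
        return self.voltage * self.current


def series(top: AvowBox, bottom: AvowBox) -> AvowBox:
    """Stack two boxes: the same current flows through both, voltages add."""
    if not math.isclose(top.current, bottom.current):
        raise ValueError("stacked boxes must have equal width (current)")
    return AvowBox(top.voltage + bottom.voltage, top.current)


def parallel(left: AvowBox, right: AvowBox) -> AvowBox:
    """Place two boxes side by side: the same voltage across both, currents add."""
    if not math.isclose(left.voltage, right.voltage):
        raise ValueError("side-by-side boxes must have equal height (voltage)")
    return AvowBox(left.voltage, left.current + right.current)


# Assumed example values: 6 and 3 ohm resistors in parallel, in series with 2 ohms,
# across a 12 V supply.
r1 = AvowBox(voltage=6.0, current=1.0)   # 6 ohms
r2 = AvowBox(voltage=6.0, current=2.0)   # 3 ohms
r3 = AvowBox(voltage=6.0, current=3.0)   # 2 ohms
circuit = series(parallel(r1, r2), r3)
print(circuit.voltage, circuit.current, circuit.resistance, circuit.power)
```

Because the composite box is itself an `AvowBox`, the same two operations can be applied recursively to larger networks, mirroring the recursive assembly of rectangles described in the text.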

OVERALL DISCUSSION
Representational Interpretive Structure Theory, RIST, is an account of the mental representations that users of visualizations may build when they interpret a representation. The four case studies of pairs of informationally equivalent visualizations and representations are a demonstration of the utility of RIST. Table 1 summarizes the properties of RISN model structures that were built for the case studies. The properties, as claimed in Section 2.3, are indicators of the relative information processing efficacy of visualizations. We have a positive answer to the first question from the Introduction: in all four cases, the visualizations that were found to be superior in the prior empirical and computational studies are also superior in our analysis. This is true across most of the criteria; and where they are not superior, they are at least similar. We can consider these predictions to be unequivocal as all the criteria point to the same representation, or are equal, so there is no requirement to consider interactions among the criteria that would weaken the judgement (see Section 2.3). In turn, the consistency of this finding over the four cases gives some reassurance that the RISN model visualization efficacy criteria proposed in Section 2.3 are reasonable.
The second question from the Introduction, which concerns the comparability of the explanation derived from the RISN models and the explanation given with the original studies, can be answered affirmatively for the pulley system problem case. Paths in RISN models can be used to trace the likely patterns of search for information to match inferences that occur during problem solving, as described in Section 5.4. This suggests an intriguing possibility. RISN models could be used as visualizations of the patterns of the information processing steps followed by computational cognitive models of problem solving with visualizations, and thus make such computational models more accessible to less specialist users (cf. [29]).
An answer to the third question, about the potential of RIST to provide new insights into the efficacy of visualization in information processing terms, is suggested by the comparison of the case studies. For the pulley diagram-sentential representation pair, the high-level conceptual structure of the interpretation is similar, but they differ in terms of the number of R-symbols and their associated links (compare the top half of Figures 8 and 9, and their bottom halves). This suggests the diagram and list of sentences are relatively similar in abstract representational terms, despite their very different formats, and that differences in their impact on cognitive performance might primarily be attributed to the number of R-symbols and links. In contrast, which speaks to our third question, the overall conceptual structure of the RISN models of the Sankey-Chord pair is quite distinct between the root and the leaves (compare the middles of Figures 4 and 5). Interestingly, this implies that Sankey and Chord diagrams are dissimilar in abstract representational terms despite them both being diagrams and certainly more comparable in format than the pulley diagram and list of sentences. This suggests that the relative impact on cognition of Sankey versus Chord diagrams should be attributed more to the underlying differences in their information encoding structures than their surface level perceptual attributes; cf. [18]. In other words, RIST appears to provide a basis for the analysis of the inherent nature of visualizations that is not tied to the specific descriptions of domain content (data structures) nor to descriptions of the visual form of visualizations.
The form of the RISN models depends on how each modeler conceptualizes how the users interpret the visualizations for a given task. For instance, an alternative model for the Chord diagram was mentioned in Section 4.4. Section 3 outlined how we attempted to make the process of model building as unbiased as possible. Nevertheless, future work should include empirical studies with modelers beyond the authors of this paper. In particular, tests of RIST should be conducted with modelers who are unaware of the known differences in efficacy of the alternative visualizations.
There are good reasons for thinking that RISN modeling is far from arbitrary, because a model has multiple sources of constraints. The domain and task goal determine what concepts are relevant. The visualization has conventions about what graphic objects are actual symbols and how they are related, and some graphic objects must be considered just because they are visually salient. A fundamental purpose of RISE is precisely to operationalize the construction of models in RISN that conform to interpretive relations that are theoretically permitted under RIST. Under these constraints, the degrees of freedom for model construction are limited and it may be hypothesized that alternative models of the same domain and visualization may just reflect correct variations of interpretation. This hypothesis will be an important focus for future studies. Correctness is conformity to the concepts of the domain and visualization conventions that, if violated, would result in a model of a misinterpretation. Thus, the RIST approach provides the interesting possibility to study how users may fail to understand visualizations and how to reduce that risk.
For the sake of clarity about the scope of the findings, it has been shown that RISN models are predictive of cognitive performance, in information processing terms, but no claims are made about the implications for other cognitive factors. In the case of cognitive load and visual literacy, they were taken as areas in which explicit modeling assumptions had to be specified. However, as RISN models identify what concepts and graphic objects are salient, and elaborate the memory structures that organize them, it is intriguing to speculate whether alternative and richer criteria to those given in Section 2.3 might be predictive of the cognitive load or visual literacy demands of a representation, or even factors such as discoverability or memorability. These are areas for potential future investigation.
A limitation of our extension to RIST, in terms of the proposed efficacy assessment criteria (Section 2.3), is that the criteria cannot be used to derive estimates of the cognitive performance costs associated with individual models of interpretations. The present study was feasible because it compared representations whose models happened to be consistently differentiated in the same direction on a subset of the criteria, with all the other criteria being similar. Future work should investigate what performance estimates might be obtained from the analysis of the detailed structure of RISN models, and how.
To motivate the importance of interpretation in the study of information visualization, an analogy to the field of linguistics was made in the Introduction. To conclude, one might wonder about the relative maturity of the study of interpretation in linguistics compared to information visualization. One reason may reside in the underlying sequential structure of textual and verbal representations. Sentences are linear concatenations of symbols, mainly words, whose meaning is embedded in grammatical rules of language. In comparison, visualizations and diagrams are inherently multidimensional in nature. In addition to the exploitation of two-dimensional spatial and geometric properties for representational purposes, visualizations also deploy shape, pattern, and color dimensions. So compared to text, the diversity of visualization forms is enormous. And, unlike the definition of linear concatenation for sentential representations, there is no simple underpinning definition for all classes of visualizations, although Larkin and Simon [26] propose one for diagrams (see section 4 above). Clearly, investigations of interpretations in information visualization will be more challenging than they are in linguistics.

Figure 1: (A) Date and time display. Blue labels are annotations. (B) A RISN model of the display. Green labels are names of types of schemas. The graphic object codes at the bottom of schemas may be annotations (e.g., DAT, SCL), descriptions of graphic objects in parentheses (e.g., "(String)"), or facsimiles of the graphic object in double quotes (e.g., "Apr 15 | 06:10").

Figure 2: A Sankey diagram. The light blue labels are annotations for the RISN models.

Figure 3: A Chord diagram. The light blue labels are annotations for the RISN models.

Figure 6: Diagrammatic representation of the pulley system problem.

Figure 7: Sentential representation of the pulley system problem.

Figure 8: RISN model of the interpretation of the pulley system diagram.

Figure 9: RISN model of the interpretation of the sentential representation of the pulley system.

Table 1: Properties of the RISN model network for the four pairs of visualizations and representations in the case studies. For each property in each pair, the value that implies the superior visualization is in bold.