Generic Programming with Extensible Data Types: Or, Making Ad Hoc Extensible Data Types Less Ad Hoc

We present a novel approach to generic programming over extensible data types. Row types capture the structure of records and variants, and can be used to express record and variant subtyping, record extension, and modular composition of case branches. We extend row typing to capture generic programming over rows themselves, capturing patterns including lifting operations to records and variants from their component types, and the duality between case blocks over variants and records of labeled functions, without placing specific requirements on the fields or constructors present in the records and variants. We formalize our approach in System R𝜔, an extension of F𝜔 with row types, and give a denotational semantics for (stratified) R𝜔 in Agda.


INTRODUCTION
The goal of extensible data types is to bring type safety to modular software development. Row types [Rémy 1992; Wand 1987] are one approach to that goal. Rows express the structure of records or variants; row polymorphism captures properties like subtyping while maintaining a purely parametric approach to typing. Row typing was originally designed to model object-oriented inheritance, but its applications include: extensible variants in OCaml [Garrigue 1998]; extensible effects [Lindley and Cheney 2012]; typing algebraic effects and handlers [Hillerström and Lindley 2016; Leijen 2014, 2017]; and extensible protocols in session types [Lindley and Morris 2017].
This paper explores generic programming over rows. Consider defining equality functions for extensible records. Of course, given a particular set of fields, and knowledge of how to compare the field types, existing row type systems can express the equality function for records of those fields. Even with metaprogramming support, however, having to explicitly define equality functions for each record type (and each extension of a record type) creates a significant burden for programmers, a disadvantage for an approach designed to encourage this style of programming! Moreover, approaches that depend on particular sets of fields cannot extend to row polymorphism, a key contributor to the expressiveness of row types. While we could express the extension of a particular record type, we could not (modularly) express that such an extension supports equality.

  Eq : ★ → ★
  Eq = λt. t → t → Bool
  eq_Σ : ∀z : R^★. Π(Eq z) → Eq (Σz)
  eq_Σ = λd v w. ana (λl y. (case l (λx. sel d l x y) ▽ const False) v) w

Fig. 1. Comparing extensible variants in R𝜔
We propose novel record and variant operations, generic in the particular labels that appear in those records and variants, and realize these operations in System R𝜔, a core calculus that extends System F𝜔 with row types based on Rose [Morris and McKinna 2019]. Consider the equality function for extensible variants: if we know how to compare the values at each constructor in two variants, we ought to know how to compare the variants. Figure 1 captures this idea in R𝜔; we have elided type abstractions, applications, and annotations on bound variables, as they can be inferred from the given type signatures. Type operator Eq maps types to equality operators for those types. Function eq_Σ compares two variant values v and w, given a record d of comparison operators for their fields. Suppose that z is instantiated with the row {a ⊲ Int, b ⊲ List Bool}: d will be a record of comparison functions Π{a ⊲ Int → Int → Bool, b ⊲ List Bool → List Bool → Bool}, and v and w will each be variants Σ{a ⊲ Int, b ⊲ List Bool}. (We follow Pottier and Rémy [2005] in implicitly lifting operators on types, like Eq, to the corresponding operators on rows.) Our key novelty is the ana combinator: ana f w analyzes variant w, calling f with its constructor label l and contents y. With these in hand, we can then rely on the variant branching combinator (▽) of Rose: in case v is constructed with label l, we select from d the l-labeled function, and use it to compare the contents x and y of the two variants; otherwise, regardless of the contents of v, we can return False. We will return to each component of this definition in the remainder of the paper.
The generic operations in R𝜔 build on the row type theory Rose. Rose is distinguished from other row type theories by two features. First, Rose uses qualified types [Jones 1994] to capture the structure of row types, rather than incorporating the structure of rows directly into the types of records and variants. This indirection makes it possible to capture structural invariants in Rose that are difficult or impossible to capture in other row type systems. We rely on this expressiveness in typing the combinators in R𝜔: the function argument to ana, for example, is typed given the assumption that l labels a value of type u in row z. Second, Rose builds on a general account of rows as partial monoids, encompassing a variety of different row type theories in the literature. While we will fix a particular theory of rows in our formalization of R𝜔, we will show how our account would generalize to other theories of rows as well.
One way to realize the behavior of eq_Σ would be to treat (hashes of) labels as keys at runtime: variants would be labeled by these keys and records would be dictionaries over keys. While direct, this approach is neither practically nor theoretically satisfying: we would be disappointed to learn that selecting a field from a record was not a constant time operation, and it would be difficult to show that well-typed operations only relied on keys that were dynamically present in records or case blocks. We will show that R𝜔 has a type-safe implementation with no runtime comparison or manipulation of labels, guided by the use of predicates in the types of the generic operations. To do so, we will give a denotational semantics for universe-stratified R𝜔 typing derivations in Agda, in which rows are interpreted as functions from finite naturals to types, row inclusions and combinations are witnessed by maps from finite naturals to finite naturals, and we have static guarantees that our indexing of records and variants is well-typed.
To summarize, this paper contributes:
• The extension of Rose to generic programming over rows (§3), particularly the design of combinators that express generic transformations of row-typed products and sums;
• A formalization of our approach in the R𝜔 calculus (§4), which extends System F𝜔 with Rose-style row typing, first-class labels, and generic programming over rows; and,
• The denotation of R𝜔 derivations in Agda (§5), showing that R𝜔 is sound and need not introduce runtime manipulation or comparison of labels.
We begin with a review of extensible datatypes in Rose (§2) and conclude with discussions of related (§6) and future (§7) work.

EXTENSIBLE DATATYPES AND THE ROSE TYPE THEORY
The goal of row typing is to support type-safe extensible data types. This section gives an intuitive overview of row typing, and the Rose type system in particular, preparatory to its extension in the following section.

The Need for Extensibility
Existing functional language type theories are remarkably expressive, and further additions are rightly viewed with some suspicion. We begin with two examples of the additional value of extensible data types.
The expression problem. Wadler [1998] describes the expression problem as "a new name for an old problem". Consider an abstract data type along with several operations. For example, we could have a simple type for arithmetic expressions, consisting of constants and sums, along with an operation to reduce expressions to integer values. The challenge is to extend this in two dimensions (say, by adding a new constructor for products, and a new operation to print expressions as character strings) without rewriting or recompiling existing code, and without compromising type safety. In modern functional languages, adding new operations is easy, but adding new constructors requires changing the original type definition and all the existing definitions. In object-oriented languages, adding new cases is easy but adding new operations requires changing the base class and all of its inheritors. Programmers in either camp must resort to encoding tricks to capture the remaining case, making code more difficult to read and maintain.
Modular transformations. The expression problem may not seem entirely compelling: why artificially restrict a common refactoring operation to preserve existing code? As an alternative view of the same problem, consider desugaring or optimization passes in a compiler. We might hope to limit many of our passes to operating on a subset of the whole language [Keep and Dybvig 2013; Sarkar et al. 2004]: for example, a single pass might resolve infix applications, while not changing the remainder of the syntax tree. In writing these passes, we would like to make them generic over the untouched (or only recursively transformed) parts of the syntax tree. This will both make the compiler more readable and maintainable, and provide type-based guarantees of the limited scope of these passes. This problem is essentially the dual of the expression problem: instead of planning to extend our AST, we hope to write passes without fixing most of the AST.

Row Types
Consider the type of a function that selects the field x from a record. We might write this function λr. sel r x, where sel is our function for selecting record fields, and we use the teletype font to distinguish label constants from label variables. We can imagine many record types which might contain an x field (points on a plane, or in space; pixels on a screen; nodes for lambda expressions in the AST of a functional programming language) and in each case, x might have a different type. A general type for this function ought to encompass all its possible arguments, associating each with the corresponding result type. This problem, along with its dual for variants, is the starting point for row type systems.
Rows and row polymorphism. A row is an association of labels to types. For example, we write {x ⊲ Double, y ⊲ Double} for the row that associates both the labels x and y with the type of double-precision floating point numbers. Record and variant types are constructed from rows; for example, a type for Cartesian coordinates is Π{x ⊲ Double, y ⊲ Double}, while a more general type for points is Σ{Cart ⊲ Π{x ⊲ Double, y ⊲ Double}, Polar ⊲ Π{r ⊲ Double, theta ⊲ Double}}.
Just introducing rows gets us little closer to solving our initial problem: we can say that λr. sel r x could have type Π{x ⊲ Double} → Double or Π{x ⊲ Int, y ⊲ Int} → Int, but these are not a general account of its behavior. Instead, this function should have a polymorphic type. In many row type systems [Rémy 1989; Wand 1987], its type would be written similarly to ∀t z. Π{x ⊲ t | z} → t. The syntax {x ⊲ t | z} denotes the extension of row z with the field x ⊲ t. As a whole, the type denotes a function from a record containing any fields z, and also x ⊲ t, to a value of type t. Similarly, a function that added a new x field to an existing record could be given a type like ∀t z. t → Πz → Π{x ⊲ t | z}.
This account of row types leaves several questions. First: in the types above, can the instantiation of z already include an association for x?
• Wand [1987] allows free instantiation of z; extension is then interpreted as overwriting the existing meaning of fields (in both types and terms).
• Rémy [1989] uses the kind system to preclude conflicting meaning of fields, but must introduce a new kind to capture functions which can either overwrite or extend objects.
• Berthomieu and le Moniès de Sagazan [1995] and Leijen [2005] allow free instantiation of z, and interpret extension as shadowing the existing meaning of fields, such that the original meaning can be recovered later.
Second: does this account generalize from single field extension to arbitrary concatenation of objects? For example, given two records, one of location data and one of color data, can we combine them to form a single record of colored location (or located color) data?
Polymorphism and predicates. Wand [1989] proposes the following term as a test of row type systems with record concatenation:

  λm n. sel (m ++ n) k

Here m and n are arbitrary records, and the function projects the field k from their concatenation. (In Wand's original example, m and n were records of method implementations, k is a method name, and the term as a whole models multiple inheritance.) The crux of the problem is that if we have to assign a type to either m or n that already commits to field k, then we have over-specified the behavior of the function. On the other hand, if we do not commit to either m or n containing field k, then how can we be sure the function is well-defined at all?
This is the starting point for the Rose type theory [Morris and McKinna 2019]. Instead of capturing the structure of rows directly in the types of records and variants, Rose captures them using predicates in qualified types. For example, in Rose, the type of the x-selection function would be expressed as

  ∀t z. {x ⊲ t} ≲ z ⇒ Πz → t

That is: this is a function that maps z-shaped records to t results, for any types t and z, such that the singleton row {x ⊲ t} is contained in z. Rose supports concatenation of records via predicates as well. The type for Wand's example in Rose is:

  ∀t z1 z2 z3. (z1 ⊙ z2 ∼ z3, {k ⊲ t} ≲ z3) ⇒ Πz1 → Πz2 → t

That is: this is a function that maps a z1-shaped record and a z2-shaped record to a t result, such that z1 and z2 can be concatenated to give row z3, and z3 contains the singleton row {k ⊲ t}. This type captures the full generality of Wand's challenge: we do not overconstrain either m or n to always provide field k, but still guarantee that the projection will always be well-defined.

Fig. 2. Typing of record and variant operations in Rose

Figure 2 gives the typing rules for the record and variant operations in Rose. The projection and injection operators are the generalizations of record selection and variant construction; each relies on being able to prove that one row is contained in another. Branching (M1 ▽ M2) is dual to record concatenation (M1 ++ M2): it combines eliminators for two variants to give the eliminator for their combination. As with concatenation, it relies on being able to prove that the two smaller rows can be combined.

Theories of rows. As Rose does not commit directly to the structure of rows, but abstracts their structure via the containment (ρ1 ≲ ρ2) and combination (ρ1 ⊙ ρ2 ∼ ρ3) predicates, it can be adapted to any of the different notions of row extension:
• To capture non-overlapping rows: we stipulate that ρ1 ⊙ ρ2 ∼ ρ3 is only satisfiable when ρ1 and ρ2 have no fields in common. For this approach, we can define ρ1 ≲ ρ2 to hold either when there is some ρ′ such that ρ1 ⊙ ρ′ ∼ ρ2 or when ρ′ ⊙ ρ1 ∼ ρ2.
• To capture overwriting: ρ1 ⊙ ρ2 ∼ ρ3 is always satisfiable, where ρ3 reflects ρ1 for any labels that appear in both. We can define ρ1 ≲ ρ2 to hold exactly when there is ρ′ such that ρ1 ⊙ ρ′ ∼ ρ2. (On the other side, when ρ′ ⊙ ρ1 ∼ ρ2, we cannot necessarily recover fields in ρ1 from the combination ρ2 because they may have been overwritten by fields in ρ′.)
• To capture shadowing: ρ1 ⊙ ρ2 ∼ ρ3 is always satisfiable, and we get two containment predicates, ρ1 ≲L ρ2 ⇐⇒ ρ1 ⊙ ρ′ ∼ ρ2 and ρ1 ≲R ρ2 ⇐⇒ ρ′ ⊙ ρ1 ∼ ρ2, with corresponding injection and projection functions.
Rose itself is defined generically over a row theory, which defines the underlying structure of rows and interpretation of row predicates. So, Rose encompasses all of the above cases, as well as both simpler (e.g., unlabeled) and more complex (e.g., modules) cases.
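As an informal illustration of the three notions of row combination listed above, the following untyped Python sketch models records as dictionaries and gives one combination function per row theory, together with Wand's test term. The function names are hypothetical, and the dictionary model only mimics at runtime what the row predicates decide statically; it is not the semantics developed in this paper.

```python
# Records are modelled as plain dicts from label strings to values.
# All function names below are hypothetical, chosen for this sketch.

def combine_disjoint(r1, r2):
    """Non-overlapping rows: combination is undefined on shared labels."""
    shared = r1.keys() & r2.keys()
    if shared:
        raise ValueError(f"labels present on both sides: {sorted(shared)}")
    return {**r1, **r2}

def combine_overwrite(r1, r2):
    """Overwriting: always defined; the left operand wins on shared labels."""
    return {**r2, **r1}

def combine_shadow(r1, r2):
    """Shadowing: always defined; both meanings are kept, left one first."""
    out = {}
    for label in list(r1) + [l for l in r2 if l not in r1]:
        out[label] = [r[label] for r in (r1, r2) if label in r]
    return out

def wand(m, n, k, combine=combine_disjoint):
    """Wand's term: project field k from the concatenation of m and n."""
    return combine(m, n)[k]

location = {"x": 1.0, "y": 2.0}
colour = {"hue": 120}
print(wand(location, colour, "hue"))
print(combine_overwrite({"x": 1}, {"x": 2, "y": 3}))
print(combine_shadow({"x": 1}, {"x": 2, "y": 3}))
```

Note that wand commits neither m nor n to providing k; in the sketch a missing k surfaces as a runtime KeyError, which is exactly the failure that the containment predicate rules out statically.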

Open Problems in Extensibility
Despite Rose's expressiveness, it is still limited in how it describes individual rows. Rose can capture the structure of rows, but it has no predicates that capture properties of the types in a row. This limitation has several consequences.
If we know that every type in a variant or record supports equality comparisons, we should expect that the variant or record supports equality comparison as well. However, even expressing this problem is not possible in Rose: the constraint we need to express is on the types that appear in the row, not on the structure of the row itself. The problem recurs when considering higher-order polymorphism. Recall the example of modular AST transformations (§2.1). To maximize flexibility and readability, the pass that transforms infix to prefix applications should not constrain the remainder of the syntax tree. However, this transformation is not only applied at the top level of expressions or definitions; it must also be applied recursively, regardless of the other nodes in the AST. This, in turn, implies some constraint (such as functoriality) on the remainder of the AST, which cannot be captured in Rose.
We know that records and variants enjoy strong duality properties: a case expression eliminating a variant corresponds to a record of functions, containing one (appropriately typed) function for each branch in the case expression. This duality is not just of theoretical interest. For example, in implementing a system of algebraic effects and handlers [Plotkin and Power 2003; Plotkin and Pretnar 2009], we could represent effectful computations as abstract syntax trees over operations, and handlers as records of implementations of those operations. We might then hope to define a general handling combinator, which combines an effectful computation with an appropriate handler. However, we cannot implement this operation in Rose: while we can use the same row variable to describe both records and variants (i.e., both computations and their handlers), the branching and projection operators all refer to specific labels.
Existing row type theories address some of these problems. Blume et al. [2006] distinguish case blocks from functions, and realize case blocks by records of functions in their semantics. However, this step in the semantics is not available to programmers. Pottier and Rémy [2005] implicitly lift operations on types to operations on rows: if z is a row of associations ℓ_i ⊲ τ_i, then z → τ is the row of associations ℓ_i ⊲ τ_i → τ. They further postulate an operation rapply which applies a record of functions to a record of (identically labeled) arguments, producing a record of results. However, this operation is treated as a primitive extension of their calculus. Chlipala [2010] includes a mapping operator on records in a calculus based on F𝜔, generalizing the lifting of Pottier and Rémy, and provides type-directed generation of record folding operations. He does not consider variants in his approach; moreover, it is not immediately clear that folds, and their duals for variants, would be sufficient to capture the open problems we identify.
Rose imposes Hindley-Milner constraints on typing; R𝜔 is based on System F𝜔 extended with qualified types, and so supports first-class polymorphism and general type operators. More significantly, the record and variant operations in Rose are all specific to concrete labels or sets of labels; R𝜔 introduces label-generic combinators. This section introduces R𝜔 by example.
Through the majority of this section, we will assume simple rows: labels are restricted to appear at most once in a given row, row combination is commutative (and so there is a single containment operator), and ρ1 ⊙ ρ2 ∼ ρ3 is unsatisfiable if ρ1 and ρ2 contain any of the same fields. This is the most common approach to typing records and variants and rules out many unexpected behaviors. At the end of the section (§3.5), we will discuss the specific challenges in extending our development to a non-commutative row theory.

First-Class Labels
In Rose, labels exist in types, but not in terms. The construction (ℓ ⊲ M) and destruction (M/ℓ) terms, which are overloaded for both singleton records and variants, are each essentially infinite families of terms, one for each label. To support label-generic operations, however, we will need to make labels first-class citizens in the term language as well as the type language.
To do so, we follow the approach used by Gaster and Jones [1996] and Sulzmann [1997]. We have added a singleton type constructor ⌊−⌋ to R𝜔: if ℓ is a label type, then ⌊ℓ⌋ is the corresponding singleton type. (For a label constant L, we also write L for the unique inhabitant of ⌊L⌋.)
First-class labels allow us to abstract several common patterns in Rose. For example, to select an individual field from a record, we first apply prj to project a singleton record and then use the singleton deconstruction operator. Rose introduced syntactic sugar for this pattern; in contrast, we can define the selection function directly in R𝜔 by:

  sel : ∀l : L, t : ★, z : R^★. {l ⊲ t} ≲ z ⇒ Πz → ⌊l⌋ → t
  sel = Λ(l : L) (t : ★) (z : R^★). λ(r : Πz) (g : ⌊l⌋). prj r / g

(Note that predicate abstraction and application remain implicit in R𝜔.) The type abstractions and annotations in this example, and most of the following, can be determined from the type signatures alone, so we will generally omit them:

  sel = λr g. prj r / g

Row type systems are frequently forced to distinguish between record extension (which adds new fields to existing records) and record update (which changes the value, and possibly type, of an existing field in a record), because their types impose different requirements on the input record type. Rémy [1989] introduces presence polymorphism, allowing a single term to play both roles at the cost of additional type system complexity. A single term captures both in R𝜔:

We treat ⊙ as a partial type constructor [Ingle et al. 2022; Jones and Diatchki 2008]: we write ρ1 ⊙ ρ2 as a type to denote a fresh type variable z under the constraint ρ1 ⊙ ρ2 ∼ z. Row z1 is either the empty row or the singleton row mapping l to t; row z2 is constrained to combine with {l ⊲ t}, so cannot contain label l. The input record, of type Π(z1 ⊙ z2), may contain field l (depending on the choice of z1); the output record definitely contains l, mapped to type u.
First-class labels are also useful for capturing programming patterns with variants. We can define a generic function for constructing variants:

The base case for the branching operator ▽ is a function that maps a singleton variant to a result. We can capture this pattern as well:

Representing Booleans as Bool = Σ{True ⊲ Π{}, False ⊲ Π{}} (syntactic sugar for Σ({True ⊲ Π{}} ⊙ {False ⊲ Π{}})), we could then define the usual conditional by:

Perhaps most surprisingly, while Rose lacked syntax or types for first-class labels, adding them does not require extending its semantics in any non-trivial way. The necessary information for the sel function, for example, is already captured entirely by the predicate {l ⊲ t} ≲ z. The value of type ⌊l⌋ provides no additional information, as you would expect for a value of a singleton type!
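To make the role of first-class labels concrete, here is a minimal untyped Python sketch of selection, variant construction, singleton case analysis, branching, and the conditional on the variant encoding of Booleans. It assumes the key-based model (labels as strings, records as dictionaries, variants as label and payload pairs); the names sel, con, case, branch, and ifte are chosen for this sketch, and tracking labels at runtime stands in for predicate evidence that the calculus resolves statically.

```python
# Untyped model: a label is a string, a record is a dict, and a variant
# is a (label, payload) pair. All names here are hypothetical.

def sel(r, l):
    """Select the l-labelled field of record r."""
    return r[l]

def con(l, x):
    """Build the variant with constructor l and contents x."""
    return (l, x)

def case(l, f):
    """Eliminator for the singleton variant with constructor l."""
    def eliminate(v):
        label, payload = v
        if label != l:
            raise ValueError(f"expected constructor {l!r}, got {label!r}")
        return f(payload)
    eliminate.labels = frozenset([l])  # labels this eliminator handles
    return eliminate

def branch(e1, e2):
    """Model of the branching combinator: combine two eliminators."""
    def eliminate(v):
        return e1(v) if v[0] in e1.labels else e2(v)
    eliminate.labels = e1.labels | e2.labels
    return eliminate

# Booleans as a variant with two constructors carrying the empty record.
TRUE, FALSE = con("True", ()), con("False", ())

def ifte(b, then_value, else_value):
    """Conditional over the variant encoding of Booleans."""
    return branch(case("True", lambda _: then_value),
                  case("False", lambda _: else_value))(b)

print(sel({"x": 3, "y": 4}, "x"))
print(ifte(FALSE, "yes", "no"))
```

Unlike a lazy or typed setting, this ifte evaluates both alternatives before branching; it is meant only to show how the pieces compose.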

The Duality of Records and Variants
We begin our exploration of generic programming over rows with the duality between records and variants. This duality is foundational to row type systems in general, and to Rose in particular. Its introduction rule for variants and the elimination rule for records are clearly dual, and the rules for concatenating variant eliminators and concatenating records are nearly as evidently dual. (To make the duality more explicit, one could have defined a rule for combining record constructors: from τ → Πz1 and τ → Πz2, obtain τ → Π(z1 ⊙ z2). But this seems to obtain theoretical elegance at the cost of usability.) In fact, we can witness this duality in Rose, but only for concrete rows. For example, we can define the following operations for the Boolean type:

The type Cases_B abbreviates operations over the constructors of the Boolean type. The reify_B function transforms a function that scrutinizes a Boolean value into a record of functions, one for the True case and one for the False case; dually, the reflect_B function uses such a record of functions to scrutinize a Boolean value. (We write () for the unique value of the Π{} type.) Knowing the constructors of the Boolean type is essential to writing this example; while such functions exist for any variant type in Rose, their definition would have to be repeated for each type.
In R𝜔, we can write generic versions of these operators, applicable to any variant type and the corresponding record of cases, as shown in Figure 3. The types of reify and reflect rely on lifting operations on types to operations on rows: if z is the row of types ℓ_i ⊲ τ_i, then z → t is the row of types ℓ_i ⊲ τ_i → t. In reify_B and reflect_B, we relied on concrete constructors in two places: when deconstructing a Boolean value in reflect_B, and when building the record of constructors in reify_B. R𝜔 provides label-generic versions of these two operations, one for analyzing variants and a dual operator for synthesizing records. Here is our first attempt at their typing rules:

We write R^κ for the kind of rows over types of kind κ. To avoid a sea of metavariables, we combine kinding and typing assertions in Γ; the judgment Γ ⊢ ρ : R^★ is a kinding assertion on ρ, and Γ ⊢ ana M : Σρ → τ is a typing assertion on ana M.
In ana M, the body M is a label-generic version of the cases in a branch expression: given a label l, a type u, and evidence that {l ⊲ u} appears in ρ, M consumes a single case ((a witness for) the constructor, and its contents) and produces a result of type τ. If M can do so for any constructor appearing in ρ, then ana M can consume a value of Σρ to produce a result of type τ. We use ana in implementing reflect. Given the constructor label l and contents u of an arbitrary variant value w, we invoke the l-labeled entry from the record d with argument u. Again, lifting plays a central role: from {l ⊲ u} ≲ z, we can conclude that {l ⊲ u → t} ≲ z → t, and so sel d l is a u → t function.
In syn M, the body M is a label-generic version of the components of a concatenation expression: given a label l, a type u, and evidence that {l ⊲ u} appears in ρ, M produces a value of type u. If M can do so for each label appearing in ρ, then syn M can produce a record of type Πρ. We use syn in implementing reify. In the body, we have access to f : Σz → t. We build a new function u → t, which wraps its argument in constructor l and then invokes f. Lifting plays a similar role to its role in reflect: as {l ⊲ u} ≲ z, the result type includes l ⊲ u → t.
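The behavior of ana and syn described above, and their use in reflect and reify, can be sketched in untyped Python. This is a minimal sketch under the key-based model (variants as label and payload pairs, records as dictionaries), not the label-free semantics of this paper. One assumption deserves emphasis: because the sketch has no types, syn must be given the labels of the row explicitly, whereas in the calculus the row is determined by the type.

```python
# Untyped model: variants are (label, payload) pairs, records are dicts.
# The explicit `labels` argument of syn is an assumption of this sketch.

def ana(f, w):
    """Analyze variant w: call f with its constructor label and contents."""
    label, payload = w
    return f(label, payload)

def syn(labels, f):
    """Synthesize a record: one field per label, computed by f."""
    return {l: f(l) for l in labels}

def reflect(d, w):
    """Use a record of case functions d to scrutinize the variant w."""
    return ana(lambda l, u: d[l](u), w)

def reify(labels, f):
    """Turn a function on variants into a record of per-constructor cases."""
    return syn(labels, lambda l: (lambda u: f((l, u))))

# Booleans as a two-constructor variant carrying the empty tuple.
BOOL_LABELS = ("True", "False")

def to_int(b):
    return 1 if b[0] == "True" else 0

cases = reify(BOOL_LABELS, to_int)
print(sorted(cases))
print(reflect(cases, ("False", ())))
```

In this model reflect(reify(labels, f), w) agrees with f(w) for every variant w whose label is in labels, which is the round trip the duality suggests.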

Transformations
Next, we consider generic transformations on extensible types.
Type-preserving maps. We begin with type-preserving mappings, such as reversing each field of a record of lists. Here is the version for records; the version for variants is nearly identical.
The mapped function is label-generic: for any label l and type u appearing in z, the function transforms the old u value into a new u value.Given such a function f and a record r, we synthesize a new record in which each field l contains the result of f applied to the old field and its label.
Type-transforming maps. The far more interesting case is type-transforming mappings, such as transforming a record of lists into a record of their lengths. The challenge here is not defining the term (in fact, it will turn out to appear identical to the previous term), but rather to find an appropriately expressive type. Suppose that we have type constructors List : ★ → ★ and Int : ★, such that the length function has type ∀t : ★. List t → Int. We might then imagine that the pointwise length function on records would have a type like

  ∀z : R^★. Π(List z) → Π(const Int z)

where const : ★ → ★ → ★ is the expected constant operator on types. In the input type, we lift the type constructor List over the row z; this allows us to capture the idea of a row of list types. In the output type, we lift const Int over z; this replaces each type in z by Int. Instantiating this type with the concrete row {a ⊲ Bool, b ⊲ Char} would give

  Π{a ⊲ List Bool, b ⊲ List Char} → Π{a ⊲ Int, b ⊲ Int}

Of course, we cannot inhabit this type with a term based on map′_Π, as the input and output types are not identical. More seriously, however, it is not clear how we could inhabit it with any term based on our previous typing rule for syn. The only types in the output row are Int, and it is not clear how we could reconstruct an application of length to a field of the input row given only the information that l ⊲ Int appears in the output row.
Our solution is to generalize the types of ana and syn to incorporate a type operator φ:

Fig. 4. Transforming records and variants
Differences from the previous rules are shaded. We now allow ρ to range over rows of arbitrary kind κ (we will make use of this in capturing functoriality later in the section) and require that φ be a type operator mapping from κ to ★. We then uniformly introduce φ in the uses of ρ, both in typing results of ana and syn and in typing their body. Rules (ana1) and (syn1) are special cases of these rules, and going forward we will write ana and syn for ana_(λt. t) and syn_(λt. t), respectively.
With the generalized typing rules for ana and syn, we can now define kind-indexed families of type-transforming maps for records and variants, shown in Figure 4. We write X^(κ) for a family of X's indexed by kind κ. We would expect languages based on R𝜔 to also include kind-polymorphism; we have omitted it from our formalization simply to avoid an orthogonal source of complexity. The type Iter^(κ) f g z captures iterated functions over row z; type operator f is used to construct the input type, and g is used to construct the output type. We make the type abstractions in map^(κ)_Π and map^(κ)_Σ explicit, as we will need to refer to the abstracted types in the calls to syn and ana. The implementation of map^(κ)_Π is almost identical to the implementation of map′_Π. The crucial difference is in providing the operator g to syn. This means that the body of syn has the type

  ∀l : L, u : κ. {l ⊲ u} ≲ z ⇒ ⌊l⌋ → g u

That is to say: knowing that l ⊲ u appears in z, we must produce a value of type g u. The assumption is sufficient to conclude that l ⊲ f u appears in f z, and so sel r l is a suitable input to the iterated function i : Iter^(κ) f g z.
The implementation of map^(κ)_Σ is the expected dual of the implementation of map^(κ)_Π. We annotate ana with the input-side operator f, so its body has the type

  ∀l : L, u : κ. {l ⊲ u} ≲ z ⇒ ⌊l⌋ → f u → Σ(g z)

Here we are immediately sure that the value x is a suitable input for i; from {l ⊲ u} ≲ z we have {l ⊲ g u} ≲ g z, and so con l (i l x) can be of type Σ(g z).
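The two maps can be sketched in untyped Python on top of hypothetical ana and syn helpers. The sketch assumes the key-based model (records as dictionaries, variants as label and payload pairs) and reads the labels of the row off the input record, where the calculus would obtain them from the type; since Python is untyped, the distinction between type-preserving and type-transforming maps is invisible here, matching the observation that the two terms look identical.

```python
# Untyped model; ana, syn, map_pi and map_sigma are names for this sketch.

def ana(f, w):
    """Analyze variant w: call f with its constructor label and contents."""
    label, payload = w
    return f(label, payload)

def syn(labels, f):
    """Synthesize a record with one field per label, computed by f."""
    return {l: f(l) for l in labels}

def map_pi(i, r):
    """Apply the label-generic function i to every field of record r."""
    return syn(r.keys(), lambda l: i(l, r[l]))

def map_sigma(i, w):
    """Apply i to the contents of variant w, keeping its constructor."""
    return ana(lambda l, x: (l, i(l, x)), w)

def length(_label, xs):
    """Iterated function: ignores the label and measures the list."""
    return len(xs)

lists = {"a": [True, False], "b": ["x"]}
print(map_pi(length, lists))
print(map_sigma(length, ("b", ["x", "y", "z"])))
```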
Pointwise application. Pottier and Rémy [2005] describe a pointwise-application operator for records, which maps a record of functions and record of arguments to a record of results. We can describe a similar family of operators in R𝜔, as follows:

The type Xf^(κ) describes the individual transformation functions; as we expect to have a record of these functions, suited to their record of arguments, we do not have to describe them in a label-generic way. The rapply^(κ) function then takes a record of such transformers (note that we lift Xf^(κ) f g from an operator on κ to an operator on R^κ) and a record of arguments, and produces a record of results. Its implementation is a direct application of map_Π, in which the body need only look up the appropriately labeled function in the input d.
Our rapply is not quite the same as Pottier and Rémy's: where we rely on type applications f z and g z based on a single row, they define a pointwise lifting of the function constructor to rows z1 → z2. However: their rows are infinite, with a default type for all labels not mentioned in the row; correspondingly, their records are infinite, with a default value for all labels not mentioned in building the record. This means that z1 → z2 can always be well-defined, by using z1's default type as the domain for any labels not mentioned in z1 and z2's default type as the codomain for any labels not mentioned in z2. With finite rows, we do not have the same luxury. Should we interpret z1 → z2 as undefined if the label sets of z1 and z2 are not identical? Or restrict it to the intersection of those label sets? The former would introduce additional partiality in the type of rapply, while the latter would seem to make rapply impossible to define. Without a more compelling application of this additional flexibility, we have limited ourselves to lifting type operators over rows.
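As a small illustration of pointwise application as an instance of the record map, consider the following untyped Python sketch. It assumes records are dictionaries; map_pi and rapply are names chosen for the sketch, and the requirement that d and r share the same labels, enforced by types in the calculus, is simply assumed here.

```python
# Untyped model: records are dicts from label strings to values.

def map_pi(i, r):
    """Apply the label-generic function i to every field of record r."""
    return {l: i(l, r[l]) for l in r}

def rapply(d, r):
    """Apply each function in record d to the same-labelled field of r."""
    return map_pi(lambda l, x: d[l](x), r)

transformers = {"a": lambda n: n + 1, "b": len}
arguments = {"a": 41, "b": [1, 2, 3]}
print(rapply(transformers, arguments))
```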

Lifting functoriality.
A more substantial application of the map functions is in lifting functoriality, as realized in languages like Haskell, to records and variants. The idea is that if we have a row of type constructors, where each constructor in the row has a suitable mapping operator, then we can derive mapping operators for record and variant type constructors built from that row. Our implementation is shown in Figure 5.
We begin by defining the Functor type operator.This should be read as capturing the evidence that a type operator is a functor: Functor List, for example, is ∀t u.(t → u) → List t → List u.
We turn to the types of fmap_Σ and fmap_Π. We abstract over a row z of type constructors. Lifting Functor over z gives a row of types, so Π(Functor z) is a record of evidence that each constructor in z is functorial. Now, we want to make a claim about record and variant types built from z. To do so, we generalize Π and Σ to families of type constructors, where for z : R^(κ1 → κ2) we write Σz for the type constructor λt. Σ(z t), and similarly for Π. This generalization is not necessary (we could write the constructors out), but the abbreviation seems intuitive, and makes the types of fmap_Σ and fmap_Π natural.
Finally, we can implement fmap_Σ and fmap_Π directly using map_Σ and map_Π; in each case, the mapped function simply looks up the appropriate evidence in d, then applies it to lift f over x.
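The idea of looking up functor evidence by label can be sketched informally in Python. This is only an untyped approximation: records are dictionaries, variants are (label, value) pairs, and the names fmap_record, fmap_variant, fmap_list, and fmap_optional are hypothetical helpers introduced for the illustration.

```python
def fmap_record(evidence, f, record):
    """Lift f over every field, using the functor evidence stored under the same label."""
    return {label: evidence[label](f, value) for label, value in record.items()}


def fmap_variant(evidence, f, variant):
    """Lift f over the single constructor that is present, keeping its tag."""
    label, value = variant
    return (label, evidence[label](f, value))


def fmap_list(f, xs):
    return [f(x) for x in xs]


def fmap_optional(f, x):
    return None if x is None else f(x)


# A record of evidence, one mapping operator per type constructor in the row.
evidence = {"items": fmap_list, "extra": fmap_optional}

doubled = fmap_record(evidence, lambda n: 2 * n, {"items": [1, 2, 3], "extra": 5})
tagged = fmap_variant(evidence, str, ("items", [1, 2]))
print(doubled, tagged)
```

The same record of evidence serves both the record and the variant case, mirroring the way both fmap operators take the evidence record d as their first argument.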

Comparing Records and Variants
We continue exploring component-wise operations on variants. Our goal is to compare values of two variant types, given that we can compare the values at each of their constructors. Our intended code is shown in Figure 6. We begin by defining the type operator Eq, which captures equality comparisons (actually, any binary comparison); given a row z, Π(Eq z) is a record of comparison operators for each type in z. To compare two values v, w of type Σz, we begin by analyzing w. We can then fall back on the branching combinator ▽: if v is also built with constructor l, we can compare their contents using the l field of d; otherwise, the two are definitely unequal.
Figure 6:
Eq : ★ → ★
Eq = λt. t → t → Bool
eq_Σ : ∀z : R^★. Π(Eq z) → Eq (Σz)
eq_Σ = λd v w. ana (λl y. (case l (λx. sel d l x y) ▽ const False) v) w
The only difficulty with this implementation is that it does not type. Consider the branch expression in the body of ana. As v : Σz, we must show that the two branches combine to give z. However, all we know is that {l ⊲ u} ≲ z; while logically this implies that there must be a "remainder" of z less {l ⊲ u}, we do not have access to it.
Our solution is to update the typing rules for ana and syn, generalizing the type of the body.
The changed components of the rules are shaded. Instead of providing evidence that {l ⊲ u} is contained in the input row, we now decompose the input row into {l ⊲ u} and a remainder row y. The previous iteration of the rule is a special case of this one. With this rule, our intended implementation of eq_Σ is well-typed.
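The behaviour intended for variant equality can be approximated in Python as follows. This is a hedged sketch rather than the paper's definition: variants are modelled as (label, value) pairs, the record of comparators as a dictionary, and the explicit tag comparison stands in for the branch over the singleton row and its remainder. The name eq_variant is an assumption of the sketch.

```python
def eq_variant(comparators, v, w):
    """Compare two variants given one comparison operator per constructor."""
    label_w, y = w  # analyse w first, as in the intended definition
    label_v, x = v
    if label_v == label_w:
        # v is built with the same constructor: use that constructor's comparator.
        return comparators[label_w](x, y)
    # v lies in the remainder of the row: the two are definitely unequal.
    return False


comparators = {
    "num": lambda a, b: a == b,
    "text": lambda a, b: a.lower() == b.lower(),
}
print(eq_variant(comparators, ("text", "Row"), ("text", "ROW")))
```

The two branches of the conditional correspond to the two components of the decomposition: the singleton row for the constructor of w, and the remainder y.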
Unfortunately, the solution for variants does not obviously dualize to give a comparison operator for records.Again, assume we have comparators for each field.The operators we have discussed so far would allow us to build a record of Booleans.However, for the records to be equal, we must then determine whether those Booleans are all true, and (without knowing the specific fields) we have no tools to do so.
To capture functions like these, we introduce a folding operation over records, which takes three terms M1, M2, and M3 together with an input record N. The term M1 is a label-generic mapping from the fields of the input record N to the result type; M2 combines values of the result type, and M3 is an identity for M2, used for folding the empty record. Given this folding operator, we can define equality comparison for records, as shown in Figure 7.
Introducing this operator immediately raises several questions.For example: in what order are the mapped fields passed to the folding function?Is the identity included once?At all?And so forth.Our conclusion is that the values passed to fold must follow the same rules as the underlying row theory.Following Morris and McKinna [2019], row theories must be associative and have the empty row as their unit; thus, M 2 should be associative and have M 3 as its unit.For a commutative row theory (as we have been assuming), M 2 should be commutative as well.For a non-commutative theory, on the other hand, fold would pass values to M 2 consistent with the ordering of fields in the row.And so forth.Absent these constraints, the exact behavior of fold ought to be unspecified.
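A rough Python analogue of folding over a record, and of record equality built from it, is sketched below. It is an illustration under assumptions: records are dictionaries, and fold_record and eq_record are hypothetical names. The combining function is conjunction, which is associative and commutative with True as its unit, so the traversal order of the dictionary cannot affect the result.

```python
from functools import reduce


def fold_record(inject, combine, unit, record):
    """Map every field to the result type with inject, then combine the results."""
    mapped = (inject(label, value) for label, value in record.items())
    return reduce(combine, mapped, unit)


def eq_record(comparators, left, right):
    """Compare two records field by field, then fold the Booleans with 'and'."""
    agreements = {
        label: comparators[label](left[label], right[label]) for label in comparators
    }
    return fold_record(lambda _label, same: same, lambda a, b: a and b, True, agreements)


comparators = {"x": lambda a, b: a == b, "y": lambda a, b: abs(a - b) < 1e-9}
print(eq_record(comparators, {"x": 1, "y": 0.5}, {"x": 1, "y": 0.5}))
```

Folding the empty record returns the unit unchanged, which matches the role of M3 as the identity for M2.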
The dual operator for variants would be an unfold, generating a variant by unrolling a starting value.Introducing such an operator would raise all the same problems as we have for fold.As we have found no compelling uses for unfolding variants, we do not consider this operator further.

Generic Programming for Non-Commutative Rows
Rω encompasses multiple models of rows. Our discussion so far has assumed simple rows, a commutative row theory which seems to be the most natural approach to typing records and variants. However, other theories may be more suited to particular applications. For example, scoped rows [Berthomieu and le Moniès de Sagazan 1995; Leijen 2005], a non-commutative row theory, are particularly well suited to capturing algebraic effects and handlers. A language that includes both extensible data types and algebraic effects, then, might want to include both simple rows (for data types) and scoped rows (for effects). Alternatively, a language could support encoding algebraic effects via extensible data types, such as by using free monads. But then, to capture effects naturally, the language could support extensible data types over both simple and scoped rows!
The challenge in adapting our account to non-commutative row theories is that we no longer have a single idea of containment. The same label ℓ, or indeed the same labeled type ℓ ⊲ τ, may appear multiple times in a single row. To support non-commutative row theories, Rose introduced two containment operators: the "left" version, ρ1 ≲_L ρ2, which holds if there is a ρ′ such that ρ1 ⊙ ρ′ ∼ ρ2, and the "right" version, ρ1 ≲_R ρ2, which holds if there is a ρ′ such that ρ′ ⊙ ρ1 ∼ ρ2.
Unfortunately, neither of these is a drop-in replacement for the predicates in our generic operators, as individual entries need not be at either the beginning or end of the input row. We can apply a similar idea, by replacing the constraint {l ⊲ u} ⊙ y ∼ ρ with y1 ⊙ {l ⊲ u} ⊙ y2 ∼ ρ. (Note that we cannot define a corresponding "containment" predicate: l, u, and ρ do not uniquely determine y1 and y2.) As we only have a binary row combination predicate, we express this three-way combination using two binary combination constraints. The resulting rules generalize those previously presented: in a commutative theory, if y1 ⊙ {l ⊲ u} ⊙ y2 ∼ ρ, then there is a y such that {l ⊲ u} ⊙ y ∼ ρ, given by y1 ⊙ y2 ∼ y, and conversely.
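The non-uniqueness of y1 and y2 can be seen in a small Python sketch, which models a non-commutative row as a list of (label, type name) pairs. The function name decompositions and the example row are assumptions introduced for illustration.

```python
def decompositions(row, entry):
    """All pairs (y1, y2) such that y1 + [entry] + y2 == row."""
    return [
        (row[:i], row[i + 1:])
        for i, candidate in enumerate(row)
        if candidate == entry
    ]


# A scoped-style row in which the same labeled type occurs twice.
row = [("x", "Int"), ("y", "Bool"), ("x", "Int")]
found = decompositions(row, ("x", "Int"))
for y1, y2 in found:
    print(y1, y2)
```

Here the label, the type, and the whole row are fixed, yet two different pairs of surrounding rows satisfy the three-way combination, which is why a single containment predicate would not carry enough information.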

THE Rω CALCULUS
This section provides a formal description of the syntax and type system of Rω. As in Morris and McKinna [2019], Rω is parameterized by a row theory, giving the intended interpretation of rows. A row theory T is a triple of relations, where:
• ⊢_T is a kinding relation, capturing when rows are well-formed;
• ≡_T is an equivalence relation, identifying rows; and,
• the third component is an entailment relation, giving the meaning of the ≲ and ⊙ predicates.
We write Rω(T) for Rω instantiated with theory T. Our descriptions of Rω syntax (§4.1), types (§4.2), and terms (§4.3) are all given generically over an arbitrary row theory T. We then provide three concrete row theories. The minimal row theory (§4.4) captures labeled rows, but makes no commitment to when (non-singleton) rows are well-formed. The examples in the previous section are all well-typed given only the minimal row theory. We then describe the simple row theory (§4.5), which captures commutative Rémy-style rows, and the scoped row theory (§4.6), which captures non-commutative Leijen-style rows. We develop the expected metatheory in the following section, when we discuss our denotational interpretation of Rω in Agda.

Syntax
The syntax of Rω(T) is given in Figure 8. Kinds include types ★, labels L, rows R^κ of kind κ, and type constructors κ → κ. Not all possible kinds are currently used in Rω. For example: while nothing prevents describing a type of kind R^L (i.e., a row of labels), we have no primitives that operate on such a type, and indeed suspect that such a type would be very difficult to use (§7).
Predicates include containment ρ ≲_d ρ and combination ρ ⊙ ρ ∼ ρ. To account for non-commutative row theories, we include directed variants of the containment predicate; intuitively, if ρ1 ⊙ ρ2 ∼ ρ3, then ρ1 ≲_L ρ3 and ρ2 ≲_R ρ3. Given a commutative row theory, these predicates are equivalent. In a practical language based on Rω, we anticipate that the predicate language would be extended with other forms of predicates, such as type classes [Wadler and Blott 1989], linearity constraints [Gan et al. 2014; Morris 2016], or general equality constraints [Peyton Jones et al. 2006].
We let τ, φ, ρ, and ξ range over types; when possible, we use φ where we expect a type constructor, ρ where we expect a row type, and ξ where we expect a label. Standard type constructs include variables, constants (here only the function arrow), quantifiers, abstractions, and applications. Predicates appear in qualified types π ⇒ τ. To incorporate labeling, we include labels (ℓ) themselves, singletons ⌊ξ⌋, and labeled types ξ ⊲ τ. Following Rose, we treat labeled types and row types independently. Finally, we include rows {τ1, ..., τn} (including the empty row), records, and variants. Well-formedness of concrete rows is delegated to the row theory T. We let M, N range over terms. Standard terms include variables, type and term abstractions, and applications. Introduction and elimination of qualifiers is implicit. To support labeling terms, we include label (singleton) constants ℓ and terms to label (M ⊲ M) and unlabel (M / M). As the singleton record and variant types are isomorphic to their underlying single field or constructor type, we do not provide separate syntax to construct singleton records and variants. Finally, we include the (directed) variant and record operators of Rose, and the label-generic operators new to Rω(T).
Environments track three kinds of assumptions: kindings of type variables α : κ, typings of term variables x : τ, and predicates (as qualified type elimination is implicit, we do not need to name predicate assumptions). We combine these assumptions into a single context Γ simply to avoid a superfluity of (mostly unchanging) metavariables.
The kinding rules are mostly standard. The rule for rows delegates well-formedness of rows to the row theory T. The rules for Π and Σ capture the formation of record and variant types, lifted to arbitrary kinds κ. In the functor example (Figure 5, §3.3), we had a row of type constructors z : R^(★→★). Applying the rule for Σ, we can conclude that Σz : ★ → ★, and so that (Σz) t : ★.
The two lifting rules license the lifting that has played a prominent role in our examples.
One might argue that these are simply syntactic abbreviations that complicate the reading of types, and that we should instead follow the lead of Featherweight Ur [Chlipala 2010] and use an explicit map operation to lift types over rows. However, in developing our examples, we found that the extra weight introduced by a more explicit approach obscured the meaning of the terms. For example, contrast our types for reify and reflect (Figure 3, §3.2) with the more explicit
reify : ∀z : R^★, t : ★. (Σz → t) → Π(map (λ(s : ★). s → t) z)
reflect : ∀z : R^★, t : ★. Π(map (λ(s : ★). s → t) z) → (Σz → t)
Or, similarly, contrast our type for fmap_Σ (Figure 5, §3.3) with its more explicit counterpart. But in the end, this is a matter of taste; restricting Σ and Π to arguments of kind R^★ and making row mapping explicit would not fundamentally restrict the expressiveness of Rω.
The type equivalence rules are shown in Figure 10. The first three lines are standard. The two lifting rules realize the promise made by the corresponding kinding rules, lifting single type operators or type arguments to rows. The rule for rows delegates equivalence of row types to the row theory T. A further rule gives Π and Σ their intended meaning at higher kinds. Finally, the singleton rule captures the isomorphism between singleton records, singleton variants, and their underlying field (or constructor) type. Again, this latter rule is not integral to Rω; a more explicit version, with separate terms to introduce and eliminate singleton records and variants, would be just as expressive.

Terms
Figure 11 gives the typing rules for Rω. We have already developed its novelties in the previous section, but will briefly highlight the remaining features of the type system. Lines 1-3 contain a standard treatment of functions, qualified types, and quantified types. A dedicated rule introduces label singleton constants, which can then be used to label (⊲I) or unlabel (⊲E) terms. The conversion rule (≡) can be used (among other things) to move between labeled terms and singleton records or variants. The rules for projection, concatenation, injection, and branching are identical to the corresponding rules for Rose. Finally, the rules for analyzing variants and synthesizing and folding rows are discussed in the previous section.

Minimal Rows
Figure 12 gives the minimal row theory M.
The minimal row theory only includes singleton rows, and so Rω(M) can express very few practical uses of extensible data types. However, the minimal row theory captures the fundamental properties that all (labeled) row theories share. Our motivating examples (§3) all type in Rω(M).

Simple Rows
Figure 13 gives the simple row theory S. The simple theory is a commutative theory, in which labels may appear at most once in any row; it captures the most common approach to row types, originally introduced by Rémy [1989]. The challenge to expressing the simple row theory in Rω arises from first-class labels. As noted by Leijen [2004], among others, first-class labels can introduce surprising corner cases. Consider a type like Π{ξ1 ⊲ Int, ξ2 ⊲ Int}, where ξ1 and ξ2 are types of kind L. This type only makes sense if ξ1 and ξ2 are guaranteed to be different labels. This restriction is captured in the kinding rule for rows: each pair of labels in a row must be different concrete labels. Of course, this condition is satisfied trivially for the empty and singleton rows. Nor does this requirement limit the use of first-class labels, as longer rows may always be expressed as concatenations of singleton rows; indeed, such an elaboration could be done automatically, treating rows as partial type constructors [Ingle et al. 2022; Jones and Diatchki 2008; Jones et al. 2020].
The entailment relation extends that of the minimal row theory with rules for concrete rows. In each case, the essential evidence is a mapping between rows in the predicate; as we will see in the next section, these mappings are exactly the information needed to implement the record and variant operations. There are more generic entailment rules that could be useful in a practical realization of Rω(S). For example, combination gives a least upper bound for the containment relation. Nevertheless, the rules we give here capture the essential properties of the simple row theory; we regard further extension of the entailment relation as an orthogonal concern.

Scoped Rows
Figure 14 gives the scoped row theory C.
The scoped row theory is a non-commutative theory, in which the left-most instance of a given label is preferred; it was introduced by Berthomieu and le Moniès de Sagazan [1995] and independently by Leijen [2005]. Because labels can be repeated, there is no difficulty in the kinding rule for rows. However, more care must be taken in the entailment relation: we want to allow {y ⊲ Int} ≲_L {x ⊲ Int, y ⊲ Int}, as there is no harm in permuting distinct labels, while excluding {x ⊲ Bool} ≲_L {x ⊲ Int, x ⊲ Bool}, as this permutes identical labels. This is captured by the side condition on the permutations in each of the entailment rules, which requires that swapped labels be provably distinct.
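The side condition on permutations can be illustrated with a short Python check. This sketch assumes concrete labels represented as strings and a permutation given as a list p in which p[i] is the new position of entry i; the function name permutation_allowed is hypothetical.

```python
def permutation_allowed(labels, p):
    """A permutation is allowed only if every pair it swaps carries distinct labels.

    For i < j, the pair is swapped when p[i] > p[j].
    """
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if p[i] > p[j] and labels[i] == labels[j]:
                return False
    return True


# Moving y in front of x swaps distinct labels, so it is harmless.
print(permutation_allowed(["x", "y"], [1, 0]))
# Swapping two entries that are both labeled x would change which one is left-most.
print(permutation_allowed(["x", "x"], [1, 0]))
```

The identity permutation swaps nothing and so is always allowed, even when labels repeat.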

INTERPRETING (STRATIFIED) Rω IN AGDA
We have two goals in defining semantics for Rω. The primary goal, of course, is to demonstrate the soundness of Rω's type system. The secondary goal is to show that Rω need not introduce runtime dependence on or manipulation of labels compared to extensible data types without label-generic operators.
To accomplish both goals, we embedded R (M) typings in the Agda type theory, and then defined a denotational interpretation of those typings in Agda itself, interpreting the R function space as Agda functions, R records and variants as dependent products and sums with finite natural indices, evidence for containment and combination as maps between finite naturals, and so forth.In particular, labels are interpreted as the unit type, and the indexing of products and sums does not depend on the identities of labels in the source derivations.
While our mechanization of the entailment relation is limited to the minimal row theory, our denotations are not correspondingly limited to singleton rows, records, and variants. To the contrary, because our denotations do not depend on labels directly, they are sufficient for all the row theories discussed in this paper. Concretely: while the minimal theory provides no row z that satisfies the constraint x ⊲ Int ⊙ y ⊲ Int ∼ z, our Agda denotation includes both suitable instantiations for z and the evidence that they satisfy the constraint. Our claim of type soundness is semantic in nature and relies on the totality of Agda as a type theory: we show that the denotations of well-kinded types are in the denotations of their kinds, that the denotations of well-typed terms are in the denotations of their types, and so forth. Because our denotations are in a typed theory, we do not have a wrong value (as in Milner [1978]); instead, we extend the guarantees provided by Agda's type system to Rω.
Fig. 14. Scoped rows: kinding, entailment, and equivalence.
This section gives a high-level overview of our Agda development; interested readers are referred to the full development [Hubers and Morris 2023].There are two significant threads.First: our specification of R so far is impredicative, while Agda is a predicative type theory.We address this by stratifying R , preserving its practical expressiveness while being suitable for embedding in Agda.Second: we need Agda definitions of the R primitives.With these out of the way, the remainder of the development was pleasingly straightforward, and demonstrates soundness of kinding, typing, and equivalence.

Stratifying R
Our first challenge is developing a predicative version of Rω. Following Dunfield and Krishnaswami [2013], we could identify the monotypes of Rω (those types without quantifiers), and limit quantifier instantiation to monotypes. However, this approach would unacceptably compromise the expressiveness of Rω. Consider a record with a polymorphic return field, accessed through a selector function: to type return, we have to instantiate sel with the type of the return field, ∀t : ★. t → m t, which is not a monotype. Instead, we follow the approach of System SF 2 [Leivant 1991], ensuring predicativity by stratifying the Rω type system. Each type in stratified Rω is associated with a level, and kinds are indexed by the level of the types they classify. The base kind ★ is now annotated with a level. Labels are types at any level, and the types of rows and type constructors are determined by their component types. We also consider the union of the kinds over all levels i ∈ ℕ. The stratified kinding relation is shown in Figure 15. Overall, stratification has a relatively minor impact. The cumulativity rule (≤) includes earlier levels in later levels; our mechanization incorporates this rule into the other rules. The rules for qualified types (⇒) and quantified types (∀) ensure that the result type is at least one level higher than the level of the quantified type or predicate. The remaining rules are unchanged. However, note that we do now require that quantification and type abstraction explicitly mention the level of the quantified or argument type. In our mechanization, in turn, we can abstract derivations over the base level. Figure 15 also includes a stratified version of the predicate formation rule, tracking the level of types that appear in the predicate.
In mechanizing Rω kinds and types, we have separated the environment Γ into three: a kinding environment Δ, a predicate environment Φ, and a typing environment Γ. We use an intrinsically-kinded representation of types, and define interpretation functions for kinds, kinding environments, and types. These definitions are unsurprising. For example: the kind ★_i is interpreted as Set i; kinding environments are interpreted as tuples of types; the type ∀α : κ. τ in kinding environment H is interpreted as a dependent function (X : ⟦κ⟧) → ⟦τ⟧ (H, X). Label singleton types are all interpreted as ⊤ (the unit type), buttressing our claim that Rω can be implemented without runtime manipulation or comparison of labels. The interpretation of row types, records, and variants is discussed next.
The interpretation of types gives a constructive proof of the following claim:
Theorem. The kind system of Rω is sound.
Of course, this is only convincing if the interpretations themselves are non-trivial. Here we rely on the underlying type theory: for example, as we interpret the kind of type constructors κ1 → κ2 as Agda functions ⟦κ1⟧ → ⟦κ2⟧, we can be confident that our interpretations of types of that kind are meaningful. For the full details, please see the Agda development [Hubers and Morris 2023].

Rows and Indices
We intend our interpretation of records and variants to be both type-safe and aligned with the intuition of those types. That is, a record should be a sequence of its field values, and a variant should be a single tagged value. We begin with rows themselves. Intuitively, a row is a sequence of types. Our encoding in Agda is almost that direct: a row at level i is a dependent pair of its length n and a map from finite indices less than n to types at level i. We define record and variant constructors (at type ★_i) as dependent functions on rows. (We rely on some overloading to avoid tedious qualified names: Σ followed by a variable binding is the dependent sum constructor; followed by a row, it is the variant constructor.) In each case, we pattern match on the input row, obtaining its length n and a mapping P from indices to types. A variant is the expected tagged value, pairing a tag less than n with a value of the type indexed by the tag. A record is another dependent function: given an index into the record, it returns a value of the type at that index.
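The shape of this encoding can be mimicked, without any of its static guarantees, in Python. The sketch below is an untyped approximation and not the Agda definitions: a row is a pair of a length and an index-to-type function, a record is a function from indices to values, and a variant is a (tag, value) pair. The names make_row, is_record, and is_variant are assumptions of the sketch, and runtime isinstance checks stand in for dependent typing.

```python
def make_row(types):
    """A row: its length n together with a map from indices below n to types."""
    types = tuple(types)
    return (len(types), lambda i: types[i])


def is_record(row, record):
    """A record maps every index of the row to a value of the type at that index."""
    n, type_at = row
    return all(isinstance(record(i), type_at(i)) for i in range(n))


def is_variant(row, variant):
    """A variant pairs a tag below n with a value of the type indexed by the tag."""
    n, type_at = row
    tag, value = variant
    return 0 <= tag < n and isinstance(value, type_at(tag))


row = make_row([int, str])
point = lambda i: (3, "three")[i]  # a record over the row above
print(is_record(row, point), is_variant(row, (1, "x")))
```

Note that no labels appear anywhere in the sketch: only positions matter, as in the encoding described above.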
We have made one simplification relative to R : we implement records and variants only at the base kind, and express the type constructor variants using type functions.This does not reflect a fundamental limitation in our embedding, but simply a choice made for expediency in development.
These definitions emulate our intuition of records and variants.For variants, we are quite close: erasing the types leaves a pair of a tag and a value, just as you might expect to represent a value of a traditional variant type.For records, we are further away: while we emulate accessing fields of a record by offset, the practical construction of records is not emulated by our encoding.Nevertheless, we hope that these encodings demonstrate the potential of a real implementation, even if they do not claim to address all the problems that such an implementation would encounter.
Note that these types have none of the properties we have assumed for the corresponding types in R : there are no traces of labels to be found, and order is very much significant in determining the meaning of rows, records, and variants.The mapping between rows in the source language and rows in Agda will be found in the concrete evidence for the row predicates, discussed next.

Containment and Combination
The next piece of our encoding is the evidence for the containment and combination predicates. The stratification of the entailment relation is entirely unsurprising. As usual in qualified types, evidence for predicates plays a central role in interpreting the overloaded operators. Our goal is to combine the intuition of a practical realization of Rω with dependent types to ensure type safety.
Intuitively, containment maps indices in the smaller row to indices in the larger row.
(We have omitted some straightforward but tedious bookkeeping to do with levels.)The evidence for containment is a dependent function over indices in the smaller row, associating each with both an index in the larger row and a proof that the associated types are the same.Implementing record projection and variant injection in terms of this evidence is simple: the former simply precomposes with the evidence function while the latter replaces the existing tag with its image in the evidence function.
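How projection and injection follow from containment evidence can be sketched in Python, again as an untyped illustration rather than the Agda code. Evidence is modelled as a plain function from indices of the smaller row to indices of the larger row; the proofs that the associated types agree are omitted. The names project and inject, and the example data, are assumptions.

```python
def project(evidence, big_record):
    """Record projection: precompose the larger record with the evidence function."""
    return lambda i: big_record(evidence(i))


def inject(evidence, small_variant):
    """Variant injection: replace the tag with its image under the evidence function."""
    tag, value = small_variant
    return (evidence(tag), value)


big = lambda i: ("a", 1, 2.5)[i]  # a record over a three-entry row
evidence = lambda i: (2, 0)[i]    # small index 0 -> big index 2, small index 1 -> big index 0

small = project(evidence, big)
print(small(0), small(1), inject(evidence, (1, "z")))
```

Neither operation inspects a label or copies more than a tag, which is the sense in which the evidence alone determines the runtime behaviour.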
Similarly, combination maps indices in the resulting row to indices in one of the two starting rows. We pair the intuitive mapping on indices with evidence that types agree. As for containment, the implementation of the branching and concatenation operators in terms of this evidence is immediate. Unfortunately, however, this is not sufficient to implement all of the entailment rules of Rω. Our intuition is not just that this be any map between the indices, but a surjective map: every index in one of the original rows should appear somewhere in the combined row. This intuition justifies the entailment rules that conclude containment from combination. However, this intuition is not captured in our evidence. We have taken a brute force approach to doing so, by storing the evidence for the two containments in the evidence for combination. This definition allows us to realize all of Rω's entailment rules.
We define an intrinsically-kinded representation of predicates, interpreted as evidence, and a corresponding intrinsically well-formed definition of predicate environments and entailment. Finally, we define the meaning of an entailment judgment in terms of the meaning of the predicate it entails. The latter provides a constructive proof of the following.
Theorem. The entailment relation of Rω is sound.
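A Python sketch of concatenation and branching driven by combination evidence is given below. It is illustrative only: the evidence is modelled as a function split from indices of the combined row to a pair of a side ("L" or "R") and an index in that side's row, and the stored containment evidence and type-agreement proofs are left out. The names concat and branch are assumptions.

```python
def concat(split, left, right):
    """Record concatenation: each combined index is routed to one of the two records."""
    def combined(k):
        side, i = split(k)
        return left(i) if side == "L" else right(i)
    return combined


def branch(split, on_left, on_right):
    """Variant branching: the tag decides which handler receives the re-tagged value."""
    def on_combined(variant):
        tag, value = variant
        side, i = split(tag)
        handler = on_left if side == "L" else on_right
        return handler((i, value))
    return on_combined


# Combined row of three entries: positions 0 and 2 come from the left row, 1 from the right.
split = lambda k: (("L", 0), ("R", 0), ("L", 1))[k]
left = lambda i: ("a", "c")[i]
right = lambda i: ("b",)[i]

both = concat(split, left, right)
describe = branch(split, lambda v: ("left", v), lambda v: ("right", v))
print([both(k) for k in range(3)], describe((1, "payload")))
```

The same split function serves both operators, reflecting the duality between concatenating records and branching on variants.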

Label-Generic Operations
The label-generic operators ana, syn, and fold work by invoking a suitably parametric function on entries in their source rows.To implement this, we must be able to work backwards from the index used in a variant or record to the corresponding evidence that its type is in the original row.
We capture this in Agda as follows.
We begin by introducing an abbreviation for indices over a given row.
The pick operator selects from a row the singleton row at a particular index, and we can construct evidence that each singleton row is contained within the original row. Similarly, the delete operator returns the row containing everything but the given index. We can also construct evidence that this row is contained within the original. Finally, for a given index into a row, we can produce the evidence needed to invoke the body of a label-generic operator: that combining the singleton row at that index and the remainder of the row gives the original row.
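The three ingredients described here can be approximated in Python over rows represented as a length paired with an index-to-type function. This is a sketch under assumptions, not the Agda definitions: pick, delete, and recombine are modelled as ordinary functions, entries is a hypothetical helper for inspecting a row, and the combination evidence is the same kind of split function used for concatenation.

```python
def pick(row, i):
    """The singleton row holding only the entry at index i."""
    n, type_at = row
    return (1, lambda _zero: type_at(i))


def delete(row, i):
    """The row holding everything except the entry at index i."""
    n, type_at = row
    return (n - 1, lambda j: type_at(j) if j < i else type_at(j + 1))


def recombine(row, i):
    """Evidence that pick(row, i) combined with delete(row, i) gives back row."""
    def split(k):
        if k == i:
            return ("L", 0)
        return ("R", k if k < i else k - 1)
    return split


def entries(row):
    n, type_at = row
    return [type_at(j) for j in range(n)]


row = (3, lambda j: (int, str, float)[j])
single, rest, split = pick(row, 1), delete(row, 1), recombine(row, 1)
rebuilt = [(single if side == "L" else rest)[1](j) for side, j in map(split, range(3))]
print(entries(single), entries(rest), rebuilt == entries(row))
```

Given an index taken from a variant's tag or a record's position, recombine supplies exactly the decomposition that the body of a label-generic operator expects.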
The implementations of the label-generic operators follow easily.

Terms and Equivalences
Finally, we come to the representations of terms, and of type equivalences. We use intrinsically kinded representations of type and predicate equivalence. (We will omit the level bookkeeping for the remainder of this section, as it is entirely routine.) We have made one important simplification in mechanizing the type equivalence relation. If we restrict type equivalence to kinds ★_i, then we have shown that the interpretation of equivalence derivations is an isomorphism in Agda. That is to say, if we have a derivation that τ1 ≡ τ2, then (for a suitable type environment H) we can show not only functions to : ⟦τ1⟧ H → ⟦τ2⟧ H and from : ⟦τ2⟧ H → ⟦τ1⟧ H, but also that their compositions are the identity function. In particular, we validate the singleton rule, that singleton record and variant types are isomorphic to their underlying field type.
However, this definition of isomorphism is not applicable at higher kinds: type constructors have no elements, so it makes little sense to talk about mappings between them. Moreover, if we remove the singleton rule, we are able to show stronger results, which generalize to all kinds. That is to say, we show that when equivalence is derivable between two predicates or two types (at any kind), their interpretations are propositionally equal in Agda. Given our limitations, these provide constructive proofs of the following claim.
Theorem. The type and predicate equivalence relations of Rω are sound.
To account for the loss of the singleton rule, our term language is extended with terms to construct and deconstruct singleton records and variants. We define intrinsically-typed representations of terms, and their interpretation. The latter provides a constructive proof of our final claim.
Theorem. The type system of Rω is sound.

RELATED WORK
There is a significant and growing literature on row types and their applications, and a larger literature on extensible data types in general. We highlight the work that is most relevant to Rω.
Featherweight Ur. The most immediately relevant languages are Featherweight Ur and its practical realization in Ur/Web [Chlipala 2010, 2015a,b]. As in Rω, Ur supports row and record concatenation with first-class labels, enabled by first-class label inequality proofs. Like Rω, Ur is based on System Fω, and supports mapping type-level operations over rows. Ur has practical evaluation as a framework for database-connected web applications. We view Rω and Ur as complementary explorations of the design space of extensible data types.
There are several differences in focus between R and Ur.Ur does not include extensible variants.Consequently, the duality of records and variants does not appear in Ur, and examples like our reify and reflect functions do not apply to Ur.We view extensible variants as an important application of row typing, useful for examples like the expression problem and encoding extensible effects; however, we do not think there is any fundamental reason that Ur's approach to extensible records could not be equally applicable to extensible variants.Ur also does not attempt to generalize over different row theories, but assumes that row disjointness is sufficient to capture extensibility.
The more significant difference between Rω and Ur is in our approach to generic programming with records. Ur provides a family of folding functions for concrete record types. Instead of our view, in which folds should respect the identities of the underlying row theory, Ur uses the type of its folder to capture the particular order in which the programmer intends to visit fields in the records. We believe that Rω's synthesis operator provides a novel, alternative view of generic programming with records. In particular, we are able to define many of our operations to apply to records regardless of their structure; while we believe that Ur's folder could capture the same operation for any concrete record type, it is less clear that Ur captures them in the general case.
Other row type systems. Row types were originally proposed by Wand [1987] as a mechanism for typing records and variants; he defined rows by extension, one field at a time, and allowed subsequent extensions to overwrite fields already in a record. Rémy [1989] generalized Wand's approach in several significant ways. He restricts row extension to fields not already present in the row, enforced using kinds. His rows record both present and absent fields, with explicit operations to "forget" entries in rows. Finally, he introduces polymorphism over field presence, allowing his calculus to capture patterns like a single operation for both record extension and record update. Rémy's approach has been used as the foundation for numerous other row type systems. Blume et al. [2006] extend Rémy's approach to incorporate first-class blocks over extensible variants. Their implementation relies on the duality between records and variants, translating case blocks into records. However, this duality is not exposed to the programmer; unlike R𝜔, they rely on having a specific type for case blocks distinct from the normal function type. Other applications of Rémy-style row type systems include: Makholm and Wells's 2005 system for mixin modules; Lindley and Cheney's 2012 type system for effect polymorphism; Hillerström and Lindley's 2016 system for extensible effects and handlers; and Lindley and Morris's 2017 account of extensible session types. Gaster and Jones [1996] implement a system with operations similar to Rémy's, but using qualified types instead of kinds to ensure that row extension is well-defined. Lindley et al. [2017] start from Rémy-style rows, but consider several extensions including generic support for renaming entries in rows. Berthomieu and le Moniès de Sagazan [1995] and Leijen [2005] independently proposed scoped rows, in which row extension preserves both the original and new fields. Wand [1991] identified the problems that can arise in typing record concatenation, and proposed an approach based on intersection types. Harper and Pierce [1991] support record concatenation using a new form of quantification, in which quantification is over types disjoint from a given row. Their system cannot express Wand's problem: while it can require that two rows be disjoint, it cannot require that a single field appear in their concatenation without requiring that it appear in a particular input row.
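The duality between case blocks and records mentioned above can be illustrated with a small, untyped sketch. It is written in Python rather than in any of the calculi discussed, and it makes several assumptions for illustration only: variants are encoded as (label, payload) pairs, case blocks as dictionaries of handlers, and the names `case`, `extend`, and `area` are hypothetical. The disjointness that row type systems establish statically is checked here at run time.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed encoding: a variant value is a (label, payload) pair.
Variant = Tuple[str, Any]

# Assumed encoding: a case block is a record (dict) of labelled handlers.
Handlers = Dict[str, Callable[[Any], Any]]


def case(handlers: Handlers, v: Variant) -> Any:
    """Eliminate a variant by selecting the handler stored under its label."""
    label, payload = v
    return handlers[label](payload)


def extend(left: Handlers, right: Handlers) -> Handlers:
    """Compose two case blocks, insisting that their labels are disjoint."""
    overlap = left.keys() & right.keys()
    if overlap:
        raise ValueError(f"overlapping labels: {sorted(overlap)}")
    return {**left, **right}


# Two independently defined branches, combined into one case block.
area_square = {"square": lambda side: side * side}
area_rect = {"rect": lambda wh: wh[0] * wh[1]}
area = extend(area_square, area_rect)
```

With these definitions, `case(area, ("rect", (2, 5)))` selects the `rect` handler, while `extend(area, area_rect)` is rejected because the label `rect` would occur twice.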
There have been numerous encodings of row types in other type system features, most notably Haskell's type classes and type families [Bahr 2014; Kiselyov et al. 2004; Morris 2015; Oliveira et al. 2015; Swierstra 2008]. While impressive, these encodings inevitably rely on encoding rows as particular sequences of types, and so struggle to capture the flexibility that row typing is intended to provide. Extensible data types can also be expressed directly using intersection types and the merge operator [Dunfield 2012; Rioux et al. 2023].
R𝜔 is differentiated from other row type theories by its focus on label-generic operations. It also inherits the expressiveness of R , and its adaptability to multiple different row theories.
Shallow embeddings in Agda. Our approach to mechanizing the metatheory of R𝜔 is unusual; far more typical would have been to define an operational semantics of R𝜔 directly, and then mechanize the expected properties of that operational semantics. We chose to embed the semantics of R𝜔 directly in Agda for two reasons: we wanted an account that clearly did not rely on labels themselves, and we needed to rely on dependent typing to guarantee that record and variant operations were well-typed. This made it natural to embed our semantics in a dependent type theory, and Agda provides flexible dependently-typed programming and a rich standard library.
Embedding simply-typed λ-calculi in rich type theories is well-traveled ground. There is recent work on shallow or mixed deep and shallow embeddings of rich type theories in rich type theories [Kaposi et al. 2019; McBride 2010]. Our embedding is less impressive than theirs: while we demonstrate that our notions of type equality and predicate entailment are sound, we still require explicit equality and entailment proofs in our derivations.

CONCLUSION
We have presented a novel approach to programming with extensible data types, based on label-generic operators for variant destruction and record construction and destruction. We conclude by identifying several directions of future work.
Relating row components. Lindley et al. [2017] propose a renaming operator for rows, as a tool for simulating scoped rows with simple rows. We might hope to capture such an idea in R𝜔; indeed, our kind system even includes rows of labels, which seem like a promising start. However, while we could attempt to describe a function that relabeled the fields of a row or constructors of a variant, we have no way to guarantee that the renamed fields are unique! That is, we have nothing that accepts the row {a ⊲ b, b ⊲ c} while rejecting the row {a ⊲ z, b ⊲ z}. More generally, we have no way to impose conditions on the relationship between an entry in a row and the remainder, other than that provided by the row combination predicate.
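The uniqueness condition at issue can be made concrete with a dynamically checked sketch. The Python function `rename` below is hypothetical and is not part of R𝜔 or of any cited system; it models a renaming as a dictionary from old labels to new labels, and detects at run time the collision that we would like a type system to rule out statically.

```python
from typing import Any, Dict


def rename(renaming: Dict[str, str], record: Dict[str, Any]) -> Dict[str, Any]:
    """Relabel the fields of `record`; labels absent from `renaming` are kept."""
    renamed: Dict[str, Any] = {}
    for label, value in record.items():
        target = renaming.get(label, label)
        if target in renamed:
            # Two source fields were sent to one label: the result is not a row.
            raise ValueError(f"label {target!r} would occur twice")
        renamed[target] = value
    return renamed


point = {"a": 1, "b": 2}
shifted = rename({"a": "b", "b": "c"}, point)  # the renaming {a ⊲ b, b ⊲ c}
```

The renaming {a ⊲ b, b ⊲ c} succeeds, whereas `rename({"a": "z", "b": "z"}, point)` raises an error. The check depends on how each renamed entry relates to the rest of the row, which is the kind of condition that the row combination predicate alone does not express.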
Realizing R𝜔. R𝜔's goal is to demonstrate the expressiveness of its core features. We identify two challenges in making R𝜔 more practical. The first is exposing its features in a programmer-friendly surface language, such as a variant of Haskell. Doing so would allow us to use R𝜔 to capture practical examples from algebraic effects and handlers to extensible compiler passes. While adapting R𝜔 to a type system without type-level functions would certainly make type reconstruction more likely, it may also introduce limitations in R𝜔's expressiveness. The second is an efficient implementation of extensible records and variants, in particular, an account of record construction that does not require copying record values or leave records fragmented.

Fig. 3. Witnessing the duality of records and variants