Recursive Program Synthesis using Paramorphisms

We show that synthesizing recursive functional programs using a class of primitive recursive combinators is simpler than previously proposed approaches and solves more benchmarks from the literature. Our method synthesizes paramorphisms, a class of programs that includes the most common recursive programming patterns on algebraic data types. The crux of our approach is to split the synthesis problem into two parts: a multi-hole template that fixes the recursive structure, and a search for non-recursive program fragments to fill the template holes.


INTRODUCTION
We consider the problem of synthesizing recursive programs from input-output examples. Following previous work, we consider functional programs over algebraic data types such as the natural numbers, lists, and trees [Kneuss et al. 2013; Lubin et al. 2020; Osera and Zdancewic 2015]. For example, consider a program that appends two lists. This program uses general recursion, that is, the function append is explicitly recursively defined, with calls to append within its definition. Depending on what other language features are present, unrestricted general recursion is difficult to reason about; for example, proving termination of general recursive programs is normally non-trivial.
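To make the general-recursive append concrete, here is a minimal Python sketch (the encoding is an assumption for illustration: None stands for Nil and a (head, tail) tuple stands for Cons; the paper's own definition is in its functional notation):

```python
# General-recursive append on cons-lists: append calls itself directly,
# so termination is not evident from the form of the definition alone.
def append(xs, ys):
    if xs is None:            # Nil case: nothing left to copy
        return ys
    head, tail = xs           # Cons case: recurse on the tail
    return (head, append(tail, ys))
```

For example, append((1, (2, None)), (3, None)) evaluates to (1, (2, (3, None))).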
In practice many iterative/recursive programs, including append, can be expressed using more restricted primitive recursive constructs. The essence of primitive recursion is that the number of iterations or recursive invocations is known when the function is first called. For example, the fold combinator captures a typical primitive recursive pattern where the number of recursive calls is the length of the list argument. A standard (general recursive) definition of fold can be given, and fold can then be used to write a well-known alternative definition of append.

Fig. 1. The Para language.

Because we work on algebraic data types that have more structure than collections of machine words, we require a different measure of cost; see Section 3.1.2. We have implemented our synthesis algorithm in a tool Para, which we have applied to 59 benchmarks, including the benchmarks of [Lubin et al. 2020] collected from previous papers on synthesizing recursive programs and a number of new benchmarks we have written. In our experiments Para solves 55/59 benchmarks, while Smyth [Lubin et al. 2020], λ² and Trio [Lee and Cho 2023], three state-of-the-art recursive synthesis systems, solve 33/59, 31/54 and 41/59 benchmarks respectively (five benchmarks require higher-order input, which λ² does not support). All systems run in similar time for the problems they successfully solve, requiring fractions of a second to less than a minute on conventional hardware. See Section 5 for a more in-depth discussion of the experiments. We note that other systems are designed to handle synthesis of general recursive functions. However, none of the benchmark programs that appear in previous papers actually require general recursion: all previously reported results are on programs that are in fact primitive recursive. The fact that Para is able to solve almost all benchmarks even though those benchmarks were not originally intended to be primitive recursive shows that appropriately chosen primitive recursive forms are quite expressive.
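The fold-based append mentioned above can be sketched in Python as follows (same illustrative encoding assumed as before: None for Nil, (head, tail) tuples for Cons):

```python
# fold (a catamorphism) on cons-lists: the number of recursive calls is
# exactly the length of the list, the hallmark of primitive recursion.
def fold(nil_case, cons_case, xs):
    if xs is None:
        return nil_case
    head, tail = xs
    return cons_case(head, fold(nil_case, cons_case, tail))

def append(xs, ys):
    # Rebuild each Cons of xs, replacing its final Nil with ys.
    return fold(ys, lambda h, rec: (h, rec), xs)
```

Unlike the general-recursive version, append here contains no explicit self-call; all recursion is delegated to fold.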
To summarize, we make the following contributions:
• We develop an approach to synthesis of primitive recursive programs over algebraic data types based on paramorphisms.
• We decompose the synthesis problem into a set of multi-hole paramorphic templates and a stochastic search to fill the holes with non-recursive program fragments.
• We show that this simple algorithm, with appropriately chosen templates, is able to solve all of the benchmarks from previous work plus a number of new and more challenging benchmarks.
The rest of the paper is organized as follows. Section 2 introduces the simply-typed functional language we use as the target of synthesis. Section 3 defines templates and our synthesis algorithm. Section 4 describes the implementation of Para, Section 5 presents the experimental results, Section 6 discusses related work, and Section 7 concludes.

PROGRAMMING LANGUAGE
Our target language Para, shown in Figure 1, is a simply-typed functional language with paramorphisms. A Para program consists of a list of type declarations typedecls and a program term. Type declarations in Para are standard, potentially recursive, algebraic data types [Milner et al. 1997]. We will refer to the i-th constructor of a type τ as Ci. Without loss of generality, we require that the declaration of a type τ (the first line of Figure 1) list all non-recursive arguments of type constructors (i.e., constructor arguments of types other than τ) before any recursive arguments (i.e., constructor arguments of type τ). This ordering simplifies notation in our algorithms for generating templates in Section 3.2. For example, the definitions of the natural numbers and the booleans

data Nat = Zero | Succ Nat
data Bool = False | True

have only constructors that take zero or one arguments, so the ordering requirement imposes no constraint. Now consider lists of natural numbers and binary trees that store natural numbers at interior nodes. The standard terms of the language are variables, λ-abstractions, function applications and applications of constructors. para terms support primitive recursive computation via paramorphisms; for a detailed review of paramorphisms from a formal perspective, see [Meertens 1992]. Section 1 gives an overview of paramorphisms and an example list computation. Figure 1 generalizes this example to an arbitrary algebraic datatype τ. The para combinator uses the locally defined functions f1 … fn in a case analysis on the base functor datatype of τ. Each case takes a number of arguments equal to the arity of the corresponding constructor; the first arguments (of types other than τ) are the corresponding non-recursive constructor arguments, and the remaining arguments (corresponding to constructor arguments of type τ) are pairs consisting of the original recursive constructor arguments and the results of recursive calls. Using pairs here allows us to have a single value corresponding to each constructor argument even though paramorphisms treat recursive constructor arguments of type τ itself differently from other types. The monomorphic typing rules for Para are straightforward and given in Figure 2.
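The pairing of original subterms with recursive results can be sketched for lists in Python (the encoding and the suffixes example are illustrative assumptions, not the paper's code):

```python
# A list paramorphism: unlike fold, the Cons case receives the pair
# (original tail, recursive result), mirroring the pairs described above.
# Encoding assumed: None = Nil, (head, tail) = Cons.
def para(nil_case, cons_case, xs):
    if xs is None:
        return nil_case
    head, tail = xs
    return cons_case(head, (tail, para(nil_case, cons_case, tail)))

# Example: computing all suffixes needs the *original* tail, not just the
# recursively computed result, so it is a natural paramorphism.
def suffixes(xs):
    return para([None], lambda h, pair: [(h, pair[0])] + pair[1], xs)
```

For instance, suffixes((1, (2, None))) yields the three suffixes (1, (2, None)), (2, None), and None.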
The evaluation rules for Para are given in Figure 3. The β-reduction of λ-terms is standard. The rule for para terms applies when the first argument to para is a constructed term with constructor Ci. The reduction is carried out by selecting the corresponding i-th function fi supplied to para and substituting each actual constructor argument for the corresponding formal parameter of fi. Furthermore, if a constructor argument is recursive (i.e., of type τ), the result of a recursive call to para on that argument is also substituted, so that the corresponding formal parameter is bound to the pair of the original argument and the recursive result.
We conclude this section with two additional programs written in Para. The first example uses a provided library multiplication function mul of type Nat → Nat → Nat to define factorial. Factorial is an example of a primitive recursive function that uses the expressiveness of paramorphisms: note that in the Succ case both the original constructor argument and the recursively computed value are used to compute the result. Factorial is one of our synthesis benchmarks. In our experiments, however, we synthesize the entire program, including multiplication and its required building block, addition, from scratch; see Section 5.
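The factorial example can be sketched in Python with a paramorphism on Peano naturals (assumptions: Nat is modeled by a non-negative int, and Python's built-in arithmetic stands in for the library functions mul and addition):

```python
# Paramorphism on naturals: the Succ case sees both the predecessor and
# the recursive result, as in the paper's factorial example.
def para_nat(zero_case, succ_case, n):
    if n == 0:
        return zero_case
    return succ_case(n - 1, para_nat(zero_case, succ_case, n - 1))

def factorial(n):
    # fact(Succ m) = (m + 1) * fact(m): uses the original argument *and*
    # the recursively computed value, which plain fold cannot access.
    return para_nat(1, lambda pred, rec: (pred + 1) * rec, n)
```

Here the number of recursive calls is exactly n, so the definition is manifestly primitive recursive.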
The second example maps a function over a tree of natural numbers. Note that this program does not use the full power of paramorphisms; the arguments in the Node case are not needed to rebuild the tree.

We stratify Para into a non-recursive fragment and a template fragment. Non-recursive terms are a syntactic subset of Para terms, and templates are multi-hole contexts that yield full Para terms when all holes are substituted with non-recursive terms. We denote the result of substituting a sequence of non-recursive terms t1 … tn into the corresponding holes of a template T by T[t1 … tn].
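The tree-map example above can be sketched in Python (the encoding is an illustrative assumption: None stands for Leaf and a (value, left, right) tuple for Node; each recursive position receives an (original subtree, recursive result) pair):

```python
# Paramorphism on binary trees with values at interior nodes.
def para_tree(leaf_case, node_case, t):
    if t is None:
        return leaf_case
    v, l, r = t
    return node_case(v,
                     (l, para_tree(leaf_case, node_case, l)),
                     (r, para_tree(leaf_case, node_case, r)))

def tree_map(f, t):
    # Only the recursive results (the second pair components) are needed
    # to rebuild the tree; the original subtrees are ignored, so this
    # program does not use the full power of paramorphisms.
    return para_tree(None, lambda v, lp, rp: (f(v), lp[1], rp[1]), t)
```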
The stratified grammar only generates a subset of the full Para grammar, as it disallows occurrences of para terms inside constructor terms. Nevertheless, it still preserves the full expressiveness of Para. Note that a Para program that has para terms inside constructor terms can always be rewritten into one without such occurrences by hoisting the para terms out of the constructor term and introducing a variable.
By stratifying Para's grammar, we can decompose synthesis problems into two subproblems:
(1) Find a template such that some sequence of non-recursive terms solves the synthesis problem when substituted into the template;
(2) Given a template, find a sequence of non-recursive terms that solves the synthesis problem when substituted into the template, if possible.
The intuition behind the stratification of the grammar is that relatively few recursion patterns account for most of what is written in practical programming, and thus most programs of interest should be expressible by a relatively small set of templates. For algebraic data types in particular, paramorphisms capture common patterns of recursion and allow us to focus the synthesis procedure on the easier problem of finding non-recursive terms to fill the holes.
Our synthesis algorithm combines an outer loop that enumerates an expressive but relatively small set of templates and an inner loop that solves for non-recursive terms given a fixed template. The inner loop problem is simpler than searching for a whole program directly because it need not invent recursion patterns. The synthesis algorithm does not terminate until a solution is found or the total time budget is exceeded. As mentioned in Section 1, we address the problem of finding the non-recursive terms by leveraging stochastic program synthesis techniques [Schkufza et al. 2013]. We outline the synthesis algorithm in Figure 5. We write normalize(t) for the normalization of a term t under the reflexive-transitive closure →* of the reduction relation. The functions stochastic-synthesis and template-set are discussed in Sections 3.1 and 3.2, respectively.
Each hole of type τ is initialized with a default candidate δ(τ) = Cmin δ(τ1) … δ(τk), where Cmin is a non-recursive constructor of τ with the smallest arity and τ1 … τk are its argument types. When multiple such constructors exist, we choose one arbitrarily.
Our synthesizer searches for non-recursive terms by beginning with a default candidate and randomly applying a set of rewrite rules ⇝Γ parameterized by the typing context Γ of the hole. The Construction rule creates a new constructor application (by placing the term in a fresh constructor of the same type, where all other positions of the constructor are filled with default terms), the Projection rule replaces a constructor application with one of its (same-typed) arguments, and Variable replaces a term with a variable of the same type. We omit the subscript Γ when it is clear from context. We can reach any well-typed non-recursive term starting from an arbitrary well-typed non-recursive term by rewriting under ⇝Γ:

Theorem 3.2. If Γ ⊢ t1, t2 : τ, then there exists a rewrite sequence s ∈ ⇝Γ* such that s(t1) = t2.
Proof. First, observe that one step of Construction rewrites any well-typed term of type τ to δ(τ) by substituting with Cmin (this is a degenerate application of Construction in which the original term does not appear on the right-hand side, because Cmin must be non-recursive). This gives a rewrite sequence s1 such that s1(t1) = δ(τ). We then define a rewrite sequence s2 such that s2(δ(τ)) = t2 by induction on the typing derivation of t2. 1) If t2 is a variable, then one step of Variable rewrites any term to t2. 2) If t2 is a constructor application C t′1 … t′k, then for each argument t′j we define, by induction, a rewrite sequence s′j such that s′j(δ(τj)) = t′j. One step of Construction rewrites δ(τ) to C δ(τ1) … δ(τk), and we then apply each rewrite sequence s′j to the corresponding argument subterm, which rewrites the whole term to t2. The concatenation of s1 and s2 is the desired sequence s. □

Thus, if some sequence of non-recursive terms t1 … tn solves a synthesis problem when substituted into the holes of a template T, then it is always possible to obtain that solution by applying some ⇝ rewrite sequence to the default candidates δ(τ1) … δ(τn).
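The three rewrite rules can be modeled concretely; the following Python sketch is illustrative only (the term encoding and the tables SIG, TYPE_OF_CON, CONS_OF_TYPE, DEFAULT are assumptions, not the paper's notation):

```python
import random

# Terms are ('var', name) or ('con', cname, [args]).
SIG = {'Nil': [], 'Cons': ['Nat', 'ListNat'], 'Zero': [], 'Succ': ['Nat']}
TYPE_OF_CON = {'Nil': 'ListNat', 'Cons': 'ListNat', 'Zero': 'Nat', 'Succ': 'Nat'}
CONS_OF_TYPE = {'ListNat': ['Nil', 'Cons'], 'Nat': ['Zero', 'Succ']}
DEFAULT = {'ListNat': ('con', 'Nil', []), 'Nat': ('con', 'Zero', [])}

def construction(term, ty, rng):
    # Place `term` inside a fresh constructor of its own type; all other
    # argument positions are filled with default terms. If the chosen
    # constructor has no slot of type `ty`, `term` is dropped -- the
    # degenerate case used in the completeness argument above.
    c = rng.choice(CONS_OF_TYPE[ty])
    args = [DEFAULT[a] for a in SIG[c]]
    slots = [i for i, a in enumerate(SIG[c]) if a == ty]
    if slots:
        args[rng.choice(slots)] = term
    return ('con', c, args)

def projection(term, ty):
    # Replace a constructor application with its first same-typed argument.
    if term[0] == 'con':
        for arg, arg_ty in zip(term[2], SIG[term[1]]):
            if arg_ty == ty:
                return arg
    return term

def variable(var_name):
    # Replace any term with a same-typed variable from the context.
    return ('var', var_name)
```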
3.1.2 Cost Function. Naive breadth-first search over rewrite sequences gives us a semi-decision procedure for deciding whether there exists a sequence of non-recursive terms that solves a synthesis problem. However, as we show in our experiments, searching for a solution by enumerating all such possible rewrites is very expensive. Instead, we use a stochastic search procedure where the synthesis problem is recast as a cost minimization problem, and we apply ⇝ rewrites stochastically, guided by the cost function.
Our cost function consists of two terms: size penalizes program size, and error penalizes incorrect outputs. A constant c adjusts the relative weight of the two terms:

cost(t) = size(t) + c · error(t, o)

The cost minimization procedure minimizes cost and reports any candidate sequence of non-recursive terms encountered during the search such that error(t, o) = 0. Given a large enough c, solutions to Problem 3.1, if any exist within the subset of programs defined by the template, are global minima of cost. In practice, setting c = 1 usually allows us to visit good candidates with correct behavior and small sizes when searching near minima of cost. Both size(·) and error(·, ·) are defined as the minimal length of rewrite sequences satisfying some property. For size(t), the rewrite sequence must rewrite the default candidate term δ(τ) into the current candidate term t. For error(t, o), the rewrite sequence must rewrite the program output o′ = normalize(T[t]) to the desired output o. For our choice of δ(τ), the exact value of size(·) can be computed by a simple recursive procedure. The exact value of error(·, ·) is difficult to compute efficiently, so we approximate it using a tree edit-distance algorithm in our implementation.
To minimize cost, we employ a stochastic Monte Carlo search procedure inspired by Metropolis-Hastings sampling [Hastings 1970]. Our sampling procedure works as follows: at each step we maintain a current candidate t and its cost cost(t), and create a modified candidate t′, called the proposal, by uniformly sampling and applying one ⇝ rewrite among all possible ⇝ rewrites to any non-recursive term in t. We then compute cost(t′). The proposal is accepted, meaning it becomes the new current candidate, with probability

min(1, exp(−(cost(t′) − cost(t)) / β))

where β is a constant; otherwise the current candidate is retained for the next step. The proposal is always accepted if its cost is lower than the current candidate's; otherwise the proposal is accepted with a probability that depends on the difference in cost between the proposal and the current candidate, with more expensive proposals having lower probability of acceptance. We repeat this procedure, proposing and possibly accepting one new candidate per step, until a solution is found or the computation budget is exhausted.
When the sampling procedure converges, more samples will be taken for which the cost is small, giving us high probability of finding the global minima or sufficiently good candidates. Even before the sampling converges, this approach effectively hill climbs to nearby locally good solutions, but always has some probability of jumping to other parts of the search space.
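The acceptance step above can be sketched generically in Python (a minimal model, not the paper's implementation; the names mh_search, propose, and the toy cost in the usage example are assumptions):

```python
import math
import random

def mh_search(initial, cost, propose, beta=5.0, steps=5000, seed=0):
    # Metropolis-Hastings-style minimization: always accept improvements,
    # accept regressions with probability exp(-(cost increase) / beta).
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for _ in range(steps):
        proposal = propose(current, rng)
        proposal_cost = cost(proposal)
        delta = proposal_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / beta):
            current, current_cost = proposal, proposal_cost
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost
```

For example, minimizing the toy cost (x − 7)² with proposals x ± 1 quickly reaches the global minimum at x = 7 while occasionally accepting uphill moves.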

Example.
To illustrate how stochastic rewrites guided by the cost function synthesize non-recursive terms, we show several steps of an execution trace of the Para synthesizer synthesizing an append function for ListNat.
At the beginning of the search, each hole is initialized with its default candidate. In this example, every hole has type ListNat and is initialized to Nil. This initial function always returns Nil, regardless of the input. Figure 6 plots the cost of a search beginning with this program, labelled A in the figure.
The first significant decrease in the cost function occurs after fewer than ten accepted proposals, when the term in the second hole is rewritten from Nil to the second list argument l2, which can be done by one step of Variable. We now have a function that always returns its second list argument. This function has lower cost because the second list is more similar to the desired output (the concatenation of the two lists). The search discovers this function at point B in Figure 6.
The next significantly better function is not found until point C, after the 50th accepted proposal and close to the end of the search. This program iterates over l1 and prepends the constant Zero to the result list on every iteration. The function has lower cost because its output is even closer to the desired output, with the correct length but potentially incorrect elements in the final list. The term in the first hole is rewritten from Nil to l1, which is achievable via one step of Variable. The term in the second hole has not changed. The term in the third hole is rewritten from Nil to Cons Zero r, where r is the recursive result; this can be achieved by first rewriting Nil to r via one step of Variable and then to Cons Zero r via one step of Construction.
Finally, one step of Variable rewrites the subterm Zero inside Cons Zero r to the head variable h, which results in a correct append function, discovered at point D in Figure 6. We note that while only the Variable and Construction rewrites are strictly necessary for this example, the Projection rule is important in practice because it allows the search to undo decisions that end up leading nowhere. While there is a very short path from the initial program to a correct solution, the actual search trajectory explores many more programs, which is typical behavior for stochastic search. The long plateau in Figure 6 occurs because multiple changes must be present simultaneously for the cost to improve significantly from point B to point C, and even with a good cost function many trials are needed to discover the right combination of modifications.

Proper Recursion Nests.
In this paper we focus on templates representing a single proper recursion nest of one or more calls to para, which we will show is expressive enough to solve our benchmarks. It is possible to extend the set of templates to include several sequentially chained proper recursion nests, but we do not consider this possibility in this paper.
We define a template generator proper-nest in Figure 7. Because proper-nest is a code generator, its function body is a mix of code to be executed as part of proper-nest itself and code that is the output of proper-nest. We color-code what is part of the output of proper-nest in blue; code in black is part of proper-nest's own logic.
All variables in the generated code are implicitly renamed to distinct fresh variables unless explicitly labeled by subscripts. All holes are implicitly given distinct labels. The function proper-nest generates a template term given a state type and a recursion type list. The state type is the return type of all para terms in the template. The recursion type list specifies the maximal depth of the recursion nest and the types on which para terms recurse: a para term at recursion nesting depth i consumes a structure with the type specified by the i-th element of the recursion type list. We use Haskell syntax (h :: t) and [] to pattern match on the meta-level recursion type list, to distinguish it from object-level lists denoted by constructor syntax Cons h t and Nil. There are two cases for proper-nest, depending on the state type argument. The case where the state type is an algebraic data type is defined in Figure 7a. For para terms consuming a recursive type, proper-nest only generates para subterms for cases where the constructor has some recursive arguments. For para terms that consume non-recursive types (which degenerate into pattern matches), we generate para subterms for all cases.
The template for append given in Section 1 is generated by the call proper-nest ListNat [ListNat]. As a more involved example, proper-nest ListNat [Boolean, TreeNat] generates a template that first matches on a boolean and then recurses over a tree. It turns out that it is useful to allow state types to be not just algebraic data types but also functions. Function types can encode top-down recursion via continuation-passing style in addition to the bottom-up recursion expressed "natively" by paramorphisms. For example, one way to reverse a list is to traverse the list from head to tail and add the visited element to the head of the result list during traversal. This algorithm can be expressed with para by using ListNat → ListNat as the state type, even though para traverses the list from tail to head. Figure 7b defines how proper-nest handles function types as state types using an auxiliary function proper-nest-func. The definition of proper-nest-func is similar to the algebraic data type case of proper-nest. The base case is adjusted to keep the generated term syntactically well-formed. We wrap a pair of a λ-abstraction and an application around the term generated by proper-nest-func so that the input to the state (which is a function) can be varied.
3.2.2 Library Functions and Tuples. It can be useful to supply a set of library functions to be used as additional primitives in the synthesis of non-recursive terms. We support first-order library functions by adding production rules to the grammar of non-recursive terms and adding the corresponding typing rules and rewrite rules, similar to the approach described in [Alur et al. 2013]. The terms representing library function applications can then be used in any holes in the template, including the data to be pattern-matched by the para combinators. For example, to include the library function + : Nat × Nat → Nat, we add a new production and the corresponding typing rule, together with rewrite rules so that the search procedure can construct and remove non-recursive terms representing addition. As another example, for the library function = : Nat × Nat → Bool, a rewrite rule is added that constructs equality terms, where the default candidate δ(Nat) = Zero. Note that because the return type Bool does not occur in the argument types of the function =, there is no dedicated rewrite rule to remove equality terms while retaining some subterm(s). Instead, the search procedure can remove such terms by invoking the Construction rule, which rewrites any of them to the default candidate δ(Bool) = False.

We also support using tuple types as state types. Tuple types can encode recursion over multiple state components by packaging them as a tuple. For a sequence of types τ1 … τn, we generate the type declaration of their tuple type, if used, denoted by τ1 × … × τn. We then introduce the projections π1 … πn for this tuple type as library functions.

3.2.3 Template Set. We generate a set of templates by applying the generator proper-nest to different plausible combinations of state type and recursion type list. Currently, our strategy is parameterized by d, the recursion depth of the templates, and by the set of algebraic data types appearing in the type signatures of the target program and library functions.

Normalization
In many cases our synthesizer finds a solution using a template larger than what is strictly necessary. Such solutions typically contain trivial calls to para that do not perform any substantial recursive computation. To eliminate those trivial terms and improve the readability of the programs, we normalize the output of the synthesizer according to the rewriting semantics presented in Figure 3.

IMPLEMENTATION
We have implemented our synthesis algorithm in a tool also called Para in approximately 700 lines of Common Lisp, excluding the experimental setup. Our prototype supports a predefined set of polymorphic algebraic data types, including List, Tree and Tuple as well as Boolean and Nat. As mentioned previously, before performing synthesis all polymorphic types are resolved to concrete types; synthesis is done on terms with monomorphic types.
Compiling Templates. Because each template will likely be executed with many instantiations of non-recursive terms during the synthesis process, it is beneficial to compile a specialized interpreter for each template that executes its candidate non-recursive term sequences. We generate type-annotated Lisp source of a specialized interpreter given a template term and compile it to native code using the SBCL compiler [Rhodes 2008] with high optimization settings (optimize (speed 3)). Compilation results are cached and reused if the same template term is used in multiple problems. We run compilation of different templates in parallel when preparing the template set for a problem.
Restart Strategy. We have observed that the distribution of search times for a synthesis problem can be extremely wide. Since there are generally short search paths to a solution, a couple of good guesses early in a search can lead to rapid convergence to a solution, while a couple of bad initial guesses can result in searches taking orders of magnitude longer. As suggested in [Koenig et al. 2021], we periodically restart the search to improve the chances of a short successful search. We use a simple strategy that restarts the search when either of the following conditions is met:
(1) After N_restart states are explored, no solution is found.
(2) The wall-clock time for the current search exceeds the average time of all previous searches by a factor of k_restart. The first search is initialized with a timeout of t0.
N_restart, k_restart and t0 are fixed parameters. The second condition also guards against search paths that contain programs of very high time complexity, which can consume significant computation budget while making little progress.
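The restart policy can be sketched as follows (an illustrative Python model, not the Lisp implementation; the names n_restart, k_restart, t0, and the run_one_search interface are assumptions mirroring the parameters above, with condition (2) modeled as a per-attempt time limit):

```python
import time

def search_with_restarts(run_one_search, n_restart=50000, k_restart=1.5,
                         t0=1.0, budget=60.0):
    times = []
    deadline = time.monotonic() + budget
    while time.monotonic() < deadline:
        # Condition (2): cap this attempt at k_restart times the mean of
        # previous attempts; the first attempt gets the fixed timeout t0.
        limit = t0 if not times else k_restart * (sum(times) / len(times))
        start = time.monotonic()
        # run_one_search explores up to n_restart states (condition (1))
        # or until `limit` seconds elapse, then returns a solution or None.
        result = run_one_search(max_states=n_restart, timeout=limit)
        times.append(time.monotonic() - start)
        if result is not None:
            return result
    return None
```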
Parallel Search. Our synthesis technique is embarrassingly parallel. The search processes for different templates can be run simultaneously, and any single template can be accelerated by running multiple stochastic searches in parallel with independent random number generators. Currently, our prototype supports single-node shared-memory parallelism via multi-threading.

EVALUATION
This section presents experiments designed to answer the following questions:
(1) Is Para able to solve difficult synthesis problems with complex recursive patterns?
(2) How does Para's ability to solve synthesis problems compare with state-of-the-art Program-by-Example systems?
(3) How does Para's stochastic search procedure compare with enumeration-based search?
(4) How do state-of-the-art Program-by-Example systems compare with Para when given primitive recursive combinators and automatically-generated templates?
Experiments used a 2017 iMac Pro with a 3.2 GHz 8-Core Intel Xeon W CPU and 32 GB of RAM.

Benchmarks and Setup
We have assembled a suite of 59 synthesis benchmarks for functional programs from previous Smyth benchmarks [Lubin et al. 2020], the Haskell Prelude library and the Haskell Data.List library [Marlow et al. 2010]. Each benchmark problem is defined by the expected monomorphic type of the top-level program, library functions which the synthesizer may use, and 10 sets of input-output examples, each of which has 16 input-output pairs. The input-output example sets are automatically generated by calling a reference implementation with randomly-sampled inputs. The inputs are generated by reinterpreting algebraic data type definitions as probabilistic context-free grammars [Booth and Thompson 1973], with equal probabilities assigned to each constructor case. We discard any examples that exceed a maximum size (we use 50 nodes as the limit in our experiments). For higher-order problems, inputs of function type are instead uniformly sampled from a set of handwritten functions collected from the Smyth benchmarks. All four systems (Para, Smyth, λ² and Trio) support supplying library functions, and some of the benchmarks from the literature that we incorporate require library functions to be provided. For these benchmarks, we also include variants where no library functions are provided, which we expect to be a harder problem to solve. We show the list of provided library functions (if any) in parentheses after the name of a benchmark problem.
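The input-generation scheme above can be sketched in Python (illustrative assumptions: a small ListNat/Nat signature, the same uniform-constructor choice, and the 50-node cap from the text):

```python
import random

# Terms are ('con', cname, [args]); the ADT signature doubles as a PCFG.
SIG = {'Nil': [], 'Cons': ['Nat', 'ListNat'], 'Zero': [], 'Succ': ['Nat']}
CONS_OF_TYPE = {'ListNat': ['Nil', 'Cons'], 'Nat': ['Zero', 'Succ']}

def sample(ty, rng):
    # Uniform choice among the type's constructors, then recurse on the
    # constructor's argument types -- the grammar's production step.
    c = rng.choice(CONS_OF_TYPE[ty])
    return ('con', c, [sample(a, rng) for a in SIG[c]])

def node_count(t):
    return 1 + sum(node_count(a) for a in t[2])

def sample_example_inputs(ty, n, max_nodes=50, seed=0):
    rng, out = random.Random(seed), []
    while len(out) < n:
        t = sample(ty, rng)
        if node_count(t) <= max_nodes:   # discard oversized examples
            out.append(t)
    return out
```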
To answer research questions 1 and 2, we ran Para, Smyth and Trio on each input-output example set for all 59 benchmarks. We ran λ² on a subset of 54 of our benchmarks that excludes benchmarks with higher-order inputs, because λ² does not support them. By default Para uses all 8 cores of the experiment machine and is given a 1-minute time limit for each synthesis task. Smyth and λ² do not support parallelism and are given an 8-minute computation budget on a single core for each synthesis task. To control for any benefits of the parallel search and enable a more direct comparison to the sequential systems, we also run Para on a single core with the same settings and an 8-minute computation budget, with results labelled Para (Serial). λ² requires a component library to be provided and we use its standard component library for all runs of λ². This standard library already contains the library functions required by some of our benchmark problems; for these problems, the variants with and without library functions become the same for λ², therefore we run λ² on such benchmarks only once and duplicate the resulting entry.
Synthesized programs are validated against an independently generated set of 20 random input-output test cases and then manually inspected for correctness. We report the number of correct synthesis results among 10 runs and the average time of successful runs in seconds. If no synthesis runs are successful, we report the most common reason for failure among the 10 runs: "time" for timeout, "mem" for out of memory, "wrong" for a result with different semantics from the reference implementation, and "fail" for other conditions.
We set parameters β = 5.0, N_restart = 50000, k_restart = 1.5 and t0 = 1 s for Para. We have verified that Para's performance is insensitive to small changes of these parameters. We varied the value of the recursion depth parameter d = 1, 2, 3, and report the depth that yields the highest number of successful runs, with the smaller depth reported in the case of a tie. Because searches with different recursion depths can be run in parallel, performance in practice is determined by the searches that are most likely to find a solution; moreover, the recursion depth required to effectively solve a synthesis problem serves as a quantitative measure of the difficulty of that problem.
An alternative to stochastic search is deterministic enumeration-based search, which can either be brute force (literally enumerate all programs, as in [Bansal and Aiken 2006; Massalin 1987]) or guided by a cost function, where a current frontier of candidates is ranked and only the most promising are considered for further exploration (λ² incorporates this strategy). To evaluate the effectiveness of our stochastic search algorithm described in Section 3.1 compared to enumeration-based search, we implemented a complete best-first-search bottom-up synthesis procedure for Para guided by our cost function. We ran Para under the same settings using this deterministic enumeration procedure instead of the stochastic one on all of our benchmarks, with results labelled Para (BFS).
To address question 4, we translate the templates Para used in successful solutions to the input format of Smyth, and ask Smyth to solve the same problem using that translated template. As λ² supports higher-order library procedures, we supply a library of paramorphic combinators to λ² with example-passing axioms and run λ² on the 54 benchmarks with (1) this paramorphism library, and (2) the standard component library together with the paramorphism library. Trio's interface does not provide a way to perform this experiment.

Results
Detailed experimental results are in Tables 1 and 2, and cactus plots summarizing the running times of all problems solved by each system are in Figure 8. We normalize the CPU time in the cactus plots by multiplying wall-clock time by the number of cores used by each system. Para solved all but 4 of the 59 benchmarks reliably, significantly outperforming Smyth and Trio, which did not solve 26 and 18 benchmarks, respectively. Among the 54 benchmarks that we ported to λ², λ² solves 31 while Para solves 50. When running Para on a single core, the system solves all but 1 of the benchmarks that it solves with 8 cores, typically taking an order of magnitude longer. This shows that our embarrassingly parallel algorithm can lead to super-linear speedup on multi-core systems. Among the more difficult problems for which we provided two variants, with and without library functions, Para is able to synthesize complex recursion patterns without the aid of any library functions, while Smyth, λ² and Trio are only able to solve some of the problems when provided with library functions. For benchmarks that are solved by all systems, Smyth generally takes less time to find a solution, while Trio solves more problems than any system other than Para.
When using deterministic best-first-search enumeration, Para solves 38 of 59 benchmarks. While a more sophisticated deterministic search algorithm could potentially do better, our enumeration-based search already solves more problems than λ², which is also enumeration-based; thus it is likely not easy to find an enumeration-based search comparable to stochastic search. We also note that the stochastic search has the added benefit of requiring only O(1) space for bookkeeping. Figure 9 gives a partial explanation for the strong performance of stochastic search: the number of changes to the template needed to find a solution is almost always less than 15. Previous work on synthesizing straight-line assembly programs has been successful up to 10-15 instructions [Massalin 1987; Schkufza et al. 2013], which requires roughly 50 changes to opcodes, registers, and constants starting from an empty program. We likely need harder benchmarks to observe the limits of stochastic search in Para. Figure 10 gives a whisker plot showing the minimum, maximum, and distribution by quartile of the runtimes of Para for the problems it solves. The wide distribution and small minimum search times motivate the use of parallelism and restarts to bias Para towards finding short successful runs. Smyth did not solve any of the benchmarks using the automatically generated templates Para used in successful solutions. We note that Smyth implements the program sketching [Solar-Lezama 2013] synthesis methodology, which is a promising method to express high-level insights from the programmer and utilize them in the synthesis process. However, it is not well-studied whether systematically generated, rather than programmer-provided, sets of sketches enhance the problem-solving ability of such systems, and our experiments show no such evidence.
λ², when provided with the paramorphism library, solves 30 out of the 54 benchmarks. When given both its default library functions and the paramorphism combinators, λ² does not solve any benchmarks beyond the ones it already solves using only the default library functions. We observed a significant negative correlation between the recursion depth of the templates (see Section 3.2.3) that Para requires to effectively solve a synthesis problem using stochastic search and the ability of other systems to solve such problems. To quantify this finding, we categorize the problems that Para solves in Table 1 by the required recursion depth for Para and plot the percentage of problems solved by each system in each category. The results are shown in Figure 11. For Para (BFS), Para (Serial), Smyth, λ² (paramorphism), and Trio, the percentage of problems solved decreases monotonically as the required recursion depth for Para increases (Trio very slightly improves at depth 3 over depth 2, but is essentially tied). The negative correlation is less pronounced for λ² (stdlib), which can be attributed to the fact that the λ² standard library contains many recursive library functions, so the actual recursion depth λ² needs to successfully synthesize a solution is shallower. In fact, we observed that λ² did not synthesize any nested recursions when used with just its own standard library. We postulate that the following factors contribute to this observed negative correlation. First, the required recursion depth is a measure of a problem's difficulty, so we expect any system to perform better on easy problems (low depth) than on hard problems (high depth). Second, during synthesis of a program with higher required recursion depth, more holes are present, which limits the effectiveness of static analysis to propagate information.
λ², for example, uses a deductive procedure to discover constraints on the solution, but too many holes can render the analysis ineffective. Third, Smyth synthesizes top-level general recursion, which in principle expresses a larger set of programs than the properly-nested paramorphisms synthesized by Para. However, this extra generality incurs a larger search space and potentially less efficient search algorithms, while Para's smaller and more structured search space contains solutions to all benchmarks.
As a final experiment, we compare how example-efficient Para is compared to Smyth: how few examples are needed to synthesize a solution to a problem? Figure 10, column 1 of [Lubin et al. 2020] shows the minimal number of examples that Smyth needed to synthesize 34 problems. Using the same input examples, Para successfully synthesizes 24 of these problems. In fact, Para found solutions for 32 problems that satisfied the input-output examples within the time limit, but 8 were not the intended program. With no bias towards anything other than short solutions, and guided purely by observed input/output behavior, it is not surprising that Para requires more examples than other systems to fully constrain the result. We have also shown an upper bound of 16 examples for all the problems in our benchmark suite that Para successfully solves.

list-processing [Feng et al. 2018]. Those approaches synthesize non-recursive programs that use a predefined set of recursive primitives and are fundamentally more domain-specific than our work.
PBE for Recursive Functional Programs. Two early systems, Escher [Albarghouthi et al. 2013] and Myth [Osera and Zdancewic 2015], synthesize recursive functional programs from input-output examples. Escher [Albarghouthi et al. 2013] is a bottom-up synthesis procedure for an untyped first-order language with general recursion. Available data types are restricted to a predefined set of base types. Myth [Osera and Zdancewic 2015] is a top-down deductive synthesis procedure for a typed higher-order language with user-defined algebraic data types. It performs proof search to produce terms that satisfy a given set of input-output examples. Both Escher and Myth require input-output examples to be trace-complete, i.e., the input-output examples must include the input/output pairs of all recursive calls. Writing trace-complete examples is cumbersome and requires some intuition for the internal computations of the program to be synthesized.
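To make the trace-completeness requirement concrete, consider synthesizing a list `length` function. The following Python sketch (the function and names are ours, not from Escher or Myth) computes the example set that trace-completeness demands: an input/output pair for every recursive call reached from a top-level example, i.e., every suffix of the input list.

```python
def length(xs):
    # Reference semantics of the function to be synthesized.
    return 0 if xs == () else 1 + length(xs[1:])

def trace_complete_examples(xs):
    # A trace-complete example set for `length` on input `xs` must contain
    # an input/output pair for every suffix the recursion visits.
    examples = {}
    while True:
        examples[xs] = length(xs)
        if xs == ():
            return examples
        xs = xs[1:]

print(trace_complete_examples((1, 2, 3)))
# {(1, 2, 3): 3, (2, 3): 2, (3,): 1, (): 0}
```

A user who wants only the top-level fact `length [1,2,3] = 3` must nonetheless supply all four pairs, which is exactly the burden the text describes.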
Burst [Miltner et al. 2022] uses bottom-up synthesis that handles general recursion without requiring trace-completeness by using angelic execution: unknown recursive calls are executed under angelic semantics during synthesis, and the specification is refined if the synthesized program is incorrect under normal semantics. The procedure repeats until a correct program is found.
All of the above-mentioned systems synthesize a single recursive function with recursive calls to the top-level function itself. In practice, however, solutions to many problems are more naturally expressed with mutually recursive functions. Even though a single general recursion at top level is in theory capable of encoding arbitrary mutually recursive functions, such complex general recursive functions are not practically synthesized by these systems today, which poses difficulties in applying these approaches to more challenging problems.
PBE for data structure transformations. λ² [Feser et al. 2015] synthesizes functional programs that transform data structures using higher-order combinators such as fold, map, and filter. Similar to the templates used by Para, λ² generates multi-hole hypotheses using these combinators. It then infers the input-output examples for the holes via deductive reasoning. λ² is notable for its ability to synthesize nested recursive transformations. In practice, we find that λ²'s approach is limited to deducing examples one hole at a time. When there are multiple holes in a hypothesis but no example adequately constrains any single hole, λ² falls back to exhaustive enumeration. There are also specific requirements on the provided input-output examples for the deduction procedure to be effective. For example, while λ² is generally able to propagate input-output examples of a map hypothesis into its functional argument, the same is possible for a fold hypothesis only if there are pairs of examples that differ by only appending or prepending one element to the input sequence. In our experiments using randomly-generated examples, we find that λ² often propagates only the base-case example for fold-like combinators, including fold and our paramorphism combinators. Para, on the other hand, can take advantage of simultaneous constraints on multiple holes by probabilistically making multiple changes, even if each of the changes individually does not decrease the cost. Para also uses fold-like combinators effectively with fewer restrictions on the input-output examples.
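The fold restriction above can be sketched as follows (an illustrative Python fragment in the spirit of λ²'s deduction; the names are our own). Given the defining equation fold f b (x:xs) = f x (fold f b xs), two examples whose inputs differ by one prepended element yield exactly one deduced input/output pair for the hole f:

```python
def foldr(f, b, xs):
    # fold f b (x:xs) = f x (fold f b xs);  fold f b [] = b
    return b if not xs else f(xs[0], foldr(f, b, xs[1:]))

# Two spec examples that differ by one prepended element:
#   sum []  = 0    and    sum [5] = 5
examples = {(): 0, (5,): 5}

# By the fold equation, f applied to (5, examples[()]) must equal
# examples[(5,)] -- a deduced input/output pair for the hole f.
x, rest = 5, ()
deduced_for_f = ((x, examples[rest]), examples[(x,) + rest])
print(deduced_for_f)  # ((5, 0), 5)
```

If no such pair of adjacent examples exists in the randomly generated example set, this deduction produces nothing beyond the base case, which matches the behavior we observed.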
Program Sketching. Sketching [Solar-Lezama 2013] is a synthesis methodology that synthesizes concrete implementations by completing program sketches, i.e., programmer-provided partial programs with holes. Sketching is a promising method to express high-level insights from the programmer and utilize them in the synthesis process. We note that it is possible to provide such systems with a systematically generated set of sketches that describes different recursion shapes, in the hope of improving their ability to solve more difficult problems. Synapse has applied this approach to a number of non-recursive synthesis problems [Bornholt et al. 2016]. However, to our knowledge such an approach has not been well-studied for recursive programs, which are generally more difficult to synthesize.
Sketch [Solar-Lezama 2013] pioneered the sketching approach, synthesizing programs in an imperative, C-like language by translating input-output examples of the synthesis problem into a logical formula that constrains admissible hole values and relying on an external SAT or SMT solver to find hole value assignments. Rosette develops a similar approach for the untyped functional language Racket [Torlak and Bodik 2013]. This approach inherently limits the types of holes to those that external solvers handle well; in practice, only booleans and integers are supported. Syntrec [Inala et al. 2017] suggests that it is possible to encode user-defined algebraic data types in such a framework. However, such an encoding is limited to a subset of all possible well-typed terms for any given hole, for example by bounding the number of constructors by a pre-determined limit.
Smyth [Lubin et al. 2020] is a descendant of Myth with support for program sketching, allowing users to provide a custom template with multiple holes. Smyth also removes the trace-completeness requirement of Myth via live bidirectional evaluation, which propagates examples backward through partially evaluated incomplete programs. A key difference between Smyth and Para is that Smyth is designed to support general recursion while Para specifically targets primitive recursion. As discussed in Section 5, the extra generality of Smyth is not exploited by any of the benchmarks in the literature, while also likely making it more challenging for Smyth to solve the harder problems.
Leon [Kneuss et al. 2013] and Synquid [Polikarpova et al. 2016] complete program sketches from logical specifications. Leon synthesizes Scala functions given pre-conditions and post-conditions by combining a term generation system, a verifier, and a conditional abduction algorithm to generate recursive program fragments. Synquid accepts logical specifications in the form of refinement types of the target program. Synquid employs a Myth-like proof search procedure, relying on an external SMT solver to perform refinement type checking. These logical-specification-based synthesizers can be applied to PBE tasks by encoding input-output examples as a conjunction of propositions. However, the underlying techniques are not necessarily tailored to specifications of such structure. For example, Synquid requires the input specification to be inductive, which is similar to the trace-completeness requirement. See [Lubin et al. 2020] for experiments comparing the performance of Leon and Synquid applied to PBE tasks versus dedicated PBE synthesizers.
SyRup uses version space algebra to hypothesize pairs of recursive programs and execution traces, using the consistency of a program with a trace to guide progress [Yuan et al. 2023]. In testing SyRup, we found that it solves the bool and nat benchmarks but only one list and one tree benchmark. SyRup relies on having some trace-complete examples to do well; with only randomly generated examples (our scenario), [Yuan et al. 2023] showed that SyRup does not consistently outperform Smyth. Overall, SyRup's different goals (minimizing the number of examples while requiring some trace completeness) make a fair comparison with Para difficult.
Stochastic Search. Stoke [Schkufza et al. 2013] applies stochastic MCMC search to the superoptimization task for 64-bit x86 instruction sequences. Stoke synthesizes optimized versions of a given target program under some performance metric. Effective application of the approach is limited to loop-free programs. While our cost function is different, we also measure the differences between the output of a candidate program and the desired output.
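As an illustration of an output-distance cost of the general kind both systems use (this sketch is ours for exposition only; Para's actual cost function is defined in Section 3.1.2), one can count structural mismatches between a candidate's output and the desired output, with nested tuples standing in for algebraic data values:

```python
def distance(a, b):
    # Count structural mismatches between two nested-tuple values:
    # identical values cost 0; tuples are compared element-wise plus a
    # penalty for differing lengths; any other mismatch costs 1.
    if a == b:
        return 0
    if isinstance(a, tuple) and isinstance(b, tuple):
        d = abs(len(a) - len(b))
        return d + sum(distance(x, y) for x, y in zip(a, b))
    return 1

print(distance((1, 2, 3), (1, 9, 3)))  # 1: one wrong element
print(distance((1, 2), (1, 2, 3)))     # 1: one missing element
```

A stochastic search would sum this distance over all input-output examples and prefer candidate edits that reduce the total.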
Genetic Programming (GP) [Koza 1994] stochastically evolves a program to improve its fitness for a particular task. GP has been applied to evolve general recursive functional expressions that satisfy given input-output examples [Moraglio et al. 2012; Nishiguchi and Fujimoto 1998; Wong and Leung 1996]. However, evolving non-trivial general recursive functions using GP has proven difficult in practice [Agapitos and Lucas 2006; Agapitos et al. 2017; Alexander and Zacher 2014].
In [Yu 2001; Yu and Clack 1998], typed programs are evolved in primitive-recursive form for lists using λ-abstractions and a fold operator. In [Binard and Felty 2008], programs in System F [Girard 1971, 1972; Reynolds 1974], a total (always terminating) functional language with polymorphic types, are evolved using similar search techniques. System F includes programs with much higher time complexity than primitive recursive functions. Experimental evaluation of the above approaches is limited, and it is unclear how well they apply to a broad class of programming problems.
In [Swan et al. 2019], GP-based stochastic synthesis of implicitly recursive programs using recursion schemes is explored by fixing a catamorphism [Meijer et al. 1991] template at the top level. Ant programming or random search is used to synthesize expressions for each constructor case in the template. Because only a single catamorphism combinator at the top level is used, we expect this approach to have issues similar to the aforementioned systems with top-level-only general recursion when applied to difficult problems.
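For reference, a list catamorphism template of the kind fixed at the top level in [Swan et al. 2019] can be sketched in Python (our own illustrative encoding, with one hole per constructor case and tuples standing in for lists):

```python
def cata(nil_case, cons_case, xs):
    # Catamorphism (fold) over a list: `nil_case` is the value for the
    # empty list; `cons_case` combines the head with the recursive result.
    if not xs:
        return nil_case
    return cons_case(xs[0], cata(nil_case, cons_case, xs[1:]))

# Filling the two holes yields concrete programs:
length = lambda xs: cata(0, lambda x, r: 1 + r, xs)
total  = lambda xs: cata(0, lambda x, r: x + r, xs)
print(length((7, 8, 9)), total((7, 8, 9)))  # 3 24
```

The search then only has to discover the two non-recursive case expressions, which is the same division of labor Para exploits, but with a single fixed combinator rather than systematically generated nested templates.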
Synthesis of Auxiliary Functions. [Eguchi et al. 2018] extends Synquid to synthesize functional programs with recursive auxiliary functions. Their approach employs top-level templates with auxiliary function holes. The refinement types of the holes are inferred from the top-level refinement type of the target program, and then Synquid is used to find instantiations of the auxiliary functions. The templates use a restricted syntax that is less flexible than our template syntax in terms of where holes are allowed to appear; for example, the argument of a pattern matching construct is not allowed to contain holes. They employ two predefined top-level templates, a fold-like (catamorphism) template and a divide-and-conquer template, rather than systematically generated templates. These templates are not nested, so at most two levels of recursion are possible (one from the top-level template and one from auxiliary functions synthesized by Synquid).
Cypress [Itzhaky et al. 2021] synthesizes imperative heap-manipulating programs with auxiliary recursive procedures. Cypress extends a deductive synthesis framework with cyclic proofs and uses abduction during proof search to discover potential applications of cyclic reasoning. Recursive auxiliary procedures naturally arise from cyclic derivations, with backlinks in the derivations corresponding to recursive calls. Because of the very different target programs and synthesis techniques, the relationship, if any, between Cypress and our work is unclear.

CONCLUSIONS AND LIMITATIONS
We have presented a new program synthesis technique for recursive functions over algebraic data types based on paramorphisms. By splitting the synthesis problem into selecting a skeleton of nested paramorphisms with holes and synthesizing non-recursive terms to fill the holes, we are able to reuse simple and effective stochastic search techniques to synthesize complex recursive programs. We have shown by experiment that an implementation of our approach is able to synthesize all the problems handled by the current state of the art, as well as substantially harder problems.
Our method is not without limitations. Primitive recursion, while well-matched to algebraic data types, is not as expressive as general recursion. Some programming patterns, such as worklist algorithms on graphs, would be awkward or even impossible to express with paramorphisms. We have shown, however, that the benchmark examples used for recursive program synthesis in the literature are primitive recursive. Thus, it appears that practical synthesis methods for problems that truly require general recursion have yet to be developed, and whether such methods exist remains an open question.
data ListNat = Nil | Cons Nat ListNat
data TreeNat = Leaf | Node Nat TreeNat TreeNat
In the definition of the Cons constructor of ListNat the argument of type ListNat is last. Similarly, in the Node constructor of TreeNat the two recursive constructor arguments of type TreeNat are also listed last. We use the notation C x1 ... xm r1 ... rn for pattern matching, where x1 ... xm should be understood to match the non-recursive constructor arguments of C and r1 ... rn match the recursive constructor arguments. For simplicity we consider only monomorphic algebraic data types. Our implementation supports polymorphic type declarations by instantiating them with monomorphic types during type checking.
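The paramorphism combinator for a type such as ListNat can be sketched in Python as follows (an illustrative encoding of ours, with tuples standing in for lists; it is not Para's internal representation). The distinguishing feature is that the Cons case receives the unconsumed tail in addition to the head and the recursive result, so functions that inspect the tail directly, such as computing all suffixes, are naturally expressible:

```python
def para(nil_case, cons_case, xs):
    # Paramorphism over a list: the Cons case receives the head, the
    # remaining tail, and the recursive result on that tail. The extra
    # tail argument is what distinguishes it from a plain fold.
    if not xs:
        return nil_case
    return cons_case(xs[0], xs[1:], para(nil_case, cons_case, xs[1:]))

# `suffixes` uses the tail argument directly:
suffixes = lambda xs: para((), lambda x, tail, r: ((x,) + tail,) + r, xs)
print(suffixes((1, 2, 3)))  # ((1, 2, 3), (2, 3), (3,))
```

Dropping the tail argument recovers an ordinary fold, so paramorphisms subsume the catamorphism pattern while remaining primitive recursive.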

Fig. 11.Success rate by required recursion depth for Para

Table 2. Results with higher-order inputs