Synthesizing Efficient Memoization Algorithms

In this paper, we propose an automated approach to finding correct and efficient memoization algorithms from a given declarative specification. This problem poses two major challenges: (i) a memoization algorithm is too large to be handled by conventional program synthesizers; (ii) we need to guarantee the efficiency of the memoization algorithm. To address these challenges, we structure the synthesis of memoization algorithms by introducing the local objective function and the memoization partition function, and reduce the synthesis task to two smaller, independent program synthesis tasks. Moreover, the number of distinct outputs of the function synthesized in the second task determines the efficiency of the synthesized memoization algorithm, so we only need to minimize the number of distinct output values of that function. However, the generated synthesis tasks are still too complex for existing synthesizers. Thus, we propose a novel synthesis algorithm that combines deductive and inductive methods to solve these tasks. To evaluate our algorithm, we collect 42 real-world benchmarks from Leetcode, the National Olympiad in Informatics in Provinces-Junior (a nationwide algorithmic programming contest in China), and previous approaches. Our approach successfully synthesizes memoization algorithms for 39/42 problems in a reasonable time, outperforming the baselines.


INTRODUCTION
Combinatorial Problems (CPs). CPs are an essential category of problems that concern a discrete set of solutions [Schrijver 2003], such as the 0-1 knapsack problem [Kellerer et al. 2004] or the longest common subsequence problem [Maier 1978]. They have important applications in various domains. In general, CPs come in three major types:
• (Decision) Finding any valid solution.
• (Optimization) Searching for the best valid solution.
• (Counting) Counting the number of valid solutions.
Difficulty of Solving CPs. These problems are generally difficult to solve because the number of candidate solutions is tremendous. As a result, we cannot simply enumerate every solution and must design specialized algorithms to solve CPs efficiently.
Dynamic Programming (DP) and Memoization. DP is a powerful and widely used algorithmic approach to solving CPs efficiently [Aho and Hopcroft 1974; Cohen 1983; Cormen et al. 2009]. To design a DP algorithm, we first need to discover hidden structures in the given CP, which requires human insight. Once the hidden structures are discovered, DP can be implemented in a top-down or bottom-up style; the top-down approach is also termed memoization. Intuitively, it stores the results of previous invocations and reuses them when a cached invocation is called again. Typically, it exploits the structure of the costly enumeration algorithm and can be obtained by modifying the enumeration procedure.
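The caching idea can be illustrated on a minimal, standard example (Fibonacci numbers; this sketch is illustrative only and is not part of the paper's benchmarks):

```python
mem = {}  # cache of solved subproblems, as in a memoization algorithm

def fib(n):
    if n in mem:          # reuse the result of a cached invocation
        return mem[n]
    mem[n] = n if n < 2 else fib(n - 1) + fib(n - 2)
    return mem[n]

print(fib(30))  # 832040; without the cache this would take millions of calls
```

The recursive structure of the enumeration is unchanged; only the cache lookup and store are added, which is exactly how memoization modifies an enumeration procedure.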
Our result. Motivated by the importance of CPs and dynamic programming, we propose an automated approach to synthesizing efficient memoization algorithms in this paper. We implement our approach in the tool SynMem. SynMem requires the user to provide a high-level declarative specification of CPs. The user needs to encode the solution as variables and the validity condition of the solution as logic constraints. For optimization problems, the user needs to provide an additional objective function over the variables. Such a specification is widely considered natural for specifying CPs [Adelsberger 2003; Barták 1999]. The output of SynMem is a program representing the synthesized memoization algorithm, which is guaranteed to run in pseudo-polynomial time.
Existing approaches. There are existing approaches for synthesizing DP algorithms. However, to the best of our knowledge, all of them target a specialized subclass of dynamic programming algorithms and cannot solve many classic problems. For example, the approach of Pu et al. [2011] only supports DP algorithms that use a fixed number of scalar variables for memoization and thus cannot solve the classic 0-1 knapsack problem [Kellerer et al. 2004]; the approach of Lin et al. [2021] does not allow constraints over individual components of the solution and thus cannot solve the classic longest increasing subsequence problem [Schensted 1961]. More details can be found in Section 7.
Challenges. Hence, it is worth considering how to automatically synthesize a memoization algorithm from declarative specifications, applicable to a variety of combinatorial problems. However, there are two major challenges.
• Scalability. A memoization algorithm is large and is beyond the scalability of existing general program synthesizers.
• Efficiency. Most existing program synthesis approaches are designed for functional correctness and do not guarantee the efficiency of the generated program.
Key insights. We take two novel steps to address these challenges.
Step 1. First, we structure the synthesis of memoization algorithms by introducing two concepts, namely the local objective function and the memoization partition function. Our structuring incorporates a significantly broader class of memoization algorithms than previous works [Lin et al. 2021; Pu et al. 2011] and captures two fundamental properties of dynamic programming: optimal substructures and overlapping subproblems [Cormen et al. 2009].
By this structuring and a sequence of reductions, we reduce synthesizing memoization algorithms to two independent tasks of inductive quantified relational program synthesis [Wang et al. 2018], namely, synthesizing a local objective function (LOF) and synthesizing a memoization partition function (MPF). Each task synthesizes a smaller program fragment. After solving both tasks, we obtain a complete memoization algorithm whose efficiency is controlled by the range of the MPF. The advantages of this step are two-fold.
• First, both tasks produced by this step target a smaller program fragment. Furthermore, the independence between the tasks enables us to solve them separately, yielding an efficient synthesis procedure. This addresses the scalability issue.
• Second, it is hard to synthesize a large program with efficiency guarantees. We reduce controlling the efficiency of the whole memoization algorithm to minimizing the range of the function synthesized in the second task. This addresses the efficiency issue.
Step 2. The specifications of the two synthesis tasks are too complex to be handled by the state-of-the-art solver for relational program synthesis [Wang et al. 2018]. Thus, we propose a new synthesis algorithm combining inductive and deductive methods. We also use a heuristic method to optimize the range of the program synthesized for the second task. The key insight of this step is two-fold.
• First, the inductive methods simplify the specification so that it only contains basic operators, enabling a simple but effective deductive system.
• Second, we apply the deductive methods to bypass the synthesis of a large proportion of unknown functions, reducing the original synthesis task to a simpler task that conventional program synthesizers can solve.
Evaluation.To evaluate SynMem, we create a benchmark of 42 declarative specifications of CPs.
In detail, we consider the top 36 dynamic programming problems on Leetcode [lee [n. d.]] that appear most frequently in real interviews, all four dynamic programming problems in the National Olympiad in Informatics in Provinces-Junior (a nationwide programming contest in China) [NOI [n. d.]] over the past ten years, and all benchmarks from the previous approach [Pu et al. 2011]. SynMem successfully solves 39/42 (92.8%) benchmarks in a reasonable time, outperforming the baselines [Pu et al. 2011; Solar-Lezama et al. 2006].
Contributions.To summarize, our work has the following contributions.
• We structure the synthesis of memoization algorithms by capturing their essence. Our structuring is more general than previous works [Lin et al. 2021; Pu et al. 2011]. By a sequence of reductions, we obtain two independent synthesis tasks, which we can solve separately. Furthermore, we can synthesize an efficient MA by minimizing the range of the function to be synthesized in the second task.
• We propose dedicated algorithms to solve the two synthesis tasks effectively. The key novelty of our algorithm is the combination of inductive and deductive methods, taking advantage of both types of approaches.
• We create a benchmark of 42 declarative specifications of CPs and evaluate SynMem. The evaluation results show that SynMem can effectively find efficient memoization algorithms compared to baseline synthesis approaches [Pu et al. 2011; Solar-Lezama et al. 2006]. Due to space limits, we relegate the full version to the authors' webpage [ful [n. d.]].

OVERVIEW
This part illustrates SynMem via running examples. We provide formal treatments in Sections 4 and 5. Below, we first illustrate the specification and the synthesis goal in Section 2.1. Then, we overview SynMem using the classic 0-1 knapsack problem in Sections 2.2-2.6.

The Specification and The Synthesis Goal
Specification. SynMem accepts purely symbolic declarative specifications of CPs as input, usually written in a constraint modeling language such as MiniZinc [Nethercote et al. 2007]. In these modeling languages, the user only needs to specify the intention of the problem instead of a detailed algorithm for solving it. A typical CP specification consists of four parts. Below, we introduce the four parts with a classic optimization problem, the 0-1 Knapsack Problem (KP), as shown in Figure 1. An instance of KP consists of a knapsack with capacity C and a set of items. Each item has a weight and a value. The goal is to choose a subset of items with the maximum total value such that the total weight of the selected items is no more than the capacity C.
1. Inputs. The first part specifies the input parameters of the problem. Assigning different values to the parameters produces different instances of the problem. For KP, we use n for the number of items and two arrays, weight and value, for the weights and values of the items, respectively. Finally, we use C for the knapsack capacity, i.e., the total weight of the selected items cannot exceed C.

2. Solutions.
The second part specifies the solution space. In SynMem, a solution consists of several arrays of atomic variables; the lengths of the arrays and the domain of each atomic variable are finite and depend on the input parameters. In addition, the domain of each atomic variable is a bounded interval of integers. For simplicity, we first consider the special case where only one array exists, and discuss how to handle multiple heterogeneous arrays later. In KP, the solution is a single array p where each element p[i] is an atomic variable over 0..1, i.e., a Boolean variable. Variable p[i] represents whether to put the i-th item into the knapsack.

3. Constraints.
The third part specifies the constraints over the solution space, defining the validity of a solution. In KP, we require the total weight of the chosen items to be no larger than C.

4. Objective.
For an optimization problem, the fourth part starts with solve maximize and specifies an objective function that assigns an objective value to each solution. In KP, the objective function returns the total value of the chosen items. Given the four parts, the goal of an optimization problem is to find the maximum objective value that any valid solution might have and return that value. A decision problem or a counting problem is specified by supplying the statement solve satisfy; or count satisfy; in the fourth part, where the goal is to find any valid solution or to count the number of valid solutions, respectively.
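The semantics of such a four-part specification can be paraphrased as a brute-force reference procedure. The sketch below (hypothetical code, using the paper's parameter names n, C, weight, value but Python's 0-indexed arrays) enumerates the solution space, filters by the constraint, and maximizes the objective:

```python
from itertools import product

def solve_kp_spec(n, C, weight, value):
    """Brute-force semantics of the KP specification: enumerate all
    solutions p in {0,1}^n, keep the valid ones, maximize the objective."""
    best = float("-inf")
    for p in product((0, 1), repeat=n):
        # Constraint part: total weight of chosen items must not exceed C.
        if sum(weight[i] * p[i] for i in range(n)) <= C:
            # Objective part: total value of chosen items.
            best = max(best, sum(value[i] * p[i] for i in range(n)))
    return best

print(solve_kp_spec(3, 4, [2, 3, 4], [3, 4, 5]))  # best single item fits: 5
```

This is only the declarative meaning of the spec; the point of SynMem is to synthesize an algorithm far more efficient than this enumeration.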
Synthesis goal. SynMem aims to synthesize a correct and efficient memoization algorithm (MA) for the given CP. For an optimization problem, the synthesized MA takes the concrete parameters of the problem as input and produces the optimal value of the objective function for this concrete problem instance as output. The algorithm should efficiently obtain the correct result for every concrete instance. The synthesis result for KP is presented in Figure 6.

Structuring MAs
This part gives our structuring of MAs. An MA caches and reuses the results of solved subproblems. To design MAs, we first need to define the notion of subproblems.
Subproblems. Let us first consider using exhaustive search to solve this problem, as shown in Figure 2. The search algorithm recursively enumerates the values of p[1], ..., p[n]. By enumerating every valid solution, it obtains the maximum objective value. In the program, we use a structure atom_var_info to store the meta information of the atomic variables, and p_info[i].dom stores the domain of variable p[i], which is 0..1 in this case.
The main procedure Search recursively solves a subproblem of the original combinatorial problem; the definition of a subproblem is shown in Figure 3. A subproblem is a generalized version of the original combinatorial problem. Compared with the original problem, a subproblem has an additional parameter i which separates the original solution array p[1:n] into two parts: (i) the prefix p[1:i-1] of enumerated atomic variables, which is in the input part; (ii) the suffix p[i:n] of unknown variables to be explored, which is in the solution part. Note that when i = 1, the subproblem coincides with the original combinatorial problem.
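The recursive structure of Search can be sketched as follows (a simplification of Figure 2, hypothetical and 0-indexed: the prefix of already-enumerated variables is passed explicitly, and i is its length):

```python
def search(prefix, n, C, weight, value):
    """Solve the subproblem where the prefix is fixed and the remaining
    variables p[i:n] are still to be enumerated; return the best objective."""
    i = len(prefix)
    if i == n:  # leaf subproblem: one complete solution
        w = sum(weight[j] * prefix[j] for j in range(n))
        v = sum(value[j] * prefix[j] for j in range(n))
        return v if w <= C else float("-inf")  # invalid solutions score -inf
    # Non-leaf: branch on both assignments to p[i], take the maximum.
    return max(search(prefix + [0], n, C, weight, value),
               search(prefix + [1], n, C, weight, value))

print(search([], 3, 4, [2, 3, 4], [3, 4, 5]))
```

The call with an empty prefix (i = 0 here, i = 1 in the paper's 1-indexed notation) is exactly the original problem, matching the subproblem definition above.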
Figure 4 shows, on an example, how the subproblems are recursively solved by the exhaustive search. Each non-leaf subproblem generates two child subproblems by considering the different assignments to p[i], and the maximum of their objective values is returned. Each leaf subproblem corresponds to one solution, and returns either the objective value, if the solution is valid, or −∞ otherwise.
Gaps for memoization. The efficiency bottleneck of the exhaustive search is that the number of subproblems is 2^n. An MA accelerates this process by caching the results of all solved subproblems and reusing a cached result when an equivalent subproblem is encountered. Typically, two subproblems are considered equivalent if they have:
• (M1) the same suffix of unknown variables,
• (M2) equivalent objective functions, and
• (M3) equivalent constraints,
where two objective functions are considered equivalent if they map the same solution to the same objective value, and two constraints are considered equivalent if they define the same set of valid solutions.
To form an efficient MA, we need to ensure that (1) many pairs of equivalent subproblems exist, and (2) the equivalence between subproblems can be determined efficiently, without solving the subproblems. Now we analyze the three conditions above. First, consider condition (M1). Note that the number of subproblems is 2^n, but the number of suffixes is only n. Thus, many pairs must have the same suffix. Also, condition (M1) is easy to check: any pair of subproblems with the same i fulfills this condition. However, the remaining two conditions reflect two gaps in reusing the subproblems in the exhaustive search.
(G1) The objective function of each subproblem depends on the enumerated variables. Since different subproblems have different assignments to the enumerated variables, the objective functions of two subproblems are unlikely to be equivalent.
(G2) Though many pairs of subproblems satisfy the third condition (M3), since any two subproblems with the same remaining knapsack capacity have equivalent constraints, and the knapsack capacity C is usually assumed to be much smaller than 2^n, it is costly to check whether the constraints of two subproblems are equivalent.
To close these gaps, our approach introduces two novel concepts, namely the local objective function (LOF) and the memoization partition function (MPF). The local objective function replaces the original objective function in the subproblems and does not depend on the enumerated variables. The memoization partition function allows us to easily check the equivalence of constraints between subproblems. Below we describe the two functions in detail.
Local Objective Function (LOF). To overcome (G1), we need to make the objective function of the subproblems independent of the enumerated variables. SynMem introduces the LOF, which does not depend on the enumerated variables, and replaces the original objective function with the LOF. For any subproblem, the ranking of the solutions should not change when the objective function is replaced by the LOF. In addition, the LOF should return the same value as the original objective function on the subproblem where i = 1 (i.e., the original problem).
Since the LOF does not depend on the enumerated variables, any subproblems with the same i have equivalent LOFs. Thus, by replacing the objective function with the LOF, condition (M2) for equivalent subproblems is implied by (M1), enabling a higher chance of reuse and an efficient equivalence check.
For KP, an LOF can be the second half of the objective function in Figure 3, i.e., the total value of the items chosen among p[i:n].
Since the search program recursively solves subproblems, we need to derive two components from the LOF to adapt the search program to return local objective values.
• An initial local objective value L_leaf for the leaf subproblems whose corresponding solution is valid. For KP, the initial value is 0, as there is no unknown variable.
• An update function L_upd for calculating the local objective value of a solution in a parent subproblem from that of a solution in a child subproblem. For KP, L_upd can be L + value[i] · p[i], where i is the input parameter of the parent subproblem and L is the optimal LOF value of the child subproblem.
Memoization Partition Function (MPF). To overcome (G2), we need to quickly check the equivalence of the constraints between two subproblems (condition (M3) above). SynMem introduces the MPF, a function over the set of enumerated variables and the input of the original specification. If the MPF maps two subproblems with the same parameter i to the same value, they share equivalent constraints, which enables efficient checking. SynMem considers MPFs whose output is a tuple of scalars, which is common for MAs. Considering more versatile MPFs is left for future work. For KP, an MPF can be the total weight of the currently selected items, as follows.
∑_{j=1}^{i−1} weight[j] · p[j]
However, calculating the MPF directly is still costly, as we need to scan over all enumerated variables. To further optimize the algorithm, we calculate the MPF incrementally along a sequence of recursions. Similar to the case of the LOF, we derive two components from the MPF.
• An initial MPF value M_init for the first subproblem, where i = 1. For KP, the initial value is 0, as there is no known variable.
• An update function M_upd for calculating the MPF value of a child subproblem from that of its parent subproblem. For KP, M_upd adds the weight contribution of the newly enumerated variable to M, where i is the input parameter of the child subproblem and M is the MPF value of the parent subproblem.
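The incremental computation can be checked against the direct definition on a small instance. This sketch is hypothetical, 0-indexed code (so the update adds weight[i] · p[i] when descending from position i):

```python
def mpf(i, prefix, weight):
    """Direct MPF for KP: total weight of the items selected in prefix[0:i]."""
    return sum(weight[j] * prefix[j] for j in range(i))

def m_upd(i, M, p_i, weight):
    """Incremental update: MPF of the child from the MPF of the parent."""
    return M + weight[i] * p_i

# Verify on one concrete prefix that the incremental values agree with
# the direct definition, starting from M_init = 0.
weight = [2, 3, 4]
prefix = [1, 0, 1]
M = 0  # M_init
for i, bit in enumerate(prefix):
    M = m_upd(i, M, bit, weight)
    assert M == mpf(i + 1, prefix, weight)
print(M)  # total weight of the selected items
```

Each update costs O(1), replacing the O(i) scan of the direct definition.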
The Memoization Algorithm.Based on the LOF and MPF, we can implement an MA for KP.
Figure 6 shows a program synthesized by SynMem. For presentation purposes, the program is given as the synthesis result of a sketch. This program follows the same recursive search process as the exhaustive search and introduces a map mem to store the results of solved subproblems (Line 3). For non-leaf subproblems, the algorithm first queries mem and returns if a cached result of an equivalent subproblem is found (Lines 12-13). Map mem takes the input parameter i and the MPF value M as the key, which correspond to conditions (M1)-(M2) (note that with the LOF, (M2) is implied by (M1)) and (M3), respectively. The search procedure returns the maximum local objective value rather than the original objective value and updates the LOF and MPF values incrementally using L_upd and M_upd.
Figure 5 shows how the MA executes on the same example as Figure 4. With the LOF and the MPF, many subproblems can be identified as equivalent. Note that in each equivalence class of subproblems, we only need to search for the result of one subproblem; we can reuse this result for the other subproblems in the class. For MAs, the number of equivalence classes (and thus the number of subproblems to be searched) is bounded by the domain of the mem map, which is of pseudo-polynomial size O(n · ∑_{i=1}^{n} weight[i]). Furthermore, L_upd and M_upd are executed in O(1) time, ensuring that a single execution of Search is also efficient. Thus, the MA has pseudo-polynomial complexity and is significantly more efficient than the exhaustive search with its exponential time complexity.

Fig. 6. The generated search sketch and the synthesis result for KP. The box in the upper right corner lists the synthesized expressions for L_leaf, M_init, L_upd, and M_upd. We can plug these expressions into the sketch to obtain the complete synthesized MA for KP.
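Putting the pieces together, the MA for KP can be sketched in Python (a hypothetical, 0-indexed simplification of Figure 6, not SynMem's generated code): mem is keyed on (i, M), M is updated by M_upd, and the search returns local objective values combined by L_upd.

```python
def memoized_kp(n, C, weight, value):
    """Memoization algorithm for KP: cache keyed on (i, MPF value M)."""
    mem = {}

    def search(i, M):
        if i == n:
            # L_leaf = 0 for valid leaves; invalid solutions score -inf.
            return 0 if M <= C else float("-inf")
        if (i, M) in mem:  # reuse the result of an equivalent subproblem
            return mem[(i, M)]
        best = float("-inf")
        for p_i in (0, 1):
            # M_upd gives the child's MPF value; L_upd adds value[i]*p_i
            # to the child's optimal local objective value.
            sub = search(i + 1, M + weight[i] * p_i)
            best = max(best, sub + value[i] * p_i)
        mem[(i, M)] = best
        return best

    return search(0, 0)  # M_init = 0

print(memoized_kp(3, 4, [2, 3, 4], [3, 4, 5]))
```

The number of distinct keys (i, M) is at most n times the total weight, giving the pseudo-polynomial bound discussed above.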
Finally, we remark on the connection between our structuring and the two fundamental properties required for dynamic programming.
• Optimal substructures requires that a subpart of an optimal solution is still optimal for the subproblem concerning this subpart. In other words, a subproblem should be independent of the enumerated variables. The LOF and the MPF jointly ensure this property: the LOF removes this dependency in the objective function, and the MPF ensures that two subproblems with the same parameter i and the same MPF value have equivalent constraints. Thus, the enumerated variables are not needed to check the validity of a solution when the MPF value is available.
• Overlapping subproblems requires that many equivalent subproblems exist. The more subproblems overlap, the smaller the number of equivalence classes of subproblems is, and the more efficient the dynamic programming algorithm is. As discussed above, this number is fully determined by the number of atomic variables and the range of the MPF.

Overview of The Synthesis Procedure
The synthesis of MAs is non-trivial. As mentioned in the introduction (Section 1), there are two major challenges: scalability and efficiency. An MA is usually a large program with complex loops or recursive calls and is beyond the reach of existing general-purpose program synthesizers [Ji et al. 2021; Lee and Cho 2023; Miltner et al. 2022]. Furthermore, standard synthesis frameworks such as SyGuS do not require efficiency, and it is not easy to ensure the efficiency of the synthesized MAs.
To address the scalability challenge, we follow our structuring as follows. Synthesizing an MA can be seen as filling a search sketch (Figure 6) that follows the exhaustive search. With the search sketch generated, the scale of the remaining components (the box in the upper right corner of Figure 6) is much smaller. Thus, SynMem first generates the search sketch. Furthermore, SynMem generates the exhaustive search program as the reference implementation to verify the correctness of the synthesized MA (Section 2.4). To complete the search sketch, we note that the existing sketch solver [Solar-Lezama et al. 2006] does not scale to synthesizing MAs (Section 6). Thus, we further follow our structuring as follows.
The key to designing an MA is to find the two functions, LOF and MPF, which capture different and independent aspects of the MA. As a result, SynMem converts the sketch synthesis problem into two independent problems of synthesizing the LOF and the MPF. Furthermore, the smaller the range of the MPF, the more efficient the MA. Thus, SynMem synthesizes an efficient MA by minimizing this range. After synthesizing the LOF and the MPF, SynMem derives the corresponding initial values and update functions to fill the sketch (Section 2.5). Finally, the specifications of the two synthesis problems are relational and too complex to be solved by existing approaches. Thus, we introduce a new synthesis algorithm that combines inductive and deductive methods. The inductive method removes complex higher-order operators, facilitating deductive analysis. The deductive method applies term rewriting to bypass the synthesis of a large proportion of unknown functions. We also introduce a heuristic to control the range of the synthesized MPF to ensure efficiency (Section 2.6).

Generating the Search Sketch and the Exhaustive Search Program
Generating Process. We generate the exhaustive search and its sketch following a template-based method, where most of the code is pre-written and only a few components are generated to adapt to different combinatorial problems. We have seen the search sketch for KP in Figure 6. Most of the search sketch is fixed and can be pre-written, and we only need to generate a few components, all of which can be easily deduced from the specification. Specifically, we need to (i) deduce the size of the solution space and generate the conditional expression in line 8, (ii) generate the conditional expression in line 9 by copying the constraints, (iii) generate line 1 by copying the inputs, (iv) generate the procedure for reading the inputs (line 21), and (v) generate the procedure for initializing p_info. The exhaustive search program in Figure 2 can be generated similarly.
Multiple Arrays and Enumeration Orders. The search sketch above only covers the case where there is a single array in the solution part. Below, we illustrate the ideas for extending the search sketch to the case where the solution part contains multiple arrays. In this case, we need to enumerate the atomic variables in all arrays. Therefore, we can treat the solution space generically as a sequence of atomic variables, formed by concatenating all the arrays, where the domain of each variable is recorded in p_info.
Since the order of enumerating different arrays may affect the synthesis result, SynMem considers the permutations of all arrays and generates a search sketch for each permutation. Furthermore, sometimes the specification contains multiple arrays of the same length, which are expected to be enumerated together. SynMem also tries zipping the arrays of the same length and enumerates all atomic variables in the same tuple of the zipped array simultaneously.

Decomposing into Smaller Synthesis Problems
After generating the exhaustive search algorithm and the sketch, we aim to complete the search sketch so that the complete program is efficient and equivalent to the exhaustive search. Such a synthesis-from-reference-implementation problem [Farzan et al. 2022; Lee and Cho 2023; Miltner et al. 2022] is often addressed by the counter-example guided inductive synthesis (CEGIS) framework.
CEGIS. CEGIS proceeds in iterations. In each iteration, we synthesize an MA that produces the same output as the exhaustive search on a finite set of problem instances. Then, a verifier checks whether the synthesized MA is correct. If not, the verifier returns a counter-example; we add the counter-example to the set of problem instances and start a new iteration.
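The generic CEGIS loop can be sketched as follows. This is a toy illustration, not SynMem's implementation: the candidate pool, the target, and the bounded-testing verifier are all hypothetical stand-ins.

```python
def cegis(synthesize, verify, max_iters=100):
    """CEGIS skeleton: synthesize(examples) returns a candidate consistent
    with the examples; verify(candidate) returns None if correct, else a
    counter-example (input, expected output)."""
    examples = []
    for _ in range(max_iters):
        candidate = synthesize(examples)
        cex = verify(candidate)
        if cex is None:
            return candidate
        examples.append(cex)  # refine the inductive spec and iterate
    raise TimeoutError("CEGIS budget exhausted")

# Toy instance: recover the behavior of 2*x from a small candidate pool,
# verifying by bounded testing (as our implementation does).
pool = [("x", lambda x: x), ("x + x", lambda x: x + x), ("x * x", lambda x: x * x)]

def synthesize(examples):
    return next(p for p in pool if all(p[1](x) == y for x, y in examples))

def verify(cand):
    for x in range(-5, 6):
        if cand[1](x) != 2 * x:
            return (x, 2 * x)
    return None

found = cegis(synthesize, verify)
print(found[0])
```

Here the first candidate ("x") is rejected with a counter-example, which forces the synthesizer to return "x + x" in the next iteration.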
Our approach is general and can be used with any verifier. Our current implementation uses bounded testing, and exploring more sophisticated verification techniques [Badihi et al. 2020; Churchill et al. 2019] is an orthogonal problem left for future work.
In each CEGIS iteration, the task is to complete the search sketch over a set of instances. As discussed before, this problem is too complex for existing sketch synthesizers, so we first decompose it into the independent tasks of synthesizing the LOF and the MPF.
Specification for the LOF. Though only the initial values and the update functions are needed to fill the sketch, we synthesize the LOF first and then derive the two components from it, to reduce the difficulty of synthesis. Recall that the function LOF(i, p[i:n]) needs to satisfy two conditions: (1) returning the same value as the original objective function on the first subproblem, and (2) retaining the order of the solutions. Thus, we impose the following condition (C1) for every instance (n, C, weight, value) considered in the current CEGIS iteration and every 1 ≤ i ≤ n. Intuitively, this condition requires that the original objective value can be obtained from the local objective value and the enumerated variables.
We further require that ⊕_i is monotonically increasing with respect to its second parameter, i.e., v ⊕_i L increases as L increases, and that ⊕_1 directly returns its second parameter, i.e., [] ⊕_1 L = L. Note that (C1) is equivalent to the two conditions on the LOF, and we do not intend to synthesize ⊕_i, which is a ghost function variable and will be eliminated by a deductive approach later. After we have the LOF, we further impose conditions (C2)-(C3) to obtain L_upd and L_leaf.
Specification for the MPF. Similar to the LOF, we first synthesize the MPF and then its initial value and update function. Recall that if MPF(i, p[1:i−1]) returns the same value on two subproblems, the two subproblems have equivalent constraints. To model this property as a program synthesis task, we can equivalently rephrase it as follows: the property holds if and only if the constraint in the given CP can be transformed into a predicate that depends on the result of the MPF, but not on the enumerated variables, for validity checking. Therefore, we impose the following condition (D1) for every instance (n, C, weight, value) considered in the current CEGIS iteration and every 1 ≤ i ≤ n. Similarly, we do not intend to synthesize the ghost function variable ⊙_i. Furthermore, to find an efficient MA, we need to minimize the range of the MPF. We further impose conditions (D2)-(D3) to obtain M_upd and M_init.
Finally, we remark that n, C, weight, and value are visible to all functions to be synthesized above, but we omit this dependency for conciseness.
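For KP, the rephrased constraint property behind (D1) can be checked concretely: validity of a full solution depends only on the MPF value of the prefix and on the suffix, never on the individual prefix bits. The sketch below (hypothetical, 0-indexed code) verifies this exhaustively on a tiny instance.

```python
from itertools import product

def valid(p, C, weight):
    """Original KP constraint over a complete solution p."""
    return sum(weight[j] * p[j] for j in range(len(p))) <= C

def valid_via_mpf(M, suffix, i, C, weight):
    """Same constraint, rephrased as a predicate over the prefix's MPF
    value M and the suffix only (no individual enumerated variables)."""
    return M + sum(weight[i + j] * suffix[j] for j in range(len(suffix))) <= C

C, weight = 4, [2, 3, 4]
for p in product((0, 1), repeat=3):
    for i in range(4):  # every split point between prefix and suffix
        M = sum(weight[j] * p[j] for j in range(i))  # MPF of the prefix
        assert valid(list(p), C, weight) == valid_via_mpf(M, list(p[i:]), i, C, weight)
print("ok")
```

Consequently, two subproblems with the same i and the same M accept exactly the same suffixes, which is the equivalence of constraints that the MPF certifies.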

Solving the Synthesis Problems
Both synthesis problems, for the LOF and the MPF, are relational [Wang et al. 2018]. Since the number of unknown functions (⊕_i in (C1) and ⊙_i in (D1) for each 1 ≤ i ≤ n, plus the LOF and the MPF) is usually large, these two synthesis problems are beyond the reach of the previous approach to relational program synthesis [Wang et al. 2018].
Hence, we propose a novel synthesis algorithm for the two tasks. Our algorithm applies deductive term rewriting [Willsey et al. 2021] to bypass the synthesis of the ⊙_i's (⊕_i's) and reduce each task to a conventional SyGuS task. Below, we illustrate how to solve (C1)-(C3) and complete the holes L_leaf and L_upd in a single CEGIS iteration over the single instance presented in Figure 4. The procedure for (D1)-(D3) is similar; we sketch it at the end of this section.
Inductive Instantiation. We first address (C1) to obtain the LOF. This specification involves the ghost variables ⊕_i and a complex higher-order operator Σ. We first remove Σ to facilitate the deductive transformation in the next step, which removes ⊕_i. Since we only need to synthesize the LOF over a set of concrete problem instances in a CEGIS iteration, we can expand the Σ operator for each problem instance. Plugging the instance from Figure 4 into (C1), we obtain the following.
Here, all ghost function variables ⊕_2, ⊕_3 are monotone with respect to their second argument.
Deductive term rewriting. To remove the ghost variables ⊕_i, SynMem integrates a deductive term rewriting system. It systematically rewrites the LHS of (1)-(3) into equivalent forms, implicitly exploring different candidates for ⊕_i. Concretely, consider condition (2) above first. By rewriting the LHS of (2), we can deduce that, once we find a suitable LOF, we can choose ⊕_2 as 3 · p[1] + LOF(3, p[2:3]) to establish (2). Similarly, we can apply rewriting and choose ⊕_3 analogously. Note that ⊕_2 and ⊕_3 only involve the primitive operators + and ·. Thus, we can trivially check monotonicity, since x + y increases as y increases. In Figure 7, we present a possible resulting synthesis task for the LOF after rewriting (1)-(3).
Reduction to SyGuS. Figure 7 shows a conventional SyGuS specification [Alur et al. 2018]. In a SyGuS problem, we are given a domain-specific language (DSL) representing the whole program space and need to find a program that is correct on all inputs. Since the LOF admits an update function, it must be a program in structural recursion form. Thus, SynMem synthesizes the LOF in a language that includes compositions of typical list structural recursion operators (e.g., map, filter, sum). We use a basic bottom-up enumerative method to solve this task. After synthesizing the LOF, conditions (C2) and (C3) are conventional SyGuS tasks; thus, we assume an external SyGuS solver for the synthesis of the update function. In our implementation, we carefully restrict our DSL such that for any synthesized LOF, we can use syntactic transformations to derive its update function. In addition, the derived update function is guaranteed to run in O(1) time. For KP, we obtain L_upd(i, L, p[i]) = L + value[i] · p[i] and its initial value L_leaf = 0, completing the holes L_leaf and L_upd.
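A minimal bottom-up enumerative solver of the kind mentioned above can be sketched as follows. This is a toy over a tiny arithmetic grammar (variables, the constant 1, +, ·), not SynMem's list-recursion DSL; a program's size is 1 plus the sizes of its subterms.

```python
def bottom_up_synthesize(examples, max_size=3):
    """Bottom-up enumeration by size: grow (expr, semantics) pairs from
    terminals, check each against the input/output examples."""
    by_size = {1: [("x", lambda x, y: x), ("y", lambda x, y: y),
                   ("1", lambda x, y: 1)]}
    for size in range(1, max_size + 1):
        if size > 1:
            progs = []
            for lsz in range(1, size - 1):  # size = 1 (operator) + lsz + rsz
                rsz = size - 1 - lsz
                for le, lf in by_size.get(lsz, []):
                    for re_, rf in by_size.get(rsz, []):
                        progs.append((f"({le} + {re_})",
                                      lambda x, y, lf=lf, rf=rf: lf(x, y) + rf(x, y)))
                        progs.append((f"({le} * {re_})",
                                      lambda x, y, lf=lf, rf=rf: lf(x, y) * rf(x, y)))
            by_size[size] = progs
        for expr, sem in by_size[size]:
            if all(sem(*inp) == out for inp, out in examples):
                return expr
    return None

print(bottom_up_synthesize([((1, 2), 3), ((2, 5), 7)]))  # finds (x + y)
```

Practical bottom-up synthesizers additionally prune observationally equivalent programs (those agreeing on all example inputs) to keep the enumeration tractable.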
Synthesizing the MPF. The correctness conditions (D1)-(D3) for the MPF are very close to (C1)-(C3). Hence, we follow the same procedure to synthesize the MPF and derive its initial value and updating function. However, there are two differences.
First, we need to synthesize a function that outputs a tuple of scalars. We enumerate the number of components in the tuple and synthesize the function for each component. We reformulate this problem as a hitting set problem and apply pruning based on the max-degree bound [Bläsius et al. 2022] to solve it efficiently.
Second, we need to find an MPF with a minimized output range. We observe that applying a structural recursion function (e.g., sum, max) shrinks the range of the result; thus, the larger the synthesized program, the smaller the range of the MPF tends to be. Hence, we follow a heuristic that enumerates programs from large to small in the SyGuS problem. Applying the above procedure to KP, we obtain the MPF M and its initial value M_init = 0, completing the holes Minit and Mupd. Finally, we remark that although conditions (D1) and (D2) are syntactically similar, the synthesis of (D1) is a relational program synthesis task, while the synthesis of (D2) is a classic SyGuS task. Thus, SynMem treats these two conditions differently. Furthermore, note that SynMem is straightforwardly sound due to our insights into MAs, and it is complete (the reduction from Figure 6 to Figure 7 never excludes valid programs) if the set of equivalent expressions is recursively enumerable (Section 5.2).
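The range-shrinking observation can be illustrated with a small experiment. The sketch below is not part of SynMem: it merely counts the number of distinct outputs of two hypothetical partition functions over all 0/1 prefixes of a fixed length, showing that composing in a structural recursion operator (here, sum) collapses exponentially many prefixes into a linear number of keys.

```python
from itertools import product

def output_range(f, n):
    # number of distinct outputs of f over all 0/1 prefixes of length n
    return len({f(p) for p in product((0, 1), repeat=n)})

n = 10
# identity keeps every prefix distinct: exponential range
assert output_range(lambda p: p, n) == 2 ** n
# sum of the prefix: only n + 1 distinct keys
assert output_range(sum, n) == n + 1
```

The smaller the key range, the fewer equivalence classes of subproblems the memoization table has to hold.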

FORMALIZING THE SYNTHESIS TASK
Representing CPs. The essential attributes of our specification were illustrated in Section 2. Below, we present the language features in more detail. Please refer to the full version of this paper for a complete illustration. A typical CP specification consists of the following four parts.
1. Inputs. This part consists of the parameters of the problem. Each parameter is specified as an atomic value or an array of atomic values. The array length may depend on other parameters. The domain of each atomic value is a bounded integer interval l..r, whose endpoints l and r may also depend on other parameters. For simplicity, we only consider 1-dimensional arrays. Other data structures, such as multi-dimensional arrays or lists, can be converted to 1-dimensional arrays and fit into our framework.
2. Solution. The solution part consists of k arrays sol_1, . . ., sol_k of atomic variables. The array sol_i has indices p_i..q_i and the bounded interval l_i..r_i as the domain of its atomic variables, where p_i, q_i, l_i, and r_i depend only on the parameters. Again, we only consider 1-dimensional arrays.
3. Constraints. This part specifies the constraints on the solution space and defines the validity of a solution. When defining a constraint, SynMem supports common logical connectives (e.g., ∧, ∨, ¬, <, >, =) and common arithmetic operators (e.g., +, −, ×, max, min). SynMem also supports accumulators of the form ACC(v in l..r)(f(v)), where ACC ∈ {sum, min, max, . . .}, which iterates the fresh variable v over l..r and accumulates the results f(v). Furthermore, SynMem supports for-loops of the form forall(v in l..r)(expr(v)) to construct a list of constraints. Here, we also restrict the range l..r to depend only on the parameters.
4. Objective. This part specifies the type of the given CP, which has been illustrated in Section 2.1. Below, we define core concepts for the synthesis task.
Problem instances. Given the specification of a CP, a problem instance is a quadruple consisting of the following four components.
• The assignment to all parameters in the input part, where all array lengths are consistent with the values of the other parameters and all atomic values lie within their domains.
• The set V of atomic variables of the problem instance. Given the parameters, for every array sol_i in the solution part, its index range p_i..q_i and the domain l_i..r_i of each atomic variable are fixed. Concretely, each atomic variable sol_i[j] (p_i ≤ j ≤ q_i) in this array has the domain l_i..r_i. V collects the names and the domains of these variables.
• The set of constraints of the problem instance, obtained by substituting the parameters into the original constraints. We expand each for-loop forall(v in l..r)(expr(v)) and add every expr(v) (l ≤ v ≤ r) to the set.
• The objective of the problem instance. For COPs, it is obtained by substituting the parameters into the original objective function. For CCPs and CDPs, it is a single value indicating the CP type.
In the rest of the paper, we omit the parameter subscript for simplicity when no confusion arises.
Assignments. Given a specification and one of its problem instances, an assignment V• is a (partial) map over a subset of the atomic variables: it maps each atomic variable in this subset to a concrete value in that variable's domain. The assignment is termed total if it is a total map from V to Int, meaning we have fixed the values of all atomic variables; we write V_tot for a total assignment. Furthermore, for every assignment V•, substituting the assignment further into the constraints (resp. the objective) yields the constraints (resp. the objective) under V•. Note that for a total assignment V_tot, the constraints evaluate to either True or False, and for COPs the objective evaluates to the objective value of V_tot.
The synthesis goal. Given the specification, SynMem aims to synthesize a program that, for every choice of parameters, reads the parameters, constructs the corresponding problem instance, and outputs the correct value, as follows.
• For combinatorial optimization problems (COPs), the output is the maximum objective value of a valid total assignment, i.e., the maximum over all total assignments V_tot whose constraints evaluate to True.
• For combinatorial decision problems (CDPs), the output is whether a valid total assignment exists.
• For combinatorial counting problems (CCPs), the output is the number of valid total assignments.
Specifically, SynMem aims to synthesize an MA whose structure is defined in Section 4.
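The synthesis goal for a COP can be made concrete on a small instance. The sketch below is a hypothetical encoding of a 0/1 knapsack problem instance (parameter names weights, values, cap and the helper functions are illustrative, not SynMem syntax): the desired output is the maximum objective over valid total assignments.

```python
from itertools import product

# Hypothetical 0/1 knapsack instance: atomic variables sol[j] in 0..1,
# constraint: total chosen weight <= cap, objective: total chosen value.
weights, values, cap = [2, 3, 4], [3, 4, 5], 5

def constraint(tot):   # validity of a total assignment
    return sum(w * x for w, x in zip(weights, tot)) <= cap

def objective(tot):    # objective value of a total assignment
    return sum(v * x for v, x in zip(values, tot))

# Synthesis goal for a COP: max objective over valid total assignments.
best = max(objective(t) for t in product((0, 1), repeat=3) if constraint(t))
assert best == 7  # items 0 and 1: weight 5, value 7
```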

THE STRUCTURE OF THE MEMOIZATION ALGORITHMS
In this part, we illustrate the structure of the MAs. The main ideas were presented in Section 2.2; below, we provide a formal treatment. In our structuring, MAs are built upon exhaustive search algorithms, which enumerate all atomic variables in a given enumeration order. Thus, we first formally define the enumeration order.
Enumeration order. The enumeration order specifies the order of the atomic variables for every problem instance. Different orders yield different MAs. Thus, SynMem tries all enumeration orders in a prescribed space, whose syntax and semantics are presented in Figure 8. The space consists of all permutations of the arrays in the solution part. It also supports enumerating several arrays of the same length simultaneously. Note that zip may take a single argument, which then denotes the array itself. Within each array sol[p_i..q_i], the program enumerates from sol[p_i] to sol[q_i].
Given a problem instance, the semantics of an enumeration order is a list whose i-th element stores the names and the domains of the set of atomic variables enumerated at the i-th step.
E ::= zip(sol_1, . . ., sol_t), where sol_1, . . ., sol_t have the same length. We only consider enumeration orders E in which every array in the solution part appears exactly once. Such orders visit all atomic variables exactly once for every problem instance. Given the order E and a problem instance, we can rearrange the atomic variables into a list [V_1, . . ., V_m], where each V_i is the set of atomic variables to be enumerated at the i-th step.
Below, we present the template for the exhaustive search and its sketch. We fix the enumeration order E and a concrete problem instance.

The Exhaustive Search and Its Sketch
In our structuring, the exhaustive search and its search sketch follow the templates below.
Exhaustive search. The template for the exhaustive search is presented in the left half of Figure 9. It follows the order E (denoted V = [V_1, . . ., V_m]) of the atomic variables on the problem instance and recursively enumerates each V_i in the procedure search.
Consider the invocation search(i, V•), where i is the search stage, meaning that the exhaustive search will next enumerate the atomic variables V[i], and V• is the partial assignment to the already-enumerated atomic variables V[1 : i − 1]. Whenever the assignment V• is invalid, it immediately returns the invalid result obj_invalid (Lines 5-6). After enumerating all variables, it returns the objective value obj_val(V•) of the total assignment V• (Lines 7-8). Otherwise, it enumerates the assignments to V[i] and merges the results of all sub-procedure calls using the function obj_merge. The type of the given CP entirely determines the expressions obj_invalid, obj_val(V•), and obj_merge, as listed in Figure 9. Since the enumeration order visits each atomic variable exactly once, the following theorem is easy to see.

Theorem 4.2 (Soundness of the exhaustive search). Given any specification and any enumeration order E, the exhaustive search algorithm (Figure 9) matches the synthesis goal in Section 3.
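The recursion above can be sketched in Python. The block below is an illustrative rendering of the exhaustive-search template for a COP, instantiated on a small hypothetical 0/1 knapsack instance; obj_merge is max, obj_val is the objective of the total assignment, and obj_invalid is −∞. The names mirror the template but are not the paper's code.

```python
NEG_INF = float('-inf')
weights, values, cap = [2, 3, 4], [3, 4, 5], 5  # hypothetical instance
m = len(weights)

def search(i, assign):
    # obj_invalid: prune as soon as the partial assignment violates C
    if sum(w * x for w, x in zip(weights, assign)) > cap:
        return NEG_INF
    # obj_val: all variables enumerated, return the objective of V_tot
    if i > m:
        return sum(v * x for v, x in zip(values, assign))
    # enumerate V[i] and merge sub-results with obj_merge (= max for COPs)
    return max(search(i + 1, assign + [x]) for x in (0, 1))

assert search(1, []) == 7
```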
Subproblem. Given the enumeration order E and a problem instance, each invocation search(i, V•) can be identified with an intermediate subproblem, specified as follows.
• The atomic variables of the subproblem are V[i : m], the set of atomic variables still unknown at search stage i.
• The constraints of the subproblem are the original constraints under the partial assignment V•.
• The objective of the subproblem is the original objective under the partial assignment V•.
Search sketch. As discussed in Section 2.2, the subproblems in the exhaustive search do not enable efficient reuse. Thus, we define the template for the search sketch (Figure 9), which follows a recursive procedure similar to the exhaustive search but adds code and holes for the LOF and the MPF, which we illustrate in Sections 4.2 and 4.3.

Local Objective Functions
We introduce the LOF and its updating function into our structure of MAs to make the objective functions of different subproblems more likely to be equivalent. For CDPs and CCPs, since there is no objective function, the holes Lleaf and Lupd can be filled trivially, as shown in Figure 9. Below, we discuss only the case of COPs and present the formalization of the LOF and its updating function.
Local Objective Function (LOF). The LOF(i, V[i : m]) is a function independent of the set of enumerated atomic variables. For each subproblem, we replace its objective function with the LOF. We generalize condition (C1) in Section 2.5 and define the LOF by condition (4), where we further restrict each ⊕_i to be monotonically increasing with respect to its second argument. Moreover, for i = 1, we restrict ⊕_1 to act as the identity on its second argument.
Updating function of the LOF. We also need an updating function L_upd(i, L, V•_i) that obtains the LOF of a subproblem from those of its child subproblems, where the first parameter is the search stage, the second is the LOF value of the child subproblem, and the third is the assignment to the currently enumerated atomic variables V[i]. It must satisfy equation (5). In the template of the search sketch, we leave two holes for the LOF:
• Lleaf, which equals the LOF of the leaf subproblems, LOF(m + 1, []), and is applied in Line 16.
• Lupd, which equals the updating function L_upd(i, L, V•_i) and is applied in Line 22.

Memoization Partition Functions
We introduce the MPF and its updating function to quickly identify whether two subproblems have the same valid solution set.
Memoization Partition Function (MPF). The MPF(i, V[1 : i − 1]) is a function over the enumerated variables V[1 : i − 1] and the parameters of the original specification. It outputs a tuple of scalars. We generalize condition (D1) (Section 2.5) and define the MPF by condition (6).
Updating function of the MPF. We need an updating function M_upd(i, M, V•_i), whose first parameter is the search stage, whose second parameter is the MPF value of the parent subproblem, and whose third parameter is the assignment to the currently enumerated variables V[i]. In the template (Figure 9), we add a parameter W to the procedure search to track the MPF value, and we leave two holes:
• Minit, which equals the initial value MPF(1, []) and is applied in Line 32.
• Mupd, which equals the updating function for the MPF and is applied in Line 24.
Finally, we remark that the input parameters are visible to all functions above. We formalize the properties of our structuring as follows.
Theorem 4.3 (Correctness and Efficiency of Our Structure). Given any specification, if there are functions LOF, L_upd, MPF, M_upd satisfying conditions (4)-(7), then we can fill the holes Lleaf, Lupd, Minit, Mupd in the search sketch (Figure 9), deriving an MA such that:
• (Correctness) the MA is equivalent to the exhaustive search algorithm in Figure 9, and thus matches the synthesis goal in Section 3.
• (Efficiency) Given any problem instance, the number of equivalence classes of the subproblems is O(m · range), where m is the number of search stages and range is the number of distinct outputs of the MPF on the problem instance.
Proof. For the correctness part, consider two invocations of the procedure search that correspond to two subproblems at the same stage i with partial assignments V•_1 and V•_2. Suppose they have the same MPF value. In that case, they have the same set of unknown atomic variables V[i : m], the same LOF, and the same valid solution set over V[i : m]. Hence, the outputs of the two invocations must be the same. As a result, we can set up a map mem to safely reuse results between these subproblems.
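The reuse argument can be sketched concretely. The block below is a hypothetical instantiation of the completed search sketch for 0/1 knapsack: here the MPF value is tracked as the remaining capacity (an equivalent key to a weighted prefix sum), and subproblems sharing (i, MPF value) are reused through the map `mem`. Names and the instance are illustrative, not the paper's synthesized artifacts.

```python
weights, values, cap = [2, 3, 4], [3, 4, 5], 5  # hypothetical instance
m = len(weights)
mem = {}

def search(i, w_key):                  # w_key plays the role of the MPF value
    if w_key < 0:
        return float('-inf')           # obj_invalid: capacity exceeded
    if i > m:
        return 0                       # L_leaf = 0
    if (i, w_key) not in mem:
        # L_upd: add the value gained at stage i to the child's local objective
        mem[(i, w_key)] = max(
            values[i - 1] * x + search(i + 1, w_key - weights[i - 1] * x)
            for x in (0, 1))
    return mem[(i, w_key)]

assert search(1, cap) == 7             # initial MPF value for this encoding
```

The table `mem` is indexed by (stage, MPF value), so its size is bounded by m times the range of the MPF, matching the efficiency claim of Theorem 4.3.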

The Outermost Controller
In the outermost iteration (Algorithm 1), SynMem tries enumeration orders from the space in Figure 8, since different enumeration orders yield different MAs. SynMem also iteratively increments the hyperparameters B and K that bound the program space when synthesizing the LOF and the MPF, which we will discuss in detail in Section 5.2. If the synthesis fails, it increments these parameters and restarts a new iteration. The controller relies on two procedures, GenSketch and CEGIS, illustrated as follows.
GenSketch. Given a specification and an enumeration order E, this procedure generates the exhaustive search and its sketch from the template (Figure 9). Note that the lengths of the arrays in the solution part, the domain of each atomic variable, and all range expressions in the constraint and objective parts depend only on the input parameters, and are fixed for any given parameter choice.
As a result, we can easily derive the program that takes the parameters as input and outputs the other components of the problem instance, completing the generation procedure.

CEGIS. To tackle the sketch problem against a reference exhaustive search implementation, SynMem applies the counterexample-guided inductive synthesis framework (Section 2.5), reducing the original synthesis problem to synthesizing an efficient and correct MA that is equivalent to the exhaustive search on a finite set of problem instances E (Algorithm 2).

Algorithm 1: The Outermost Controller
Input: The specification Output:

Inductive Synthesis of the LOF and the MPF
In each CEGIS iteration, we complete the search sketch (Figure 9). Instead of applying the general sketch method, we apply Theorem 4.3, which reduces the sketch problem to synthesizing the LOF, the MPF, and their updating functions with respect to conditions (4)-(7). These conditions yield two independent quantified relational program synthesis tasks, enabling an efficient synthesis procedure. Furthermore, we can control the efficiency of the synthesized MA by minimizing the range of the MPF. Below, we present the synthesis of the MPF. After the MPF is synthesized, the synthesis of its updating function (condition (7)) is a conventional SyGuS problem, and we assume an external solver U for this task. The synthesis of the LOF (and its updating function) follows a similar procedure, which we sketch at the end of this section. We first present the domain-specific language (DSL) for the LOF and the MPF.
DSL description. The DSL consists of constants, all input parameters, the search stage i, and the unknown suffix (resp. enumerated prefix) of each array sol_i in the solution part. Since the LOF and the MPF admit updating functions, they must be programs represented by structural recursions. Thus, the DSL consists of compositions of common structural recursion operators, such as map, filter, sum, max, min, length, and suffix. It also includes the primitive operators +, −, ×, the logical connectives ∧, ∨, →, and the array access operator access. However, we forbid the occurrence of structural recursion operators and the variable i in the grammar of the function arguments of higher-order operators.

Synthesis of the MPF: SynMPF. SynMem synthesizes the MPF from its formalization (6). We present the pseudo-code of this procedure in Algorithm 3. Since, in each CEGIS iteration, SynMem only aims to synthesize an MA equivalent to the exhaustive search on a set of problem instances E, SynMem specializes (6) by restricting the quantification over problem instances from all instances to E.
Recall that the output of the MPF is a tuple of scalars. Thus, SynMem treats synthesizing the MPF as the joint synthesis of ℓ programs M_1, . . ., M_ℓ, where ℓ is the number of components in the tuple and M_j is the j-th component. SynMem applies two hyperparameters K and B to bound the program space, where K upper-bounds ℓ and B upper-bounds the number of AST nodes of each M_j. The synthesis of the MPF consists of the following steps.
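The size bound B can be made concrete with a toy bottom-up enumerator. The sketch below is not SynMem's enumerator: it builds all ASTs of a hypothetical expression grammar (E ::= x | 1 | E + E | E * E) up to a given node count, which is the kind of bounded space that B carves out.

```python
from itertools import product

def enumerate_terms(B):
    # terms[size] = list of ASTs with exactly `size` nodes
    terms = {1: [('x',), ('1',)]}
    for size in range(2, B + 1):
        terms[size] = []
        for ls in range(1, size - 1):          # 1 node for the binary operator
            rs = size - 1 - ls
            for l, r in product(terms[ls], terms.get(rs, [])):
                terms[size] += [('+', l, r), ('*', l, r)]
    return [t for ts in terms.values() for t in ts]

assert len(enumerate_terms(1)) == 2        # x and 1
assert len(enumerate_terms(3)) == 10       # 2 leaves + 8 size-3 terms
```

Incrementing B in the outer controller simply widens this space on the next iteration.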
Step 2: Deductive term rewriting. In this step, SynMem applies a reduction to each collected condition Φ to eliminate the quantified combinator ⊙. The reduction is accomplished via deductive term rewriting, which searches for equivalent expressions of the constraint C_0. SynMem assumes the existence of a term rewriting procedure Rewrite(C_0), which can be implemented by a rewriting system suitable for the target DSL [Brillout et al. 2011; Marché 1996; Marché and Urbain 1998; Willsey et al. 2021]. Whenever SynMem invokes Rewrite(C_0), it either returns "exhausted", meaning the equivalent forms have been exhausted, or a new expression equivalent to C_0. SynMem then applies the procedure Check to test whether the rewriting result is in the required form; if so, the condition Φ can be reduced to conditions Ψ_1, . . ., Ψ_t, each of which eliminates the quantified combinator ⊙. Intuitively, if every subexpression over the enumerated variables in the rewriting result equals a component of the MPF, then we can assemble ⊙ from those components to satisfy (8). Below, we show that the reduction in this step is sound and complete, assuming an ideal term rewriting system.

Theorem 5.1. If the MPF M satisfies Φ, then there exists an equivalent expression of C_0 in the required form such that M satisfies (9), and vice versa.

Proof. For the "then" side, suppose (8) holds. Then the truth value of C_0 is a function of M and the enumerated variables. Consider the abstract syntax tree of this function: each leaf is either a component of M or an enumerated atomic variable. Thus, this function is in the required form and M satisfies (9). For the "vice versa" side, we follow the intuition above. □

Step 3: Inductive synthesis of (9). After the reduction above, the goal becomes finding at most K programs that satisfy all the reduced conditions Ψ collected in Step 2.
SynMem invokes the procedure MultiSynth to solve this problem. This procedure recursively enumerates M_1, M_2, . . . from the space of programs with at most B AST nodes. It returns once more than K programs have been tried or the current programs M_1, . . ., M_ℓ satisfy all the reduced conditions Ψ, which is easy to check since the relevant domains are finite for every fixed problem instance.
To obtain an MPF with a small range, SynMem applies a lightweight heuristic specific to the DSL above to synthesize each component of the MPF with a small range. Note that applying a structural recursion operator (e.g., sum, max) in our DSL shrinks the range of the result; thus, the larger the synthesized program, the smaller the range of the MPF tends to be. Hence, MultiSynth enumerates the programs (with AST size ≤ B) from large to small and outputs the first successful result.
However, the procedure above is too slow. We can prune the synthesis by reformulating the task in this step. We say that a program M* hits a condition Ψ if the condition becomes satisfied once M* is added as a component of the MPF. From this perspective, the goal of this step becomes finding at most K programs that together hit all the conditions. Then, we apply pruning based on the maximum-degree-bound proposition [Bläsius et al. 2022], given below.

Proposition 5.2. If there are ≤ K programs M_1, . . ., M_K that together hit all conditions, then some M_i among them hits at least a 1/K fraction of the conditions.

Proof. Assume instead that each candidate program hits less than a 1/K fraction of the conditions. Then any t ≤ K programs together hit less than a t/K ≤ 1 fraction of the conditions, contradicting the assumption that ≤ K programs hit all of them. □

By the proposition, when enumerating the i-th program M_i, we only consider programs with AST size ≤ B that hit at least a 1/(K − i + 1) fraction of the remaining conditions, which excludes many invalid programs and significantly speeds up the synthesis procedure.
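The hitting-set view with the max-degree pruning can be sketched as follows. This is a greedy illustration under assumed inputs (a map from hypothetical candidate programs to the sets of condition indices they hit); the full algorithm enumerates candidates with backtracking rather than committing greedily.

```python
def multi_synth(hits, conditions, K):
    # hits: candidate program -> set of condition indices it hits (assumed given)
    chosen, remaining = [], set(conditions)
    for i in range(1, K + 1):
        if not remaining:
            break
        # Prop. 5.2: the i-th pick must hit >= 1/(K-i+1) of what remains
        threshold = len(remaining) / (K - i + 1)
        cands = sorted((p for p in hits
                        if len(hits[p] & remaining) >= threshold),
                       key=lambda p: -len(hits[p] & remaining))
        if not cands:
            return None                     # trigger backtracking / failure
        chosen.append(cands[0])
        remaining -= hits[cands[0]]
    return chosen if not remaining else None

hits = {'M1': {0, 1, 2}, 'M2': {2, 3}, 'M3': {3}}
assert multi_synth(hits, range(4), 2) == ['M1', 'M2']
```

The threshold tightens as picks are made, so low-coverage candidates are excluded early, mirroring the pruning described above.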
If SynMem successfully finds ≤ K programs that hit all conditions with a minimized range, it obtains the MPF M by tupling these programs together. SynMem then invokes the external synthesizer U to synthesize the updating function for the MPF, filling the hole Mupd, and evaluates M(1, []) to obtain the initial value of the MPF, filling the hole Minit.
Backtracking. If the synthesis in Step 3 fails to find ≤ K programs that satisfy all conditions, SynMem backtracks to Step 2 to search for another term rewriting result and then restarts Step 3. If the number of failed trials in Step 3 exceeds a prescribed limit Lim, or the term rewriting results are exhausted, we exit Step 2 and report failure to synthesize the programs for the holes Minit and Mupd.

Synthesis of the optimal substructures: SynLOF. Below, we present SynLOF, which synthesizes the LOF and its updating function. Note that for CCPs and CDPs this procedure is trivial, since there is no optimization goal; we list the synthesis results for CCPs and CDPs in the template of the search sketch (Figure 9). For COPs, SynLOF follows almost the same procedure as SynMPF (Algorithm 3); hence, we only illustrate the differences.
Step 1. In this step, SynMem collects conditions for the LOF following condition (4) (Line 4).

Step 2. In this step, SynMem rewrites the conditions collected in Step 1 into a combinator form (Line 10): a combinator ⊕ applied to one subexpression that depends only on the unknown atomic variables V[i : m] and several subexpressions that depend only on the enumerated atomic variables V[1 : i − 1]. Furthermore, SynMem checks whether (i) for i = 1, ⊕ acts as the identity on the subexpression over the unknown variables, and (ii) the combinator ⊕ is monotonically increasing with respect to that subexpression. Checking the first condition is trivial, and the second condition is checked by validating formula (11).
The check is easy since, for every problem instance, the set of choices of these subexpressions' values is finite.
Step 3. Since SynMem needs to synthesize only one LOF, we set K = 1 in this step.
Finally, we discuss the properties of our algorithm.

Soundness. Note that Steps 1-3 above are all sound. Thus, given a sound and complete CEGIS verifier and a sound external synthesizer U for the updating functions, if SynMem successfully synthesizes the LOF, the MPF, and their updating functions, then these functions satisfy all conditions (4)-(7). Furthermore, we can plug these functions into the search sketch, deriving a correct and efficient MA.

Completeness. First note that, by Theorem 5.1, Step 2 above is complete as long as the set of equivalent forms of C_0 is recursively enumerable, i.e., there is an algorithm that enumerates all equivalent forms of a given expression; such an algorithm implements an ideal term rewriter for the procedure Rewrite(C_0). Moreover, the procedure MultiSynth is complete since it enumerates all possible combinations of programs, and the pruning of Proposition 5.2 preserves completeness. Hence, our algorithm is complete, in the sense that it never excludes valid MPFs and LOFs, as long as (1) the CEGIS verifier is sound and complete, (2) the underlying DSL for enumerating programs is expressive enough, (3) the external synthesizer U is complete, and (4) the set of equivalent expressions is recursively enumerable.
We remark that finding a sound and complete external synthesizer U is easy: one can simply enumerate every program with AST size ≤ B. As for the recursive enumerability condition, we present a case study below.
Case study. Consider a CP specification as follows. We disallow multiplications and divisions across variables and allow the widely used (i) primitive operators +, ×, max, and min; (ii) logical connectives ≤, ≥, ≠, ∧, ∨, and ¬; and (iii) recursive operators sum, product, max, min, and forall. In this case, the constraints and the objective function are CLIA formulas over the variables under a given instance. Thus, we can encode them as expressions in Presburger arithmetic [Presburger 1931] with uninterpreted functions (treating an array as an uninterpreted function), for which a decision procedure exists [Shostak 1979]. Hence, all equivalent forms are recursively enumerable. We found that 38/40 (95%) of our benchmarks fall into this case.

Optimizations
Besides the pruning in Proposition 5.2, SynMem also applies other optimizations to speed up the synthesis procedure.
Filtering out unnecessary conditions. In SynMPF, SynMem filters out constraints C_0 whose variable set var(C_0) is a subset of V[1 : i − 1] or of V[i : m], to simplify the specification. This does not affect the correctness of the MPF: if var(C_0) ⊆ V[1 : i − 1], then C_0 is a constant after enumerating V[1 : i − 1], so there is no need to consider this constant constraint. On the other hand, if var(C_0) ⊆ V[i : m], then C_0 is fully determined by the unknown variables V[i : m], and we can trivially choose the corresponding combinator ⊙ as C_0 itself.
Pruning invalid rewritings. In Step 2 of SynMPF, consider the subexpressions over the enumerated variables in the rewriting result (Line 10). If there are more than K semantically different such subexpressions (which can be checked by evaluating them over assignments to the enumerated variables), then it is impossible for ≤ K functions to hit all the corresponding conditions Ψ. Thus, we can safely discard this rewriting result and try the next one. The same pruning applies to the procedure SynLOF.
Lightweight monotonicity checking. In Step 2 of SynLOF, SynMem rewrites each constraint collected in Step 1 into the combinator form and checks whether the combinator ⊕ is monotonically increasing with respect to the subexpression over the unknown variables. This can be achieved by scanning over assignments as in (11), which might be costly. Thus, SynMem first applies a syntactic check: considering the abstract syntax tree of ⊕, SynMem extracts the path from that subexpression to the root node and checks whether every primitive operator on this path is monotone (e.g., +, min, max). If the lightweight check does not apply, SynMem falls back to the original checking procedure.
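The syntactic check can be sketched as a small AST walk. The encoding below is hypothetical (tuple-shaped nodes, with the string 'L' marking the subexpression over the unknown variables); it accepts when every operator on the path from 'L' to the root is monotone in the traversed child.

```python
MONOTONE = {'+', 'min', 'max'}   # operators monotone in each argument

def path_monotone(term):
    # True/False if 'L' occurs in this subtree; None if it does not
    if term == 'L':
        return True
    if not isinstance(term, tuple):
        return None
    op, *children = term
    for c in children:
        sub = path_monotone(c)
        if sub is not None:
            return sub and op in MONOTONE
    return None

# max(L + w, c) is syntactically monotone in L; c - L is not.
assert path_monotone(('max', ('+', 'L', 'w'), 'c')) is True
assert path_monotone(('-', 'c', 'L')) is False
```

Note the check is conservative: c − L is rejected here even though a semantic check could still classify it, which is exactly why the fallback to (11) is kept.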

Implementation
In this part, we present the details of the implementation of our algorithm.

CEGIS.
It is a highly non-trivial task to automatically verify the correctness of an MA on all problem instances. Thus, we adopt the bounded testing method [Lee and Cho 2023; Miltner et al. 2022], which verifies the synthesized program on instances that fall into a prespecified range. In our implementation, we use 100 randomly generated instances where both the array lengths and each input component fall into the interval [1, 5]. If scalable verification algorithms are developed in the future, they can be used in the CEGIS part as well.
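The bounded-testing verifier can be sketched as follows. Function names are hypothetical stand-ins: the candidate MA and the reference exhaustive search are compared on randomly generated small instances, and the first disagreeing instance is returned as a counterexample for the CEGIS loop.

```python
import random

def bounded_verify(candidate, reference, gen_instance, trials=100, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        inst = gen_instance(rng)
        if candidate(inst) != reference(inst):
            return inst               # counterexample fed back to synthesis
    return None                       # passed bounded testing

# Toy check on stand-in "programs": sum vs. an identical implementation
# agree, while sum vs. max disagree on some multi-element instance.
def gen(rng):
    return [rng.randint(1, 5) for _ in range(rng.randint(1, 5))]

assert bounded_verify(sum, lambda xs: sum(xs), gen) is None
assert bounded_verify(sum, max, gen) is not None
```

Fixing the seed keeps each CEGIS iteration deterministic; widening the instance range trades verification confidence against testing time.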
Term rewriting. Rewriting is an intricate procedure whose performance depends on the operators and syntactic structure of the given expression and on the algebraic rules applied. In SynMem, after applying CEGIS, the constraints and the objective function consist only of primitive operators (e.g., +, ×, max, min). Hence, SynMem applies basic algebraic rules for associativity, commutativity, and distributivity between primitive operators, so that each rewriting step can be performed efficiently. We apply breadth-first search in the term rewriting procedure.
External updating function synthesizer U. We design a sound and complete U specific to the DSL above. Since the DSL for the LOF and the MPF consists only of compositions of structural recursions, their updating functions can be generated syntactically thanks to the restrictions of the DSL. For example, consider generating the updating function for sum(filter(f, xs)), for a predicate f and list xs.
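The syntactic derivation for this example can be sketched as follows (names f, xs, L, x are hypothetical): appending an element x to xs changes sum(filter(f, xs)) by x if f(x) holds and by 0 otherwise, so the updater runs in O(1) without recomputing over xs.

```python
def make_updater(f):
    # maps (current value L of sum(filter(f, xs)), new element x) -> new value
    return lambda L, x: L + (x if f(x) else 0)

is_even = lambda x: x % 2 == 0
upd = make_updater(is_even)

xs, L = [], 0
for x in [1, 2, 3, 4]:
    L = upd(L, x)
    xs.append(x)
    assert L == sum(filter(is_even, xs))   # incremental == from-scratch
```

Analogous one-step updaters exist for the other structural recursion operators in the DSL (e.g., max, min, length), which is what makes the derivation purely syntactic.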

EVALUATION
In this part, we evaluate our approach against two baselines: Sketch [Solar-Lezama et al. 2006] and FOSynth.
Dataset. We collect CPs from Leetcode [lee [n. d.]], the National Olympiad in Informatics in Provinces-Junior (a nationwide programming contest in China) [NOI [n. d.]], and a previous approach [Pu et al. 2011]. We formalize these problems in our specification form. In detail, for Leetcode, we consider the problems tagged with "dynamic programming" that have the highest frequencies; Leetcode maintains frequency statistics for each problem, representing the probability that the problem appears in a real-world interview. For the National Olympiad in Informatics in Provinces-Junior, we consider the dynamic programming tasks (tagged by ICPC gold medal winners) from the past ten years. From the previous approach [Pu et al. 2011], we collect the CPs that are not already covered by Leetcode and the algorithmic contest. We exclude problems that are either mistagged or not expressible in MiniZinc. In summary, we collect the 36 highest-frequency tasks from Leetcode, 4 tasks from algorithmic contests, and 2 benchmarks from the previous approach.
Our benchmark covers a wide range of classic dynamic programming tasks. Below, we list some representatives. For COPs and CDPs, it includes the Knapsack problem (KP), the longest increasing/common subsequence problems (LIS/LCS), the shortest path problem on grids, the maximum segment/independent sum problems, and the maximal multi-marketing problem. For CCPs, it includes computing the Fibonacci/Catalan/binomial numbers and the counting variants of KP/LCS/LIS.
• Sketch is a general solver for sketch problems. We compare with Sketch because SynMem generates and completes the search sketch, so this comparison indicates the effectiveness of our synthesis procedure (Section 5.2). For each problem in our benchmark, we feed the generated search sketch (Figure 9) into Sketch.
• FOSynth is the state-of-the-art approach for synthesizing MAs from declarative specifications.
FOSynth is also sketch-based, but its sketch template is less expressive than ours (Figure 9). The original implementation of FOSynth is unavailable; hence, we acquired a reimplemented version by contacting an author of FOSynth. This implementation includes a built-in DSL and does not easily support changing the DSL. Nevertheless, this DSL is a strict subset of the DSL used in our approach (Section 5.2).
Procedure. We execute our implementation and the two baselines. We set a time limit of one hour per individual benchmark. We obtain all results on a laptop with an Intel(R) Core(TM) i7-7820X CPU, 40GB RAM, and Ubuntu 20.04.
Results. Overall, SynMem solves 39/42 (92.9%) of our benchmarks. On 33 benchmarks, SynMem successfully finds the best algorithm. On the solved benchmarks, SynMem takes 2.01s on average; more specifically, it takes on average 1.91s on inductive synthesis and 0.10s on deductive term rewriting. Furthermore, after manual checking, we find that SynMem synthesizes an MPF with a minimal range on all solved benchmarks. Please refer to the full version of this paper for detailed experimental results, where we report the running time of SynMem and the time complexity of the synthesized programs. Table 1 summarizes the comparison. For each approach, columns #Solved and #Failed give the numbers of solved and failed benchmarks, respectively. Column #Best is the number of benchmarks where the complexity of the synthesized program matches the complexity of the reference answer. Column AvgTime is the average running time per solved task.
Compared with the baselines, Sketch cannot synthesize any benchmark, while FOSynth solves 8 benchmarks with an average time of 58.31s and times out on the remaining 34. In contrast, SynMem successfully synthesizes 39 benchmarks in a shorter average time. Thus, on our benchmark suite, SynMem beats both baseline approaches.
Discussion. Sketch fails on all benchmarks for the following reasons.
• First, the search sketch is too complex: it has about 30 lines and involves recursion, global array accesses, and modifications. In contrast, SynMem applies dedicated specifications for the holes, based on the definitions of the LOF, the MPF, and their updating functions.
• Second, the dedicated specifications yield two independent tasks in each CEGIS iteration. SynMem solves them separately, leading to an efficient synthesis procedure, whereas Sketch has to complete all holes in the search sketch simultaneously.
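To make the contrast concrete, the shape of such a search sketch can be caricatured as runnable Python, under our reading of the template in Figure 9. The four holes are modelled as function parameters, and all names here are our own illustrative choices:

```python
# A runnable caricature of the exhaustive-search sketch (cf. Figure 9).
# The four holes Lleaf, Minit, Lupd, and Mupd are modelled as function
# parameters; a synthesizer in Sketch's position must fill all of them
# simultaneously against one monolithic correctness constraint.

def sketch_search(n, Lleaf, Minit, Lupd, Mupd, decisions=(0, 1)):
    memo = {}                      # memo key (MPF value) -> best objective

    def explore(i, local, key):
        if i == n:                 # leaf: one complete solution
            v = Lleaf(local)
            memo[key] = max(memo.get(key, v), v)
            return
        for d in decisions:        # branch on every remaining decision
            explore(i + 1, Lupd(local, i, d), Mupd(key, i, d))

    explore(0, 0, Minit())
    return memo
```

Instantiating the holes by hand (e.g., Lupd accumulating item values and Mupd tracking remaining capacity) recovers a concrete search; the synthesis problem is to find such instantiations automatically, and SynMem's dedicated specifications let the L-holes and M-holes be solved separately.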
Our limitations. SynMem fails on 3/42 tasks in our benchmark. This is because, on these benchmarks, the target memoization algorithm is out of the reach of our template. Consider the following failing example from Leetcode 698: given an integer array nums and an integer K, return true if it is possible to divide this array into K non-empty subsets whose sums are all equal.
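A standard hand-written search for this problem illustrates the issue: its natural memo state is the sorted tuple of current bucket sums, a collection whose length is the run-time value K rather than a fixed dimension. The code below is our own illustrative sketch, not what SynMem produces:

```python
from functools import lru_cache

def can_partition_k(nums, k):
    """Leetcode 698: can nums be split into k subsets of equal sum?

    The memo state is the sorted tuple of current bucket sums -- its
    length is the run-time value k, not a fixed arity.
    """
    total = sum(nums)
    if k <= 0 or total % k != 0:
        return False
    target = total // k
    items = tuple(sorted(nums, reverse=True))
    if items and items[0] > target:
        return False

    @lru_cache(maxsize=None)
    def search(i, buckets):
        if i == len(items):
            return True
        tried = set()                       # skip symmetric bucket choices
        for b in range(k):
            new = buckets[b] + items[i]
            if new <= target and new not in tried:
                tried.add(new)
                nxt = tuple(sorted(buckets[:b] + (new,) + buckets[b + 1:]))
                if search(i + 1, nxt):
                    return True
        return False

    return search(0, (0,) * k)
```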
We fail on this task because it requires synthesizing an MPF that produces an unbounded list rather than a tuple, whereas our method only supports MPFs that output tuples of a fixed dimension. Supporting more forms of the MPF is left as future work. On 6 of the 39 solved benchmarks, SynMem synthesizes a sub-optimal MA whose complexity has a polynomial gap with the reference answer. This is because these benchmarks require further algorithmic techniques beyond memoization. Consider the following example from Leetcode 115 (Distinct Subsequences: given two strings s and t, count the number of distinct subsequences of s that equal t). SynMem successfully synthesizes the basic O(n²) MA. However, this problem requires an extra data structure for range queries [He et al. 2011] to optimize the MA to a lower running time.
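For reference, a standard quadratic memoized search for Leetcode 115, of the kind described above, can be sketched as follows (an illustrative hand-written version, not SynMem's output):

```python
from functools import lru_cache

def num_distinct(s, t):
    """Count the distinct subsequences of s equal to t (Leetcode 115).

    Each subproblem (i, j) counts the matchings of t[j:] inside s[i:],
    giving O(|s| * |t|) distinct subproblems.
    """
    @lru_cache(maxsize=None)
    def count(i, j):
        if j == len(t):
            return 1                      # t fully matched
        if i == len(s):
            return 0                      # s exhausted before t
        total = count(i + 1, j)           # skip s[i]
        if s[i] == t[j]:
            total += count(i + 1, j + 1)  # match s[i] with t[j]
        return total

    return count(0, 0)
```

Going below the quadratic bound requires reorganizing the recurrence around a range-query structure, which is the extra algorithmic step mentioned above.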

RELATED WORK
Deriving Memoization Algorithms. There have been multiple attempts to derive memoization algorithms, which can be categorized into manual and automated approaches.
First, there are manual or semi-automated approaches. Some of them [Bird and de Moor 1997; Bird and Gibbons 2020; de Moor 1995; Morihata et al. 2014; Mu 2008] propose calculational frameworks so that the user can manually perform a step-by-step derivation of a memoization algorithm. Others [Acar et al. 2003; Giegerich et al. 2004; Liu and Stoller 1999; Pettorossi and Proietti 1996; Sauthoff et al. 2011] require the user to specify the complete memoization algorithm in some DSL, including the MPF and the local objective, and then verify whether it is correct. In contrast, our approach automatically synthesizes the memoization algorithm from a declarative specification, where the user only needs to provide a high-level description.
Next, there are automated approaches [Lin et al. 2021; Pu et al. 2011] that are closely related to ours. Below, we compare each of them with ours. Lin et al. [2021]'s approach also considers specifications in MiniZinc style. Following a simple deductive procedure, it directly transforms the constraints and the objective function into a fold expression. It is limited as follows. First, their transformation rules support only a limited set of operators; for example, they cannot handle the forall operator for element-wise operations over arrays. Second, their transformation succeeds only when there is no constraint between any two elements of inductive data structures. Their approach is applicable to only 6/42 (14.3%) of our benchmarks. We do not compare with this approach in the evaluation since its implementation is not available.
The other approach [Pu et al. 2011] is purely inductive. Similar to SynMem, it also uses a sketch template to synthesize dynamic programming algorithms, and it applies an optimized version of Sketch to solve the synthesis task. However, it handles only dynamic programming algorithms that memoize a fixed number of scalar values. Thus, it cannot handle classic CPs such as 0-1 knapsack, our running example in Section 2. By contrast, we consider a much more expressive template: we introduce the MPF to represent an unbounded number of memoized values. Under our framework, their template can be viewed as the subclass of ours where the MPF simply outputs 1 on all subproblems. To handle the more general synthesis problem, we apply a new algorithm mixing deductive and inductive synthesis methods. In total, 11/42 (26.2%) of our benchmarks fall into their sketch template, and in our experiment (Section 6), their approach solves 8/42 (19.0%) of our benchmarks.
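To make the difference concrete, here is an illustrative memoized search for 0-1 knapsack (our own sketch, not SynMem's output). The memo key `(i, cap)` plays the role of an MPF output: the number of distinct keys, O(n · capacity), grows with the input and bounds the running time, which is exactly what a fixed number of scalar memo cells cannot express:

```python
from functools import lru_cache

def knapsack(weights, values, capacity):
    """0-1 knapsack via memoized search.

    The memo key (i, cap) acts like an MPF value: two partial solutions
    that agree on the next item index and the remaining capacity are
    interchangeable, so only the better one needs to be kept.
    """
    @lru_cache(maxsize=None)
    def best(i, cap):
        if i == len(weights):
            return 0
        skip = best(i + 1, cap)                           # leave item i
        if weights[i] <= cap:
            take = values[i] + best(i + 1, cap - weights[i])  # pack item i
            return max(skip, take)
        return skip

    return best(0, capacity)
```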
Finally, none of the previous approaches considers CCPs. We are the first to address CCPs, via our versatile sketch template and a dedicated approach to synthesizing the MPF.
Program Synthesis. Our approach relates to previous work in program synthesis as follows.
Recursive Program Synthesis. Many existing methods synthesize recursive programs [Farzan et al. 2022; Feser et al. 2015; Hu et al. 2021; Itzhaky et al. 2021; Kitzelmann and Schmid 2006; Kneuss et al. 2013; Knoth et al. 2019; Lubin et al. 2020; Polikarpova and Sergey 2019]. However, as far as we know, none of them scales to the synthesis of a complex memoization algorithm, which often involves tens of lines of code.
Sketching. SynMem follows a template of search sketches and proposes a dedicated method to synthesize all holes in the search sketch. In contrast, the general sketching solver Sketch uses a constraint-based method that is completely blind to the rich information in the specification. We have compared with Sketch in detail in Section 6.
Synthesizing specialized algorithms. Other approaches have been proposed to automatically synthesize specialized classes of algorithms [Farzan and Nicolet 2017, 2021; Morita et al. 2007; Smith and Albarghouthi 2016]. However, none of them is concerned with deriving efficient memoization algorithms.
At a more specific level, our approach shares some similarities with previous work [Farzan and Nicolet 2017]. Both approaches apply term rewriting techniques to complex relational program synthesis tasks, reducing a relational synthesis task to a conventional SyGuS task. However, instead of directly applying rewriting techniques (as in [Farzan and Nicolet 2017]), SynMem first applies a CEGIS procedure to instantiate (C1) and (D1) on concrete instances, removing operators that are difficult to cope with in a rewriting system, such as the summation operator Σ. In this way, the design of the rewriting system becomes much simpler, and in many cases, we can guarantee the success of rewriting.
Relational Program Synthesis. The synthesis conditions in Section 4 yield two relational program synthesis tasks. However, our synthesis tasks involve too many unknown functions and are thus beyond the reach of the previous approach [Wang et al. 2018] in this field. To handle these tasks, SynMem applies the term rewriting method to bypass the synthesis of a large proportion of the unknown functions, reducing the quantified relational synthesis task to a conventional SyGuS task.

CONCLUSION
This paper addresses the automated synthesis of correct and efficient memoization algorithms from a given declarative specification. We first present a novel reduction from synthesizing memoization algorithms to two smaller program synthesis tasks. However, the generated synthesis tasks are still too complex to be solved by existing synthesizers. Thus, we propose a novel synthesis algorithm that combines deductive and inductive methods to solve these tasks. Our approach successfully synthesizes 39/42 problems, outperforming the baselines.
To optimize the search, the MA introduces the local objective function (LOF) and the memoization partition function (MPF).

Fig. 2. The search algorithm.

Fig. 5. The execution of the MA on a given parameter, where LOF denotes the LOF value and W the MPF value.

Fig. 8. The syntax and the semantics of enumeration orders.

Fig. 9. The pseudo-code for the template for the exhaustive search (left) and its sketch (right). The holes Lleaf, Minit, Lupd, and Mupd need to be filled with synthesized expressions.

Table 1. The Comparison Result.

Proc. ACM Program. Lang., Vol. 7, No. OOPSLA2, Article 225. Publication date: October 2023.