MedPart: A Multi-Level Evolutionary Differentiable Hypergraph Partitioner

State-of-the-art hypergraph partitioners, such as hMETIS, usually adopt a multi-level paradigm for efficiency and scalability. However, they are prone to getting trapped in local minima due to their reliance on refinement heuristics and their neglect of global structural information during coarsening. SpecPart, the most advanced academic hypergraph partitioning refinement method, improves partitioning by leveraging spectral information. Still, its success depends heavily on the quality of its initial input solutions. This work introduces MedPart, a multi-level evolutionary differentiable hypergraph partitioner. MedPart follows the multi-level paradigm but addresses its limitations by using fast spectral coarsening and introducing a novel evolutionary differentiable algorithm to optimize each coarsening level. Moreover, by analogy between hypergraph partitioning and deep graph learning, our evolutionary differentiable algorithm can be accelerated with deep graph learning toolkits on GPUs. Experiments on public benchmarks consistently show MedPart outperforming hMETIS and achieving up to a 30% improvement in cut size on some benchmarks compared to the best-published solutions, including those from SpecPart. Moreover, MedPart's runtime scales linearly with the number of hyperedges.


INTRODUCTION
Hypergraphs are a natural extension of traditional graphs, representing connections among more than two vertices through hyperedges. They are, therefore, particularly adept at modeling complex multi-way relationships, rendering them invaluable across many fields [4]. In particular, a hypergraph problem of critical importance in VLSI is the balanced min-cut netlist partitioning problem [10]. This problem aims to divide the netlist hypergraph into two or more nearly equal-sized parts while minimizing the number of hyperedges (=nets) connecting vertices (=gates/modules) in different partitions. It is a fundamental combinatorial optimization problem with direct applications to floorplanning, placement, and tier partitioning in the latest 3D ICs.

Related Works
State-of-the-art hypergraph partitioners, such as hMETIS [10], KaHyPar [16], and PaToH [8], usually adopt a multi-level paradigm, progressively coarsening hypergraphs to explore a vast solution space efficiently. These coarser partitions then serve as starting points for finer-level refinement. Such a paradigm tends to be scalable because it focuses on partitioning smaller, more manageable, coarser-level graphs, reducing the computational burden. However, multi-level partitioners may encounter local optima in practice due to two critical limitations outlined in [7]: (i) hypergraph coarsening predominantly considers local structures, neglecting global hypergraph characteristics, and (ii) refinement heuristics can become trapped in local minima. In response, SpecPart [7] introduces spectral information to refine partitioning, albeit reliant on initial solutions. When the initial solution is far from the global optimum, SpecPart may still fall short of achieving global optimality.
Evolutionary algorithms, such as genetic algorithms (GA) [5], have found applications in hypergraph partitioning. While these algorithms excel at systematic exploration in discrete spaces, they often lack efficiency in local search. Consequently, prior research has commonly resorted to hybrid approaches that combine evolutionary algorithms with local search techniques. In addition, evolutionary partitioners require a substantial number of evolution generations to converge, resulting in heavy evaluation workloads, especially for large hypergraphs.
A graph neural network (GNN)-based graph partitioner introduced in [15] defines a differentiable loss function representing the partitioning objectives. It employs backward propagation to optimize the GNN parameters, enabling the GNN to predict partitioning solutions, even for previously unseen graphs. However, its loss function involves multiplications of matrices of size N × N, where N is the number of vertices, limiting its scalability. Moreover, it is not designed to handle hypergraphs.

Contributions
In this work, we develop a multi-level evolutionary differentiable hypergraph partitioner named MedPart. It follows the multi-level paradigm but addresses its limitations with fast spectral coarsening and a novel evolutionary differentiable optimizer with a global view at each level. Moreover, by analogy between hypergraph partitioning and deep graph learning, our evolutionary differentiable optimizer can be accelerated with deep graph learning toolkits on GPUs. Our contributions are summarized as follows: (1) We introduce a fast spectral hypergraph coarsening algorithm based on emerging graph signals. It can progressively coarsen a graph with hundreds of thousands of nodes in seconds. (2) We propose an evolutionary differentiable algorithm that integrates GA and gradient descent (GD) for optimization at each coarsening level. In the GD search, we place a probability on assigning a vertex to each partition. With a differentiable and computationally efficient loss function, GD optimizes the assignment probabilities end-to-end. GA is employed to systematically generate good starting points that help GD escape local optima. (3) Our framework generalizes to many constraint-driven partitioning problem formulations. As long as the loss function can be optimized with differentiable optimization, both discrete and continuous objectives can be targeted, thereby expanding the space of problems that can be solved beyond traditional min-cut and ratio-cut bipartitioning formulations. (4) We accelerate the evolutionary differentiable algorithm with deep graph learning toolkits on GPUs by drawing an analogy between hypergraph partitioning and deep graph learning. (5) Experimental results on balanced hypergraph bipartitioning over publicly available benchmarks show that MedPart consistently outperforms the leading partitioner hMETIS and achieves up to a 30% improvement in cut size compared to the best-published solutions on some benchmarks. Moreover, MedPart's runtime scales linearly with the number of hyperedges.

PRELIMINARY
This section presents some preliminaries necessary for the understanding of MedPart. We first offer the mathematical framework for hypergraph partitioning, our problem at stake. We then introduce the spectral coarsening and genetic algorithms used in MedPart. Finally, we present the mechanism of message passing in graph neural networks, which will serve as the basis for efficient implementations of MedPart routines.

Hypergraph Partitioning Formulation
A hypergraph H is defined as a pair H = (V, E), where V represents the set of vertices v ∈ V with associated weights w_v, and E represents the set of hyperedges, where a hyperedge e ∈ E is a subset of V with associated weight w_e. Given a positive integer k ≥ 2 and a positive real number ε ≤ 1/k, and letting W = Σ_{v∈V} w_v, the k-way balanced hypergraph partitioning problem can be mathematically formulated as:

  minimize    cut(S) = Σ_{e ∈ E cut by S} w_e                                  (1)
  subject to  S = {V_1, . . ., V_k}, V_i ∩ V_j = ∅ (i ≠ j), ∪_{i} V_i = V      (2)
              Σ_{v ∈ V_i} w_v ≤ (1/k + ε) · W, for 1 ≤ i ≤ k                   (3)

where Eq. (2) ensures that S is a k-way disjoint partitioning solution of H, and ε is the allowed imbalance between partitions (Eq. (3)). We say that S is an ε-balanced partitioning solution.
For simplicity but without loss of generality, this work focuses on traditional bipartitioning scenarios where k = 2, w_v = 1 for all v ∈ V, and w_e = 1 for all e ∈ E. Note, however, that our framework readily applies to more general scenarios.
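The objective and constraint above can be evaluated directly from their definitions. The following minimal sketch (function names are ours, for illustration only) computes the cut size and checks the ε-balance condition under the unit-weight setting:

```python
from typing import Dict, List, Set

def cut_size(hyperedges: List[Set[int]], part: Dict[int, int]) -> int:
    """Number of hyperedges spanning more than one partition block (unit weights)."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) > 1)

def is_eps_balanced(part: Dict[int, int], k: int, eps: float) -> bool:
    """Check that every block's vertex count is at most (1/k + eps) * |V| (unit weights)."""
    n = len(part)
    counts = [0] * k
    for p in part.values():
        counts[p] += 1
    return all(c <= (1.0 / k + eps) * n for c in counts)
```

For instance, with hyperedges {0,1}, {1,2,3}, {2,3} and the bipartition {0,1} | {2,3}, only the middle hyperedge is cut, and the solution is 0-balanced for k = 2.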

Spectral Graph Embeddings and Coarsening
Let G = (V, E, w) be a weighted graph. The Laplacian matrix L_G of G is defined as L_G = D − A, where D is the diagonal matrix of weighted vertex degrees and A is the weighted adjacency matrix of G. Suppose the eigenvalues of L_G are λ_1 ≤ λ_2 ≤ . . . ≤ λ_N and the corresponding eigenvectors are u_1, u_2, . . ., u_N, where each u_i, 1 ≤ i ≤ N, is a vector of length N (the number of vertices). An effective way to represent the graph's global structure is to embed the graph into an r-dimensional space using the first r (1 ≤ r ≤ N) eigenvectors of the graph Laplacian, a technique known as spectral graph embedding. Graph vertices that are close to each other in this low-dimensional embedding space can then be aggregated to form the coarse-level nodes and, subsequently, the reduced graph. However, calculating the eigenvectors of the original graph Laplacian is very costly, especially for large graphs.
A fast spectral coarsening method is developed in [9] based on emerging graph signal processing techniques. A graph signal x = {x_1, x_2, . . ., x_N} is defined as a vector that assembles the individual values on all vertices. A random graph signal x can be expressed as a linear combination of the eigenvectors of the graph Laplacian, i.e., x = Σ_{i=1}^{N} α_i u_i. Instead of directly using the first few eigenvectors of the original graph Laplacian as the graph embedding, [9] proposes to apply a low-pass graph filtering function to r random graph signal vectors to obtain smoothed vectors for an r-dimensional graph embedding, which can be achieved in linear time. Applying the smoothing function to x, a smoothed vector x̃ is obtained as a linear combination of the first few eigenvectors, i.e., x̃ ≈ Σ_{i=1}^{s} α̃_i u_i with s ≪ N. It is suggested in [9] that these smoothed graph signals preserve important spectral properties.
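The low-pass filtering idea can be sketched with a few smoothing iterations x ← x − c·Lx, which exponentially damp the high-frequency eigencomponents of each random signal so that the low eigenvectors dominate. The sketch below uses dense NumPy arrays purely for clarity; the function name and step-size choice are our illustrative assumptions, whereas the real method of [9] achieves this in linear time with sparse solvers such as LAMG.

```python
import numpy as np

def smoothed_embedding(adj: np.ndarray, r: int, iters: int = 200, seed: int = 0) -> np.ndarray:
    """Approximate an r-dimensional spectral embedding by low-pass filtering
    r random graph signals with Jacobi-like smoothing steps x <- x - c * L x.
    High-frequency Laplacian eigencomponents decay fastest, so the smoothed
    signals become dominated by the first few eigenvectors."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    L = np.diag(deg) - adj                    # graph Laplacian L = D - A
    c = 1.0 / (2.0 * deg.max())               # step size keeping the filter stable
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, r))
    X -= X.mean(axis=0)                       # remove the trivial constant component
    for _ in range(iters):
        X = X - c * (L @ X)                   # one low-pass smoothing step
    return X
```

On a "barbell" of two triangles joined by one edge, the smoothed signals separate the two triangles, mirroring how vertices close in the embedding space are clustered into coarse nodes.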

Genetic Algorithms
A general framework of GA is outlined in Alg. 1. The GenOffspring function, as indicated in Line 3, generates a fresh set of solutions referred to as "offspring." These offspring are created based on the genetic operations of mutation and crossover applied to the individuals in the current population. The UpdatePopulation function in Line 5 incorporates the offspring into the existing population based on the fitness scores of the current population and the offspring.
Maintaining population diversity is critical in GAs to prevent premature convergence and effectively explore the entire search space. Various offspring generation and population update methods have been proposed to ensure diversity. For example, tournament selection [17] is a commonly used method for selecting individuals from a population to serve as parents for the next generation. It mimics a tournament-style competition among individuals to determine who will be chosen as parents. Deterministic crowding [13] is a population update mechanism that ensures diversity by replacing a parent with its offspring only if the offspring is fitter and genetically similar to the parent, aiming to explore distinct regions of the solution space. In addition, GAs, while adept at systematic exploration in discrete spaces, often lack efficiency in local search, which has led prior research to hybrid approaches combining evolutionary algorithms with local search techniques.
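The GA loop of Alg. 1 can be sketched as follows, with tournament selection generating offspring and a deterministic-crowding-style replacement updating the population. This is a toy illustration on bit strings, not MedPart's Alg. 2; the function name, parameters, and defaults are our assumptions.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=20, generations=300,
                      n_cp=3, p_mut=0.1, seed=0):
    """Minimal GA sketch mirroring Alg. 1: tournament selection plays the
    role of GenOffspring's parent selection, and deterministic-crowding-style
    replacement plays the role of UpdatePopulation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        # pick n_cp random contenders, keep the fittest as a parent
        return max(rng.sample(pop, n_cp), key=fitness)

    for _ in range(generations):
        p1, p2 = tournament(), tournament()
        cut = rng.randrange(1, n_bits)                       # 1-point crossover
        child = p1[:cut] + p2[cut:]
        child = [b ^ (rng.random() < p_mut) for b in child]  # random bit mutation
        # deterministic crowding: the child may only replace the parent it is
        # genetically closer to, and only if it is at least as fit
        closer = min((p1, p2), key=lambda p: sum(a != b for a, b in zip(p, child)))
        if fitness(child) >= fitness(closer):
            pop[pop.index(closer)] = child
    return max(pop, key=fitness)
```

Running it on the "onemax" fitness (count of ones) shows the population steadily improving while similar parents compete with their own offspring, which is exactly the diversity-preserving behavior deterministic crowding is designed for.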

Message Passing in Graph Neural Networks
Message passing is a core process in GNNs through which vertices in a graph communicate and exchange information with their neighboring nodes to aggregate and update their features. Fig. 1 outlines the message-passing process. Each vertex in the graph is associated with an initial feature vector. The GNN computes a message for each vertex by combining information from neighboring vertices and applying transformations to the aggregated features. Deep graph learning toolkits, such as the Deep Graph Library (DGL), accelerate message passing in GNNs by leveraging efficient data structures, advanced caching and memoization techniques, parallelism, and GPU acceleration.

MEDPART MULTI-LEVEL OPTIMIZATION

MedPart Overview
We introduce the following notations to simplify the presentation.
• The integers N_0 > N_1 > . . . > N_c denote the vertex counts at the different levels of graph coarsening. Fig. 2 (a) illustrates the notion of graph coarsening levels, where level 0 represents the finest granularity and level c is the coarsest level.
• S(l) represents a partitioning solution at level l, which is a matrix of size N_l × k. Each row of S(l) is a one-hot vector encoding the partition block assignment of a vertex. When no confusion arises, we omit the level index and simply write S.
• x(l), also a matrix of size N_l × k, represents the continuous relaxation of the partitioning solution at level l. In this matrix, the (i, j)-th element corresponds to the probability of assigning the i-th vertex to the j-th partition block. It is important to note that for any vertex 1 ≤ i ≤ N_l, the sum of probabilities across all partition blocks equals 1.
• P_{i←j} represents the binary projection matrix of size N_i × N_j that maps a partitioning solution at level j to level i. In Fig. 2, a partitioning solution S(2) at level 2 is mapped to the corresponding level-1 solution S(1) = P_{1←2} · S(2). Please note that the projection matrix P_{i←j} can also be applied to a continuous partitioning solution x(j). Furthermore, to reduce the memory overhead, we implement P_{i←j} as a sparse matrix.
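The projection matrices can be chained to map a coarse solution down the hierarchy. A dense NumPy sketch (MedPart stores these matrices as sparse; the function name is ours):

```python
import numpy as np

def project(P_maps, S_coarse):
    """Map a one-hot partitioning solution from a coarse level to level 0 by
    chaining projection matrices: S(0) = P_{0<-1} P_{1<-2} ... P_{c-1<-c} S(c).
    P_maps lists [P_{0<-1}, P_{1<-2}, ...] from fine to coarse."""
    S = S_coarse
    for P in reversed(P_maps):      # apply the coarsest projection first
        S = P @ S
    return S
```

For example, if level 1 merges fine vertices {0, 1} and {2, 3} into two coarse nodes, then projecting the level-1 solution that splits the two coarse nodes yields the fine-level solution splitting {0, 1} from {2, 3}.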

Spectral Coarsening and Multi-Level Optimization
MedPart follows the multi-level paradigm and uses a fast spectral coarsening technique to build the graph's coarsening levels. Before coarsening, the input hypergraph H = (V, E) is transformed into a clique expansion graph [2], G_c, by replacing each hyperedge e ∈ E with an edge for each pair of vertices in the hyperedge. Fig. 2 (a) illustrates the construction of a clique expansion graph from a hypergraph. Note that G_c has precisely the same vertices as H. Hence, the coarsening results on G_c also apply to H. Next, we apply to G_c the graph signal processing-based fast spectral coarsening method discussed in Section 2.2. This process gradually reduces G_c to a graph with only a few vertices, potentially just 2 or 3, and concurrently constructs the projection matrices P_{0←1}, P_{1←2}, . . ., P_{c−1←c}.
Once the graph coarsening levels have been established, MedPart generates partitioning solutions progressively, starting from the coarsest granularity level and advancing to the finest granularity, as depicted in Fig. 2 (b). For levels where the total number of possible solutions (= 2^(N_l) for bipartitioning) is small enough, we use exhaustive enumeration to obtain the optimal partitioning solution. When the number of potential solutions is too large, our evolutionary differentiable algorithm is employed to generate a population of high-quality solutions. These coarser-level solutions are then mapped to solutions at finer levels using the projection matrices P_{l−1←l} and act as starting points for the optimization process at those finer levels. Upon completing the optimization at level 0, the finest granularity level, we report the best-found partitioning solution. Note that a solution S(l) at level l can be mapped to a solution S(0) at level 0 by S(0) = P_{0←1} P_{1←2} · · · P_{l−1←l} · S(l). Thus, an evaluation framework developed for the original hypergraph H can be seamlessly applied to assess solutions at any coarsening level.
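At the coarsest level, where the solution count is tiny, exhaustive enumeration is feasible. A sketch for unit-weight bipartitioning (the function name and interface are our assumptions):

```python
from itertools import product

def best_bipartition_bruteforce(hyperedges, n, eps):
    """Enumerate all 2^n bipartitions of a tiny coarsest-level hypergraph and
    return an eps-balanced one with minimum cut (unit vertex/hyperedge weights).
    Feasible only when n is very small."""
    best, best_cut = None, float("inf")
    for bits in product((0, 1), repeat=n):
        ones = sum(bits)
        if max(ones, n - ones) > (0.5 + eps) * n:
            continue                                  # violates the balance constraint
        cut = sum(1 for e in hyperedges if len({bits[v] for v in e}) > 1)
        if cut < best_cut:
            best, best_cut = bits, cut
    return best, best_cut
```

In MedPart the winner of this enumeration is then projected to the next finer level as a starting point.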

EVOLUTIONARY DIFFERENTIABLE HYPERGRAPH PARTITIONING
We develop an evolutionary differentiable algorithm that combines a genetic algorithm with gradient descent to optimize partitioning at each graph coarsening level. Thanks to their fast convergence and robust scalability, GD methods are widely applied to large continuous optimization problems. By introducing a continuous relaxation of the partitioning space, where each vertex's assignment to a partition block is associated with a probability, we effectively re-frame the partitioning problem to align with the GD optimization framework. After obtaining continuous partitioning solutions via GD, we convert them into discrete partitions by assigning each vertex to the block with the highest probability. It is essential to highlight that initial solutions greatly influence the performance of GD, and poor initialization can result in getting stuck in local minima. Recognizing this, we utilize GA to generate favorable starting points, enabling GD to escape local optima more effectively. Alg. 2 presents our evolutionary differentiable algorithm. It starts with an initial population of partitioning solutions and evaluates their fitness scores. It then leverages GA and GD to improve these solutions iteratively. In each generation, GA generates offspring solutions by crossover and mutation, which are fine-tuned using GD. GD transforms the solutions into a continuous space and iteratively refines them to search for better solutions. At predefined checkpoints, solutions are discretized and evaluated for fitness. After the GD epoch, GA updates the population according to the GD outcome. The algorithm reports the best-discovered solutions upon reaching the generation limit or the stagnation threshold.

Genetic Algorithm for Partitioning
In this subsection, we delve into the details of the GA component within Alg. 2, specifically focusing on the GenOffspring function in Line 3 and the UpdatePopulation function in Line 23. We utilize tournament selection in GenOffspring and a deterministic crowding technique in UpdatePopulation to address the diversity concerns discussed in Section 2.3.
Alg. 3 outlines the process of offspring generation by crossover and mutation for partitioning. The tournament selection function TournamentSel(X, scores(X), N_cp) in Line 3 selects individuals from a population to serve as parents for the next generation. Each tournament randomly selects N_cp individuals from the population X and outputs the one with the best fitness score. Crossover(S_1, S_2, N_cp) in Line 8 performs an N_cp-point crossover of S_1 and S_2, while Mutate(S, p) in Lines 9 and 14 randomly mutates S with probability p.
It is essential to consider the concept of "permutation symmetry" in partitioning, where swapping the labels of two partition blocks maintains the overall quality. For instance, in a 4-vertex graph, assigning vertices 1 and 2 to block 1 and vertices 3 and 4 to block 2 is equivalent to assigning vertices 1 and 2 to block 2 and vertices 3 and 4 to block 1. This property is important when performing crossover operations on two partitioning solutions. As a result, we align the solutions before crossover. The alignment process is outlined in Lines 5-7 of Alg. 3, where || · ||_1 is the L1-norm and ¬ inverts the bits (i.e., swaps the two partition blocks).
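The pre-crossover alignment can be sketched for bipartitioning as follows: if the block-swapped version of one parent is closer in L1 distance to the other parent, use the swapped version. This is a simplified reading of Lines 5-7 of Alg. 3; the function name is ours.

```python
def align(s1, s2):
    """Resolve permutation symmetry before crossover: return s2 or its
    block-swapped version, whichever is closer (L1 distance) to s1."""
    flipped = [1 - b for b in s2]                        # swap the two blocks
    d_plain = sum(abs(a - b) for a, b in zip(s1, s2))
    d_flip = sum(abs(a - b) for a, b in zip(s1, flipped))
    return flipped if d_flip < d_plain else s2
```

Without this step, crossing over two equivalent but oppositely labeled parents would produce offspring far worse than either parent.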
In UpdatePopulation (Line 23 of Alg. 2), we employ the deterministic crowding replacement technique. If an offspring results from mutating its parent and exhibits a superior fitness score, the parent is replaced by the offspring. When an offspring is generated through the crossover of two parents, the first parent is replaced by the offspring only if it is more similar to the offspring (as measured by the L1-norm) than the second parent is, and the offspring possesses a better fitness score; the second parent is replaced symmetrically. In short, an offspring replaces a parent if and only if the offspring is both fitter and genetically more similar to that parent. Compared with greedy replacement methods, deterministic crowding prioritizes diversity and can lead to more robust solutions over time.
Tournament selection, crossover, mutation, and deterministic crowding replacement can all be implemented seamlessly and efficiently using tensor operators in PyTorch [11]. This capability paves the way for harnessing the power of GPU acceleration in the GA.

Differentiable Hypergraph Partitioning
Here, we provide an in-depth explanation of the gradient descent component within Alg. 2 (specifically, Lines 5 to 20).

Continuous Relaxation and Differentiable Costs.
In k-way partitioning, each vertex within a hypergraph is assigned to one of the k distinct partition blocks. To make the search space continuous, we relax the categorical assignment of vertex v to a partition block using a softmax function over all partition blocks:

  x_{v,p} = exp(τ · θ_{v,p}) / Σ_{q=1}^{k} exp(τ · θ_{v,q}),    (4)

where τ represents a hyper-parameter named temperature. The values x_{v,p} (v ∈ V, 1 ≤ p ≤ k), parameterized by θ_{v,p} ∈ R, can be interpreted as the probability of assigning vertex v to partition block p. Consequently, the partitioning task reduces to learning the set of continuous variables θ = {θ_{v,p}}. Furthermore, to bridge the gap between continuous and discrete solutions during GD optimization, we progressively increase the temperature τ to enforce convergence to a unique partition decision for each vertex.
The expectation of the total vertex weight on a partition block p is:

  W_p = Σ_{v ∈ V} w_v · x_{v,p}.

Then the ε-balanced constraint (Eq. (3)) can be relaxed into a differentiable objective:

  L_balance = Σ_{p=1}^{k} ReLU( W_p − (1/k + ε) · W ),    (5)

where ReLU(z) = max{0, z}.
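The temperature softmax relaxation and the ReLU balance penalty can be sketched in NumPy as below (PyTorch versions are analogous and differentiable end-to-end); the function names are ours, and the max-subtraction is a standard numerical-stability detail we add.

```python
import numpy as np

def relax_softmax(theta, tau):
    """Row-wise softmax with temperature tau over the k blocks.
    Larger tau sharpens each row toward a one-hot assignment."""
    z = tau * theta
    z = z - z.max(axis=1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def balance_loss(x, w, k, eps):
    """Relaxed eps-balance objective: penalize blocks whose expected total
    vertex weight W_p exceeds (1/k + eps) * W, via ReLU."""
    W = w.sum()
    expected = w @ x                            # expected weight W_p per block
    return np.maximum(0.0, expected - (1.0 / k + eps) * W).sum()
```

Note how raising tau pushes each probability row toward 0/1, which is exactly the annealing behavior used to converge on a unique partition decision per vertex.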
We also devise four differentiable proxies for the cut size, as depicted in Fig. 3 (b). These proxies are based on a matrix, denoted M, which collects the partition block assignment probabilities of all vertices within a hyperedge. The dimensions of M are m by k, where m is the number of vertices in the hyperedge; each row of M is the probability distribution of the partition block assignment of one vertex, as in the example of Fig. 3 (b). Our cut size proxies are primarily designed to assess the similarity among these probability distributions for all vertices within a hyperedge: greater similarity corresponds to a smaller cut size. Two representative proxies are

  ProdSum(M) = sum(prod(M, dim = 0)),            (6)
  MeanEntropy(M) = sum(entropy(mean(M, dim = 0))),    (7)

and the remaining two (Eqs. (8) and (9)) are built analogously from the mean(), max(), sum(), and MSE operators. Here, we adopt notations from the PyTorch library: the operations prod(), mean(), sum(), max(), and entropy() can be realized using corresponding tensor operators in PyTorch, and MSE refers to the mean squared error. Fig. 3 (b) provides an illustrative example of the calculation of our cut size proxies. A larger value of ProdSum corresponds to a smaller cut size, while smaller values of the other proxies indicate a smaller cut size. The final cost function is defined as a weighted sum of the balance objective (Eq. (5)) and the four cut size proxies (Eqs. (6)-(9)), where the weights are hyper-parameters.
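The ProdSum and MeanEntropy proxies can be sketched as follows, with NumPy standing in for the PyTorch operators; the small clipping constant is an implementation detail we add for numerical safety.

```python
import numpy as np

def prod_sum(M):
    """ProdSum proxy: product of per-vertex probabilities down each block
    column, summed over blocks. For one-hot rows it is 1 iff all vertices
    of the hyperedge share a block, 0 if the hyperedge is cut.
    Larger value => smaller cut."""
    return M.prod(axis=0).sum()

def mean_entropy(M):
    """MeanEntropy proxy: entropy of the mean assignment distribution of a
    hyperedge's vertices; 0 when all vertices fully agree.
    Smaller value => smaller cut."""
    m = M.mean(axis=0)
    m = np.clip(m, 1e-12, 1.0)        # avoid log(0)
    return -(m * np.log(m)).sum()
```

For an uncut hyperedge with all rows equal to [1, 0], ProdSum is 1 and MeanEntropy is (numerically) 0; for a hyperedge split evenly across two blocks, ProdSum drops to 0 and MeanEntropy rises to ln 2.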

Interaction with GA.
As illustrated in Line 8 of Alg. 2, our GD optimization starts from initial solutions derived by applying the Relax() function to the offspring generated through GA. The x = Relax(S) operator transforms a binary solution S into a continuous counterpart x by mapping each entry equal to 1 to 1 − δ and spreading the remaining probability mass δ uniformly over the other k − 1 blocks, where 0 ≤ δ ≤ 0.5 is a hyper-parameter.
Inversely, the S = Discretize(x) operator in Line 12 of Alg. 2 transforms a continuous solution x into a discrete solution S by applying argmax to each row of x. Additionally, the MOD() function in Line 11 represents the modulus operator.
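A sketch of the two operators is given below. The exact probability-spreading rule in Relax is our reading of the (garbled) definition; for k = 2 it simply maps 1 to 1 − δ and 0 to δ.

```python
import numpy as np

def relax_op(S, delta):
    """x = Relax(S): soften a one-hot solution so GD has useful gradients.
    Each 1 becomes 1 - delta; the leftover mass delta is spread over the
    other k - 1 blocks, keeping every row a probability distribution."""
    k = S.shape[1]
    return S * (1.0 - delta) + (1.0 - S) * (delta / (k - 1))

def discretize_op(x):
    """S = Discretize(x): assign each vertex to its highest-probability block."""
    S = np.zeros_like(x)
    S[np.arange(x.shape[0]), x.argmax(axis=1)] = 1.0
    return S
```

The two operators are inverses in the sense that discretizing a relaxed one-hot solution (for δ < 0.5) recovers the original solution, which is what lets GA offspring and GD iterates be exchanged freely.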

Acceleration By Deep Graph Learning Toolkits
When executed on GPUs, our evolutionary differentiable optimization process can be significantly accelerated using deep graph learning toolkits, such as DGL [18]. This acceleration capitalizes on the analogy between hypergraph partitioning and deep graph learning and is facilitated by a specialized heterogeneous graph named the Hypergraph-Node Relationship (HNR) graph. As depicted in Fig. 3, the HNR graph comprises two distinct vertex types: node vertices, corresponding to nodes within the given hypergraph, and hyperedge vertices, representing the hyperedges of the original hypergraph. An edge is introduced between a node vertex and a hyperedge vertex only if the corresponding node and hyperedge are incident in the original hypergraph. We have identified two critical analogies between deep graph learning and hypergraph partitioning using the HNR graph framework: (1) Cut size evaluation as forward propagation on the HNR graph: in deep graph learning, forward propagation resembles a sequence of message passing steps, as illustrated in Fig. 1. This process computes messages for each vertex by aggregating information from neighboring vertices and applying feature transformations.
In intermediate GNN layers, these messages update vertex features, whereas in the final GNN layers, message aggregation is potentially followed by additional transformations for loss computation. For cut size evaluation, each node vertex within the HNR graph is associated with a one-hot vector encoding the partition block allocation of the corresponding node in the original hypergraph, as shown in Fig. 3 (a). Subsequently, every hyperedge vertex combines the partitioning solution vectors from all incoming node vertices, forming the matrix M in Eq. (6). After applying the ProdSum operator to M, the result is 1 if all nodes within the corresponding hyperedge are assigned to the same partition block; otherwise, it is 0. Ultimately, the total count of uncut hyperedges is obtained by summing the outcomes over the hyperedge vertices. If we consider the one-hot solution vectors on node vertices as features, then the computation of the uncut hyperedge count can be viewed as message passing within a single-layer GNN. (2) Cut size optimization as backward propagation on the HNR graph: backward propagation in graph learning computes gradients with respect to the loss function and back-propagates them through the layers of the GNN to update the model parameters. In the context of differentiable partitioning discussed in Section 4.2, we can relax the one-hot solution vectors on node vertices to continuous vectors and treat them as trainable parameters, as depicted in Fig. 3 (b). By implementing differentiable operators as defined in Eqs. (5)-(9), the continuous partitioning solutions can be optimized using backward propagation, akin to standard GNN training. This enables the refinement and learning of continuous partitioning solutions in an end-to-end manner.
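Analogy (1) can be made concrete with a tiny incidence-list sketch: product-aggregation at each hyperedge vertex followed by a global sum counts the uncut hyperedges. The explicit Python loop below stands in for DGL's batched message passing; the function name and interface are ours.

```python
import numpy as np

def uncut_count(node_idx, edge_idx, S, n_edges):
    """Single-layer message passing on the HNR graph: each hyperedge vertex
    multiplies (ProdSum-style) the one-hot assignment vectors of its incident
    node vertices, and summing the results gives the uncut-hyperedge count.
    (node_idx[i], edge_idx[i]) pairs enumerate the HNR incidence edges."""
    k = S.shape[1]
    agg = np.ones((n_edges, k))
    for n, e in zip(node_idx, edge_idx):
        agg[e] *= S[n]                 # product-aggregate incoming messages
    return agg.sum(axis=1).sum()       # 1 per fully-agreeing hyperedge, else 0
```

Replacing the one-hot rows of S with continuous probability rows turns this same forward pass into a differentiable cut proxy, which is exactly the hook used for analogy (2).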
Leveraging the insights outlined above, we effectively implement the cut size evaluation steps depicted in Lines 4 and 13 of Alg. 2, as well as the GD optimization iterations in Line 10, by harnessing deep graph learning toolkits, such as DGL.
Furthermore, to fully leverage the parallel computational capabilities of GPUs, we devise batch-based approaches for both cut-size evaluation and differentiable optimization, as illustrated in Fig. 3. Rather than processing a single partitioning solution at a time, we handle a batch of solutions concurrently: all GD trials in Line 5 of Alg. 2 are executed simultaneously. This batch-based approach significantly improves computational efficiency by harnessing the full power of parallel GPU processing.

EXPERIMENTAL VALIDATION
Our graph coarsening is implemented with the fast graph Laplacian linear solver LAMG [12] in Matlab, while the remaining components of MedPart are implemented in Python, leveraging the deep learning toolkits PyTorch and DGL. All experiments are conducted on a server with AMD EPYC 7742 processors and an NVIDIA A100 GPU with 80GB of memory. In our evolutionary differentiable algorithm, we configure the number of generations to be proportional to the logarithm of the vertex count at the current graph coarsening level. We set the population size to the maximum that our GPU memory can accommodate for a given test case, since all gradient descent trials are executed concurrently on the GPU. We set the number of GD steps to 60. We employ Adam [1], a widely recognized gradient descent solver with momentum, in our differentiable optimization.
We compare MedPart with the leading hypergraph partitioner hMETIS [10] and the state-of-the-art partitioning solution refinement method SpecPart [7] on two sets of publicly available benchmarks (the ISPD98 VLSI Circuit Benchmark Suite [3] and the Titan23 Suite [14]). The statistics of these benchmarks are summarized in Table 1 and Table 2, respectively. MedPart operates in two distinct modes: "from-scratch optimization" and "refinement." In the first mode, MedPart conducts optimization from scratch. In the refinement mode, initial partitioning solutions derived from running hMETIS five times, each with a different random seed (as provided in [6]), serve as the starting points that MedPart then refines. The tables report the cut sizes of both MedPart modes. The cut sizes of SpecPart are taken from [7], where SpecPart refines the solutions obtained from hMETIS [10] and/or KaHyPar [16].

Results on ISPD98 Benchmarks
Table 1 compares the cut sizes obtained by MedPart on the ISPD98 VLSI circuit benchmarks with those from hMETIS, SpecPart, and the best-published results. Regardless of whether ε is 2% or 10%, running MedPart once from scratch consistently outperforms running hMETIS five times with different random seeds. The best-published results for the ISPD98 VLSI circuit benchmarks are well-established baselines; MedPart's results exhibit an average gap of approximately 5% for ε = 2% and 3.4% for ε = 10%, indicating that its solutions are close to these strong baselines. Additionally, MedPart in its "refinement" mode yields results comparable to the state-of-the-art refinement method, SpecPart.

Results on Titan23 Benchmarks
Table 2 shows results on the Titan23 benchmarks, which are challenging due to their many high-degree hyperedges. MedPart significantly outperforms hMETIS, with a 10% improvement for ε = 2% and 25% for ε = 20%. In some cases, such as sparcT1_core, MedPart even achieves solutions surpassing the best-published results by up to 30%. Generally, MedPart excels on the smaller Titan23 test cases, mainly because GPU memory constraints force a small population size for large hypergraphs, potentially causing premature convergence. To address this, we plan to explore multi-GPU optimization and mixed-precision gradient descent techniques in the future.

Runtime Scalability
Fig. 4 shows the runtime scalability of MedPart in "from-scratch optimization" mode. In general, MedPart's runtime scales linearly with the number of hyperedges. The relatively shorter runtimes observed on the two largest test cases can be attributed to premature convergence caused by the small population size; premature convergence triggers an early stop in Alg. 2. MedPart, primarily implemented in Python, still has substantial room for runtime improvement. The evolutionary differentiable algorithm dominates MedPart's runtime, while graph coarsening takes only 11 seconds on the largest benchmark.

Impact of Multi-Level Optimization
We evaluate the impact of the multi-level optimization framework using the top 15 benchmarks from the Titan23 benchmark suite as representative examples. The results, illustrated in Fig. 5, reveal that exclusively running the evolutionary differentiable algorithm at the finest granularity level can result in cut sizes up to 9.6× larger than MedPart's. This finding underscores the significant role played by our multi-level optimization framework.

CONCLUSIONS AND FUTURE DIRECTIONS
This study presents MedPart, a multi-level evolutionary differentiable hypergraph partitioning framework. Our experiments on public benchmarks consistently show MedPart outperforming the leading partitioner hMETIS and achieving up to a 30% improvement in cut size over the best-published solutions on some benchmarks. We plan to apply our framework to other constraint-driven partitioning problems beyond traditional min-cut and ratio-cut bipartitioning formulations.

Figure 1: Message passing in graph neural networks. A message is computed for each vertex by combining information from its neighboring vertices and applying transformations to the aggregated features. This mechanism enables efficient loss/objective computation during partitioning.

Fig. 2 depicts the overview of MedPart. MedPart takes as input a hypergraph and two constraint parameters, k (the number of partitions) and ε (the allowed imbalance), and outputs the best partitioning solution. It comprises two key phases: (1) spectral coarsening of the hypergraph, which progressively reduces the size of the hypergraph and constructs the graph coarsening levels top-down; (2) coarse-to-fine partitioning, which applies our evolutionary differentiable algorithm bottom-up, from the coarser levels to the finer levels, where the coarser partitions serve as starting points for finer-level refinement.

Figure 2: Overview of MedPart. (a) Spectral graph coarsening on a hypergraph. The hypergraph, transformed into a clique expansion graph, is progressively coarsened into several clusters for scalability. Projection matrices encoding the coarsening, for use in (b), are built concurrently. (b) Multi-level optimization framework of MedPart. Partitioning assignments at the coarser level l are used as starting points for evolutionary differentiable optimization at the finer level l − 1.

Figure 3: Batch cut size evaluation and optimization on the Hypergraph-Node Relationship graph. A batch of candidate assignments for each node is aggregated into the hyperedges to calculate objectives. (a) Batch cut size evaluation with discrete node-to-partition assignments. (b) Batch differentiable cut size optimization with soft probabilistic node-to-partition assignments. By analogy with deep graph learning, both cut-size evaluation and optimization can be accelerated with deep graph learning toolkits on GPUs.

Figure 5: Impact of multi-level optimization on MedPart. The experiments are conducted on the top 15 benchmarks from the Titan23 benchmark suite, with ε set to 10%.

Figure 6: Cut sizes from MedPart and hMETIS on (a) sparcT1_core and (b) bitonic_mesh, each across 5 runs with different random seeds. MedPart consistently produces robust outcomes, irrespective of the random seeds used.

Table 1: Statistics of the ISPD98 VLSI circuit benchmark suite and cut sizes of different approaches. The reported cut sizes are: the best-published cut sizes summarized in [6]; the SpecPart results presented in [7], obtained by employing SpecPart to enhance partitioning solutions generated by hMETIS and/or KaHyPar; the best cut size from running hMETIS five times with different random seeds (provided in [6]); and the cut sizes from running MedPart once from scratch and from using MedPart to refine the best hMETIS solutions, respectively. The best and second-best results among all methods are highlighted in red and blue, respectively.

Table 2: Statistics of the Titan23 benchmark suite and cut sizes of different approaches. The reported cut sizes are: the best-published cut sizes; the SpecPart results presented in [7], obtained by employing SpecPart to enhance partitioning solutions generated by running hMETIS 20 times; the best cut size from running hMETIS five times (provided in [6]); and the cut sizes from running MedPart once from scratch and from using MedPart to refine the best hMETIS solutions, respectively. Underlining marks MedPart cut sizes that outperform the best-published results.