Prize-Collecting Steiner Tree: A 1.79 Approximation

Prize-Collecting Steiner Tree (PCST) is a generalization of the Steiner Tree problem, a fundamental problem in computer science. In the classic Steiner Tree problem, we aim to connect a set of vertices known as terminals using the minimum-weight tree in a given weighted graph. In this generalized version, each vertex has a penalty, and there is flexibility to decide whether to connect each vertex or pay its associated penalty, making the problem more realistic and practical. Both the Steiner Tree problem and its Prize-Collecting version had long-standing $2$-approximation algorithms, matching the integrality gap of the natural LP formulations for both. This barrier for both problems has been surpassed, with algorithms achieving approximation factors below $2$. While research on the Steiner Tree problem has led to a series of reductions in the approximation ratio below $2$, culminating in a $\ln(4)+\epsilon$ approximation by Byrka, Grandoni, Rothvo{\ss}, and Sanit\`a, the Prize-Collecting version has not seen improvements in the past 15 years since the work of Archer, Bateni, Hajiaghayi, and Karloff, which reduced the approximation factor for this problem from $2$ to $1.9672$. Interestingly, even the Prize-Collecting TSP approximation, which was first improved below $2$ in the same paper, has seen several advancements since then. In this paper, we reduce the approximation factor for the PCST problem substantially to 1.7994 via a novel iterative approach.


Introduction
The Steiner Tree problem is a well-known problem in the field of combinatorial optimization. It involves connecting a specific set of vertices (referred to as terminals) in a weighted graph while minimizing the total cost of the edges used. The problem also allows for the inclusion of additional vertices, known as Steiner points, which can help reduce the overall cost. This problem has a long history and was formally defined by Hakimi in 1971 [19]. It is recognized as one of the classic NP-hard problems [20]. The Steiner Tree problem finds applications in various domains, including network design [4] and phylogenetics [24], prompting continuous research efforts to develop better approximation algorithms. Initial algorithmic strategies for the Steiner Tree problem, while heuristic in nature, set the stage for more precise approaches. Zelikovsky's 1993 introduction of a polynomial-time approximation algorithm achieved an 11/6-approximation ratio [25], which was followed by further improvements including Karpinski and Zelikovsky's 1.65-approximation in 1995 [21]. The approach was refined to a 1.55-approximation by Robins and Zelikovsky in 2005 [23], and by 2010, Byrka, Grandoni, Rothvoß, and Sanità advanced this to a 1.39-approximation [12]. An earlier MST-based 2-approximation algorithm, introduced in the early 1980s, also played a crucial role due to its simplicity [22].
The computational complexity of the Steiner Tree problem has been firmly established. Bern and Plassmann showed its MAX SNP-hardness, indicating the absence of a polynomial-time approximation scheme (PTAS) for this problem unless P equals NP [8]. Building on this, Chlebík and Chlebíková in 2008 established a lower bound, demonstrating that approximating the Steiner Tree problem within a factor of 96/95 of the optimal solution is NP-hard [13].
In combinatorial optimization, prize-collecting variants are distinct for their detailed decision-making approach. These variants focus not only on building an optimal structure but also on intentionally excluding certain components, which leads to a penalty. This introduces more complexity and makes these problems more applicable to real-world scenarios. The concept of prize-collecting problems in optimization was first brought forward by Balas in the late 1980s [7]. This pioneering work opened a new research direction, particularly in scenarios where avoiding certain elements results in penalties. Following this, the first approximation algorithms for prize-collecting problems were introduced in the early 1990s by authors including Bienstock, Goemans, Simchi-Levi, and Williamson [9]. Their initial contributions have significantly shaped the research direction in this area, focusing on developing solutions that effectively balance costs against penalties.
The Prize-collecting Steiner Tree (PCST) problem is a key example in this category, as it takes into account both the costs of connectivity and penalties for excluding vertices. In this problem, we consider an undirected graph G = (V, E) where V represents vertices and E represents edges. Each edge e ∈ E has an associated cost c(e), and each vertex v ∈ V comes with a penalty π(v) that needs to be paid if the vertex is not connected in the solution. The objective is to find a tree T = (V_T, E_T) within G that minimizes the sum of edge costs in T and penalties for vertices not in T. This is mathematically expressed as:
min_T ( ∑_{e∈E_T} c(e) + ∑_{v∈V∖V_T} π(v) ).
This formulation captures the essence of the PCST problem: a trade-off between the infrastructure cost, represented by the sum of the edge costs within the chosen tree, and the penalties assigned to vertices excluded from this connecting structure. This view of the problem applies to various situations, such as network design where not every node needs to be connected, and resource allocation where some demands might not be met, resulting in a cost. Initial strides in developing approximation algorithms for PCST were made by Bienstock, Goemans, Simchi-Levi, and Williamson with a 3-approximation achieved through linear programming relaxation [9]. Subsequent advancements by Goemans and Williamson, and later by Archer, Bateni, Hajiaghayi, and Karloff, refined the approximation ratio to 2 and 1.967, respectively [15,6]. Our work contributes to the ongoing research efforts in the field by presenting a 1.7994-approximation algorithm for the PCST problem, improving upon the previous best-known ratio of 1.967 established in 2009 [6]. This achievement marks progress on this long-standing open problem.
Besides PCST, the Prize-collecting Steiner Forest (PCSF) problem stands as another open area of research in combinatorial optimization. In PCSF, the objective is to efficiently connect pairs of vertices, each of which has an associated penalty for remaining unconnected. Work on this area began with Agrawal, Klein, and Ravi [1,2]. Following this, 3-approximation algorithms were developed using cost-sharing and iterative rounding, respectively [16,17]. Progress continued with Hajiaghayi and Jain's 2.54-approximation algorithm [18], and more recently, the 2-approximation by Ahmadi, Gholami, Hajiaghayi, Jabbarzade, and Mahdavi [3].
Another related problem, the Prize-collecting version of the classic Traveling Salesman Problem (PCTSP), focuses on optimizing the length of the route taken while also accounting for penalties associated with unvisited cities. Although the natural LP formulations for PCTSP and PCST share many similarities, PCTSP has experienced considerably more progress. The first breakthrough in breaking the barrier of 2 for PCST also introduced a 1.98-approximation algorithm for PCTSP [6]. Subsequently, Goemans improved this to a 1.91 approximation factor [14]. The approximation factor was further improved to 1.774 by Blauth and Nägele [11], and most recently, to 1.599 by Blauth, Klein, and Nägele [10]. These advances in PCSF and PCTSP underline the significance and continuous research interest in prize-collecting problems.

Contribution Overview
In this paper, we focus on rooted PCST where a designated vertex, denoted as root, must be included in the solution tree. The objective is to connect other vertices to root or pay their penalty. The general PCST and its rooted variant are equivalent. Solving the general PCST involves iterating over all vertices as potential roots and solving the rooted variant for each. Conversely, we can adapt the general version to address rooted PCST by assigning an infinite penalty to the root vertex, ensuring its inclusion in the optimal solution. This two-way equivalence is crucial for our approach, allowing us to concentrate on rooted PCST and extend our findings to the general case. In the rooted version, we define an instance of the PCST problem using a graph G = (V, E, c) with edge weight function c : E → ℝ_{≥0}, root vertex root, and penalty function π : V → ℝ_{≥0}. In the penalty function, while only non-root vertices have actual penalties, we include root in the domain of π and assume it has penalty π(root) = ∞. This does not affect the actual costs of solutions, but simplifies our statements by adding consistency.
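The two directions of this equivalence can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of the algorithm in this paper: the names `with_root_penalty` and `solve_unrooted` are hypothetical, and `solve_rooted` stands for any rooted PCST solver that returns a `(cost, tree)` pair.

```python
import math

def with_root_penalty(pi, root):
    """Return a copy of the penalties with pi[root] = infinity (rooted instance)."""
    rooted = dict(pi)
    rooted[root] = math.inf
    return rooted

def solve_unrooted(vertices, solve_rooted):
    """Try every vertex as the root and keep the cheapest rooted solution.

    solve_rooted(r) is an assumed callable returning (cost, tree) for root r.
    """
    best = None
    for r in vertices:
        cost, tree = solve_rooted(r)
        if best is None or cost < best[0]:
            best = (cost, tree)
    return best
```

Trying every root multiplies the running time by |V| but leaves the approximation factor unchanged, since one of the candidate roots lies in an optimal tree.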
In designing our algorithm, we utilize the recursive approach introduced by [3]. The concept involves running a baseline algorithm with a higher approximation factor on PCST to get an initial solution. We then pay the penalties of the vertices that the baseline algorithm marks as dead and remove their penalties from consideration. Next, we apply a Steiner tree algorithm to the remaining vertices to obtain another solution. We then call our algorithm recursively with the adjusted penalties. At each recursive step, two algorithms are executed on the current input, each producing a tree as a solution. Our procedure aggregates all solutions generated during the recursion process and selects the one with the lowest cost as the final output.
We give a quick overview of the major components of our algorithm here.
Goemans and Williamson Algorithm for PCST. We use a slightly modified version of the algorithm introduced by Goemans and Williamson in [15] as the baseline algorithm in the recursive process. We briefly present this algorithm for completeness. Throughout the paper, we refer to this algorithm as PCSTGW and denote the solution found by the algorithm as GW.
Let us assume that each edge of the input graph G is a curve with a length equal to its cost. We want to build a spanning tree F, which starts as a forest during our algorithm and transforms into a tree by the end of the algorithm. We then remove certain edges from this tree to obtain our final tree T and pay penalties for every vertex outside T.
To run our algorithm, we define C as the connected components of F, and active sets ActS as a subset of C. Initially, both C and ActS consist of single-member sets, with each vertex belonging to exactly one set. We assign a unique color to each vertex of the graph, with the value π(v) representing the total duration that color v can be used. As π(root) = ∞, the color of root can be used without any limitation.
At any moment, each active set colors its adjacent edges (edges with exactly one endpoint in that set) with the color of one of its vertices that still has available color.
Every time an edge becomes fully colored, it is added to F, and the connected components of F and the active sets are updated. Moreover, if all vertices in an active set run out of color, the active set becomes deactivated and is considered a dead set, along with all the vertices inside it. We continue this process until all vertices are connected to the root. Note that this is ensured since the root has an infinite amount of color.
After the completion of this process, we remove some edges from F to obtain T. We select every dead set S that cuts exactly one edge of F and remove all vertices in S from F to obtain T. Every live vertex, i.e., every vertex not marked as dead, will be connected to the root in T, along with some dead vertices. In fact, the tree T is the smallest subtree of F that contains all live vertices, including root, and every vertex whose color has been used in T.
Steiner Tree Algorithm for PCST. Here we want to construct a new solution ST based on the outcome of PCSTGW. During the execution of the PCSTGW algorithm, certain active sets and their vertices may reach a dead state, leaving them incapable of coloring edges as their vertices have used all of their colors. In such cases, it is reasonable to pay their penalties and subsequently remove them from consideration. This decision makes sense, as connecting these vertices to other vertices requires excessive costs compared to their penalties.
In the GW solution, some of these dead vertices may eventually connect to the root when other active sets link to them, and we utilize these dead vertices to connect live vertices to root. However, in ST, we pay the penalties of all dead vertices and seek a tree that efficiently connects the other vertices to root. The problem of finding a minimum tree that connects a set of vertices to root is known as the Steiner Tree problem, and we employ the best-known algorithm for it, whose approximation factor we denote by p; currently p = ln(4) + ε [12].
Improving the approximation factor of the Steiner Tree algorithm would consequently improve the approximation factor of our PCST algorithm. It is worth noting that one might suggest paying penalties only for vertices that the GW solution pays penalties for, rather than all dead vertices. However, the GW solution may connect all vertices to the root, which would force the Steiner Tree algorithm to establish connections for every vertex. This constraint restricts the algorithm's flexibility in exploring alternative tree structures.
Iterative algorithm. Now, let us explore our iterative algorithm. Our aim is to create an iterative procedure that results in an α-approximation algorithm for PCST. We will discuss the value of α later.
At the start of our algorithm, we divide the vertex penalties by a constant factor β to obtain π_β. The idea of altering penalties has been used in [5], but they focus on increasing penalties, while we decrease them. The specific value of β will be determined towards the conclusion of our paper. This determination will be based on the value of p, representing the best-known approximation factor for the Steiner Tree problem, with the goal of minimizing the approximation factor α. Now, we execute PCSTGW using the modified penalties π_β. Running PCSTGW on π_β provides us with a tree T_GW, and paying the penalty of vertices outside T_GW yields one solution for the input. Subsequently, we pay the penalty of every vertex that becomes dead during the execution of PCSTGW, set their penalty to zero for the remainder of our algorithm, and connect the remaining vertices using the best-known algorithm for the Steiner tree problem, denoted as SteinerTree. The tree generated by SteinerTree, denoted as T_ST, presents another solution for the input.
Then, if no vertices with a non-zero penalty become inactive in PCSTGW, indicating that we have not altered the penalties of vertices at this step, we terminate our algorithm by returning the minimum cost solution between T_GW and T_ST. Otherwise, we recursively apply this algorithm to the new penalties, and refer to the tree of the best solution found by the recursive call as T_IT.
Finally, we select the best solution among T_GW, T_ST, and T_IT. It is important to note that our algorithm essentially identifies two solutions at each iteration and, in the end, selects the solution with the minimum cost among all these alternatives.
In analyzing our algorithm, we focus on its initial step, specifically the first invocation of PCSTGW and SteinerTree. We categorize vertices based on their status in PCSTGW, distinguishing between those marked as dead or live, and whether their penalties have been paid in both PCSTGW and the optimal solution. Additionally, we classify active sets based on whether they color only one edge or more than one edge of the optimal solution. Through this partitioning, we derive lower bounds for the optimal solution and upper bounds for the solutions T_GW and T_ST. Leveraging the recursive nature of our algorithm, we establish an upper bound for the solution T_IT using induction. Following that, we evaluate how much these solutions deviate from α · cost_OPT.
Next, we show that for β = 1.252 and α = 1.7994, a weighted average of the cost of the three solutions is at most α · cost_OPT. This shows that our algorithm, when using this value of β, is a 1.7994-approximation of the optimal solution, since the minimum cost is at most any weighted average. We note that throughout our analysis, we do not know the value of α. Instead, we obtain a system of constraints involving α, β, p, and the weights in the weighted average which needs to be satisfied in order for our proof steps to be valid. Then, we find a solution to this system minimizing α to find our approximation guarantee. In this solution, we use p = ln(4) + ε, the current best approximation factor for the Steiner tree problem [12]. Finally, we explain the intuition behind certain parts of our algorithm, including why we need to consider all three solutions that we obtain.
Outline. In Section 2, we explain Goemans and Williamson's 2-approximation algorithm for PCST [15], using the coloring schema effectively utilized by [3] for PCSF. Then, in Section 3, we present our iterative algorithm along with its analysis. Finally, in Section 4, we highlight the importance of employing both algorithms in conjunction with the iterative approach to improve the approximation factor.
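The averaging step used above can be written out explicitly. The weight names w_GW, w_ST, and w_IT are placeholders introduced here for illustration; the text only states that some weighted average is bounded by α · cost_OPT.

```latex
\[
\min\{cost_{GW},\, cost_{ST},\, cost_{IT}\}
\;\le\; w_{GW}\, cost_{GW} + w_{ST}\, cost_{ST} + w_{IT}\, cost_{IT}
\;\le\; \alpha \cdot cost_{OPT},
\]
\[
\text{where } w_{GW}, w_{ST}, w_{IT} \ge 0
\text{ and } w_{GW} + w_{ST} + w_{IT} = 1 .
\]
```

The first inequality holds for any convex combination, so only the second inequality depends on the analysis.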

Preliminaries
Throughout our paper, we assume without loss of generality that the given graph is connected.
Let T be a subgraph; then c(T) denotes the total cost of edges in T, i.e., c(T) = ∑_{e∈T} c(e).
For a subgraph T, we use V(T) to represent the set of vertices in T, and V̄(T) denotes the set of vertices outside T. Given a subset of vertices S ⊆ V, we define π(S) = ∑_{v∈S} π(v) as the sum of penalties associated with vertices in S.
For a PCST solution X, we denote its corresponding tree as T_X. Furthermore, we use cost_X to represent the total cost of X, defined as c(T_X) + π(V̄(T_X)).
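As a concrete reading of these definitions, the following minimal sketch evaluates the cost of a rooted solution. The representation is an assumption made for illustration: edges are `(u, v)` tuples used as keys of a cost dictionary `c`, penalties are a dictionary `pi`, and the function names are hypothetical.

```python
def tree_cost(tree_edges, c):
    """c(T): total cost of the edges in the tree T."""
    return sum(c[e] for e in tree_edges)

def solution_cost(vertices, tree_edges, root, c, pi):
    """cost_X: edge cost of T_X plus the penalties of the vertices outside T_X."""
    in_tree = {root}  # the root always belongs to the tree, even if T has no edges
    for u, v in tree_edges:
        in_tree.update((u, v))
    penalty = sum(pi[v] for v in vertices if v not in in_tree)
    return tree_cost(tree_edges, c) + penalty
```

Because the root is always counted as part of the tree, its infinite penalty never enters the sum.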

Goemans and Williamson Algorithm
Here we define a slightly modified version of the algorithm initially proposed by Goemans and Williamson in [15] (hereinafter the GW algorithm) for the sake of completeness. Then we use it as a building block in our algorithm in the next section. We introduce several lemmas stating the properties of the algorithm and its output. We defer the proofs of these lemmas to the appendix.
The algorithm consists of two phases. In the first phase, we simulate a continuous process of vertices growing components around themselves and coloring the edges adjacent to these components at a constant rate. In this process, we imagine each edge e with weight c(e) as a curve of length c(e). Each vertex v has a potential coloring duration equal to its penalty π(v). We assume that the root vertex root has π(root) = ∞, indicating infinite coloring potential. This process of coloring will give us a spanning tree, which we will then trim in the second phase to get a final tree.
During the algorithm, we keep a forest F of tentatively selected edges, a set C of connected components of this forest, and a subset ActS of active sets in C. For each component S in C, we will also store its coloring duration y_S. Initially, the forest F is empty, every vertex is an active set in C, and all y_S values are 0.
At any moment in the process, all active sets color their adjacent edges using the coloring potential of their vertices at the same rate. So, the amount of color on each edge is the total duration its endpoints have been in active sets. We define an edge as fully colored if the combined active time of its endpoints totals at least the length of the edge while they belong to different components. When such an edge between two sets becomes fully colored, it is added to F, and the two sets containing its endpoints are merged, with their coloring potentials summed together. An active set becomes inactive if it runs out of coloring potential. This means that this set and its subsets have used the coloring potential of all the vertices in the set. We call an edge getting added to F or an active set becoming inactive an event in the coloring process. Multiple events may happen simultaneously, and in that case, we handle them one by one in an arbitrary order. The addition of one edge in the order may prevent the addition of other fully colored edges. However, this can only happen if the latter edge forms a cycle in F, and therefore, the resulting components are independent of the order in which we handle the events. As the component containing root remains active and edges are only added between different components, F will eventually become a spanning tree of G. This marks the completion of the coloring phase.
In the second phase, we will select a subset of F as our Steiner tree and pay the penalties for the remaining vertices. We refer to any active set that becomes inactive as a dead set. Throughout the first phase, we maintain dead sets in DS to utilize them in the second phase. We categorize vertices into dead and live, where a dead vertex is any vertex contained in at least one dead set, and all other vertices are considered live. We store dead vertices in K and return them at the end of PCSTGW since they are used in our iterative algorithm in the next section. For any dead set S, if there is exactly one edge of F cut by S (i.e., |δ(S) ∩ F| = 1), we remove this edge and all the edges in F that have both endpoints in S. This effectively removes S from the tree and disconnects its vertices from the root. We repeat this process until no dead set with this property can be found. Figure 1 illustrates how dead sets may be removed.
As each operation in the second phase disconnects only the selected dead set from the root, the final result will be a tree T that contains all the live vertices, including root. We pay the penalties for the vertices outside the tree, which are all dead vertices belonging to the dead sets we removed in the second phase. Algorithm 1 provides pseudocode that implements this process.
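The second phase on its own is short enough to sketch directly. The following is a minimal illustration under assumptions, not the paper's Algorithm 1: the spanning tree F is given as a set of `(u, v)` tuples, the dead sets DS as a list of vertex sets, and the name `prune_dead_sets` is hypothetical. The quadratic rescanning is chosen for readability rather than efficiency.

```python
def prune_dead_sets(forest_edges, dead_sets):
    """Repeatedly remove dead sets S that cut exactly one edge of F.

    Returns the edges of the remaining tree T around the root.
    """
    F = set(forest_edges)
    changed = True
    while changed:
        changed = False
        for S in dead_sets:
            # Edges of F with exactly one endpoint in S, i.e. delta(S) ∩ F.
            cut = [(u, v) for (u, v) in F if (u in S) != (v in S)]
            if len(cut) == 1:
                # Drop the single cut edge and every edge of F inside S.
                inside = {(u, v) for (u, v) in F if u in S and v in S}
                F -= inside | set(cut)
                changed = True
    return F
```

On the path root, a, b, c with dead sets {c} and {b, c}, both sets are pruned and only the edge between root and a remains. A dead vertex that lies between the root and a live vertex cuts two edges of F and is therefore kept.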
To facilitate our analysis throughout the paper, we assume that each vertex is associated with a specific color. During the coloring process of an active set S, we assign each moment of coloring to a vertex v ∈ S with non-zero remaining coloring potential and utilize its color on the adjacent edges. For consistency, we choose vertex v based on a fixed ordering of the vertices in V where root comes first. So, a set S containing root will always assign its coloring to root. We note that a set S cannot use the color of a vertex that is already dead. Based on this assignment, we define the following values. For each vertex v, we define its total coloring duration y_v, and the coloring duration assigned to it by a set S as y_{Sv}:
• y_{Sv} = total coloring duration using color v in set S,
• y_v = ∑_{S: v∈S} y_{Sv} = total coloring duration using color v.
Note that ∑_{v∈S} y_{Sv} = y_S.
We bound the cost of both the chosen tree and the penalty of the dead vertices in the following lemmas.
Proofs of these lemmas are in the appendix given their similarity to [15].
Lemma 2. Let T be the tree returned by Algorithm 1. We can bound the total weight of this tree by
Lemmas 2, 3, and 4 immediately imply the following lemma.
Lemma 5. The total cost of the GW algorithm is bounded by
We note that Lemma 5 can be used to prove that the GW algorithm achieves a 2-approximation by showing that the optimal solution has cost at least ∑_{v∈V−{root}} y_v. We prove a stronger version of this fact in Lemma 13.
In addition to the above lemmas on the cost of the solution and its connection to the coloring, we also prove the following lemma. This lemma will help in our analysis in Section 3.1, where we use it to introduce an upper bound for the cost of the optimal Steiner tree connecting all the live vertices in a call to the GW algorithm.
Lemma 6. Let I = (G, root, π) and I′ = (G′, root, π) be instances of PCST, where G′ is obtained from G by adding a set of edges E_0 with weight 0 from root to a set of vertices U. Let y_v be the coloring duration for vertex v in a run of the GW algorithm on I, and let K be the set of dead vertices in this run. Let y′_v and K′ be the corresponding values when running the GW algorithm on instance I′ using the same order to assign coloring duration to vertices. We have
• y′_v ≤ y_v for every vertex v ∈ V,
• y′_v = 0 for every vertex v ∈ U,
• K′ ⊆ K.
Proof. We will prove these facts by comparing the run of the GW algorithm on instances I and I′. We can identify "moments" in the first phase of the algorithm in these runs by the total coloring duration using the color of root, which is always active, and look at the same moments across these two runs. Let C and C′ refer to the set of components in the runs on I and I′ respectively. We prove the invariant that at any moment in the run of the GW algorithm on I′, for any component S ∈ C′ such that root ∉ S, S would also be a component in C at the same moment of the algorithm on instance I. In addition, S would be active for instance I′ if and only if it is active in instance I. Figure 2 illustrates how the components in the runs can look.
Initially, at moment t = 0, before any events are applied, the invariant holds as we start with each vertex being an isolated component in both cases. The invariant also holds after events at t = 0 are processed: we can assume that the shared events are handled first in both runs, with the second run also having additional events corresponding to the edges in E_0 being fully colored, which will only merge components with the component containing root.
We will now prove that if the invariant holds at moment t, it will also hold at the next moment t′ > t where an event happens in the second run. Combined with the invariant being true at time t = 0, this will prove the invariant for the duration of the algorithm, as the invariant can only break when an event occurs. Note that unless otherwise specified, when referring to a moment t, we consider the state of the runs after the events at moment t have been applied.
Let t be the current moment, where we know the invariant holds. Let t′ be the first moment after t when an event happens in the run for instance I′. We first claim that between t and t′, there can be no events in the run for I that affect a component S ∈ C′ not containing root. Assume otherwise that such an event exists and the first event of this kind occurs at moment t′′ < t′. There are two possible cases:
• The event corresponds to a fully colored edge getting added. One of the endpoints of this edge must be in set S. Let S′ be the set in C′ containing the other endpoint. If root ∉ S′, then at each moment until t′′, the component containing each endpoint has been the same between I and I′. In addition, these components have been active at the same moments. So, the amount of coloring on the edge is the same in both runs, and this edge should become fully colored in the run on I′ at time t′′ too. This is in contradiction with the fact that the first event after moment t for I′ is at time t′ > t′′.
We arrive at the same contradiction if S′ includes root. In this case, the coloring on the edge would have been the same in both runs until the other endpoint joins a component including root. Afterward, the coloring from the endpoint in S would be the same between the two runs, and the other end is always in an active set. So, the coloring on this edge in I′ at moment t′′ is at least as much as in I and so it must be fully colored by t′′. This also cannot happen before t since S is a component in C′, so we again get an event between t and t′, which is a contradiction.
• The event corresponds to the set S becoming inactive. Since S and its subsets have been active sets at the same moments in both runs, if S becomes inactive in I at time t′′ it will also become inactive in I′ at the same moment, as a set becoming inactive only depends on the coloring duration of its subsets. This contradicts our assumption of the first event for I′ occurring at t′ > t′′.
This shows that the invariant holds just before t′. We now show that events at t′ will not break this invariant.
We note that multiple events may happen at the same moment, but as previously mentioned the order of considering the events does not change the final components. So, we assume that relevant events are taken in the same order in both runs and consider the effect of events at time t′ one at a time. There are again two cases for the event:
• The event corresponds to an edge becoming fully colored. Let S and S′ be the components in C′ containing the endpoints of this edge. If neither set contains root, then the same components contain these endpoints in C, and by the same argument as the previous case, this edge becomes fully colored at time t′ in the run for I. So, we can add the edge in both runs, and the invariant will still hold for the new components. Otherwise, since the merged component will contain root, its addition to C′ does not affect the invariant and it will again hold.
• The event corresponds to a set S becoming inactive. This set cannot contain root as the set containing root has unlimited potential and never becomes inactive. So, by the same argument we used previously, this set must also be in C and become inactive at the same time.
This proves that the desired invariant will hold at all moments in the run for I′. Now, let y′_S, y′_{Sv}, and y′_v denote the coloring duration values for this run and y_S, y_{Sv}, and y_v be the same values for the run on I. Based on the above invariant, we will show that y′_v ≤ y_v for all vertices v ≠ root. For any non-root vertex v, before it joins a component containing root in the run on I′, it belongs to the same component in both runs at any moment. In addition, these components will be active sets at the same moments. This is also true for all the vertices that are in the same component as v in any of these moments. So, the y_S and y′_S values for these components up until this moment will be identical. Consequently, the y′_{Sv} values and therefore y′_v will also be equal to their counterparts in the other run, as the same ordering is used to assign coloring duration. After this moment, y′_v will not increase anymore, as all coloring for the component will be assigned to root, and y_v can only increase further. So, y′_v ≤ y_v for all v ≠ root. We can also show that y′_root ≤ y_root. Consider the moment the run on I ends. At this moment, the only component in C is the one containing root. Based on the invariant, we can infer that this is also the only component in C′. So, the run on I′ ends at least as soon as the run on I. But as the component containing root is always active and assigns its coloring to root, the y value for the root is exactly the total duration of the process. So, y′_root ≤ y_root. In addition, as any vertex v ∈ U can immediately merge with root using the added 0-weight edge, we will have y′_v = 0 for these vertices.
We can see from our proof of the invariant that any dead set in the run on I′ will also be a dead set in the run on I. Therefore, K′ ⊆ K. This completes our proof of the lemma.
Algorithm 1: PCSTGW
Output: Subtree T of G containing root, alongside a set K of dead vertices.
procedure PCSTGW(I = (G, root, π))
    Initialize F as an empty forest
    ...
    for S ∈ ActS do
        ...
    Find a set S minimizing ∆_1
    ...
    Extract T from F by repeatedly removing dead sets in DS that cut a single edge in F
    return (T, K)

The Iterative Algorithm
In this section, we present our iterative algorithm, which is described in Algorithm 2. In Section 3.1 we give an analysis for this algorithm.
Our algorithm makes use of the PCSTGW procedure from Algorithm 1 as a fundamental component. Additionally, we employ an approximation algorithm for the Steiner tree problem to improve the approximation factor. This can be any approximation algorithm for the Steiner tree problem. We denote the approximation factor for this algorithm as p. Whenever we require this p-approximation solution for the Steiner tree, we invoke the procedure named SteinerTree. As our final approximation factor will depend on p, we will use the current best approximation algorithm for Steiner Tree [12] with p = ln(4) + ε in our analysis. In addition, our algorithm depends on a constant β which we will fix later in Section 3.2 to optimize the approximation ratio.
Our algorithm, as described in Algorithm 2, identifies three solutions for the given PCST instance I = (G, root, π). Subsequently, we opt for the solution with the minimum cost as the final solution for instance I.
First, we construct the instance $I_\beta = (G, root, \pi_\beta)$ from $I$ by replacing $\pi_v$ with $\pi_v / \beta$ for all vertices. One solution for instance $I$, named "GW" and denoted by $T_{GW}$, is obtained by invoking procedure PCSTGW (Line 4) on instance $I_\beta$, buying the edges in $T_{GW}$, and paying penalties for the vertices in $\overline{V(T_{GW})}$, i.e., the vertices outside the tree. From the definition of $I_\beta$, we can conclude that $\pi(\overline{V(T_{GW})}) = \beta \, \pi_\beta(\overline{V(T_{GW})})$. As stated in Section 2, in addition to $T_{GW}$, procedure PCSTGW also returns a set of vertices $K$, which represents the dead vertices of the coloring process.
Another solution for instance I named "ST" is obtained by retrieving a Steiner tree T ST in graph G for the set of terminals L ≔ V \ K which are the live vertices in the output of the GW algorithm.This solution is found using the procedure SteinerTree and is therefore a p-approximation of the minimum Steiner tree on this terminal set.We pay the penalties for the vertices outside T ST , which will be a subset of K.
If $K$ is empty, the algorithm immediately returns the solution with the lower total cost between the two obtained solutions. Otherwise, a third solution named "IT", denoted as $T_{IT}$, is obtained through a recursive call on a simplified instance $R$. The simplified instance is formed by adjusting penalties: we set the penalties of the vertices in $K$, which are the dead vertices in the result of the PCSTGW procedure, to zero while maintaining the penalties of the live vertices $L$, as indicated in Lines 11 through 12.
As a final step, the algorithm simply selects and returns the solution with the lowest cost. To help with the comparison of these three solutions, the algorithm calculates the values $cost_{GW} = c(T_{GW}) + \pi(\overline{V(T_{GW})})$, $cost_{ST} = c(T_{ST}) + \pi(\overline{V(T_{ST})})$, and $cost_{IT} = c(T_{IT}) + \pi(\overline{V(T_{IT})})$, representing the costs of the solutions (as indicated in Lines 5, 8, and 14).
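The control flow described above can be sketched as follows. This is a minimal sketch rather than the authors' implementation: `Tree`, `total_cost`, and `ipcst` are illustrative names, and `pcst_gw` and `steiner_tree` are hypothetical callables standing in for PCSTGW and SteinerTree (the graph $G$ and the root are assumed to be captured inside them). The root is assumed never to be reported as dead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Hashable, Set, Tuple

Vertex = Hashable


@dataclass(frozen=True)
class Tree:
    nodes: FrozenSet[Vertex]  # vertices spanned by the tree, root included
    cost: float               # total edge weight c(T)


def total_cost(tree: Tree, penalty: Dict[Vertex, float]) -> float:
    """c(T) plus the penalties of every vertex the tree does not span."""
    return tree.cost + sum(p for v, p in penalty.items() if v not in tree.nodes)


def ipcst(
    penalty: Dict[Vertex, float],
    beta: float,
    pcst_gw: Callable[[Dict[Vertex, float]], Tuple[Tree, Set[Vertex]]],
    steiner_tree: Callable[[Set[Vertex]], Tree],
) -> Tree:
    # GW solution: run the (hypothetical) GW oracle on penalties scaled by 1/beta.
    scaled = {v: p / beta for v, p in penalty.items()}
    t_gw, dead = pcst_gw(scaled)

    # ST solution: approximate Steiner tree on the live vertices L = V \ K.
    live = set(penalty) - set(dead)
    t_st = steiner_tree(live)

    candidates = [t_gw, t_st]
    if sum(penalty[v] for v in dead) > 0:
        # IT solution: recurse after zeroing the penalties of the dead vertices.
        reduced = {v: (0.0 if v in dead else p) for v, p in penalty.items()}
        candidates.append(ipcst(reduced, beta, pcst_gw, steiner_tree))

    # All candidates are compared under the original penalties of this call.
    return min(candidates, key=lambda t: total_cost(t, penalty))
```

Each recursive call zeroes at least one positive penalty, so the sketch terminates after at most $|V|$ levels regardless of which oracles are plugged in.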

Analysis
For an arbitrary instance $I = (G, root, \pi)$ of PCST, our aim is to analyze the approximation factor achieved by Algorithm 2. We compare the output of IPCST on $I$ with an optimal solution OPT for the instance $I$. We denote the tree selected in OPT as $T_{OPT}$. Then, the cost of OPT is given by $cost_{OPT} = c(T_{OPT}) + \pi(\overline{V(T_{OPT})})$.
We use an inductive approach to analyze the algorithm, where we focus on a single call of the algorithm and find upper bounds for each of our three solutions and a lower bound for the optimal solution OPT. To find these lower and upper bounds, we make use of the coloring done by the GW algorithm on instance $I_\beta$ and the values $y_S$, $y_{Sv}$, and $y_v$ relating to this coloring process. In addition, we establish an upper bound for the solution obtained from the recursive call based on the induction hypothesis. In our inductive analysis, we consider only one individual call to the procedure at a time, to analyze either the induction base or the induction step. So, all the variables used in the analysis relate to the algorithm's variables in the specific call we are analyzing. This includes the trees $T_{GW}$, $T_{ST}$, and $T_{IT}$, and the live and dead vertices $L$ and $K$.

Algorithm 2: IPCST
Output: Subtree T of G containing root.
1: procedure IPCST(I = (G, root, π))
2:     Construct $\pi_\beta$ by setting $\pi_\beta(v) = \pi(v) / \beta$ for every vertex $v$.
3:     Construct the PCST instance $I_\beta = (G, root, \pi_\beta)$.
4:     $T_{GW}, K \leftarrow$ PCSTGW($I_\beta$)
5:     $cost_{GW} \leftarrow c(T_{GW}) + \pi(\overline{V(T_{GW})})$
6:     $L \leftarrow V \setminus K$
7:     $T_{ST} \leftarrow$ SteinerTree($G, L$)
8:     $cost_{ST} \leftarrow c(T_{ST}) + \pi(\overline{V(T_{ST})})$
9:     if $\pi(K) = 0$ then
10:        return $T_X$ where $cost_X$ is minimum among $X \in \{GW, ST\}$
11:    Construct $\pi'$ from $\pi$ by setting the penalties of the vertices in $K$ to 0.
12:    Construct the PCST instance $R = (G, root, \pi')$.
13:    $T_{IT} \leftarrow$ IPCST($R$)
14:    $cost_{IT} \leftarrow c(T_{IT}) + \pi(\overline{V(T_{IT})})$
15:    return $T_X$ where $cost_X$ is minimum among $X \in \{GW, ST, IT\}$
We note that in our induction, we do not initially know the value of the approximation factor $\alpha$ which we want to prove the algorithm achieves. Instead, we use $\alpha$ as a variable in our inequalities, and this leads to a system of constraints involving $\alpha$ that need to be satisfied for our induction to prove an $\alpha$-approximation guarantee. These inequalities involve not only the approximation factor $\alpha$ which we seek to find but also the parameter $\beta$ which defines the behavior of our algorithm. Throughout the analysis, we assume that $\beta \le 2$. We justify this assumption in Subsection 4.1 by showing that values of $\beta > 2$ cannot lead to an approximation factor better than 2. To determine our approximation factor $\alpha$, we consider the range $p \le \alpha \le 2$. This range is chosen because we cannot assume that our algorithm performs better than the Steiner tree algorithm, which we use as a component. Additionally, our solution is guaranteed to be at least as good as the 2-approximation provided by the GW algorithm.
In the first step, we categorize non-root vertices based on the output of PCSTGW(I β ) and OPT.This categorization helps us establish more precise bounds for the solutions by enabling a more tailored analysis within each category.
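Definition 7. For an instance $I$, OPT partitions the vertices into two sets: $V(T_{OPT})$ and $\overline{V(T_{OPT})}$. PCSTGW($I_\beta$) also partitions the vertices into two sets: $L$ and $K$. We define four sets to categorize the vertices, excluding root, based on these two partitions:
$$A = (V(T_{OPT}) \cap L) \setminus \{root\}, \quad B = V(T_{OPT}) \cap K, \quad C = \overline{V(T_{OPT})} \cap L, \quad D = \overline{V(T_{OPT})} \cap K.$$
Using the coloring scheme of PCSTGW($I_\beta$), we introduce the following values to represent the total duration of coloring with vertices in these sets:
$$r_A = \sum_{v \in A} y_v, \quad r_B = \sum_{v \in B} y_v, \quad r_C = \sum_{v \in C} y_v, \quad r_D = \sum_{v \in D} y_v.$$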
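Subsequently, we define $r_{B'}$, $r_{B''}$, $r_{D'}$, and $r_{D''}$ as the total duration of coloring with vertices in sets $B'$, $B''$, $D'$, and $D''$, respectively. We call a set $S$ a single-edge set if exactly one edge of $T_{OPT}$ crosses it, i.e., $|\delta(S) \cap T_{OPT}| = 1$, and a multi-edge set if more than one edge of $T_{OPT}$ crosses it. We set the total duration of coloring with vertices in $B$ in single-edge sets to $b_1$, and the total duration of coloring with vertices in $B$ in multi-edge sets to $b_2$. These definitions are as follows:
$$b_1 = \sum_{S : |\delta(S) \cap T_{OPT}| = 1} \; \sum_{v \in S \cap B} y_{Sv}, \qquad b_2 = \sum_{S : |\delta(S) \cap T_{OPT}| > 1} \; \sum_{v \in S \cap B} y_{Sv}.$$
Note that $r_B = b_1 + b_2$, as every vertex in $B$ is connected to root in the optimal solution. Therefore, at each moment of coloring involving a vertex in $B$, the corresponding active set cuts at least one edge belonging to the path from that vertex to root in the optimal solution.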
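The categorization of Definition 7 can be illustrated with a short sketch. The function names and the input representation (plain Python sets and a dictionary `y` of coloring durations) are assumptions made for illustration; the labeling follows the relations used later in the text, namely $A \cup B = V(T_{OPT}) - \{root\}$ and $K = B \cup D$.

```python
def categorize(vertices, root, opt_tree_vertices, dead):
    """Split the non-root vertices into the four sets A, B, C, D."""
    cats = {"A": set(), "B": set(), "C": set(), "D": set()}
    for v in vertices:
        if v == root:
            continue
        in_opt = v in opt_tree_vertices   # connected to root by OPT
        is_dead = v in dead               # dead in the run of PCSTGW on I_beta
        if in_opt:
            key = "B" if is_dead else "A"
        else:
            key = "D" if is_dead else "C"
        cats[key].add(v)
    return cats


def durations(cats, y):
    """Total coloring duration r_X of each category X, given y[v] per vertex."""
    return {name: sum(y[v] for v in members) for name, members in cats.items()}
```

For example, with `opt_tree_vertices = {"r", 1, 2}` and `dead = {2, 4}` on vertices `{"r", 1, 2, 3, 4}`, vertex 1 falls in A, 2 in B, 3 in C, and 4 in D.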
Lemma 10. For any vertex $v \in V$, we have $\beta y_v \le \pi(v)$. Furthermore, if $v \in B \cup D$, which means $v$ is a dead vertex, we have $\beta y_v = \pi(v)$.
Proof. Since we run PCSTGW on $\pi_\beta$ in Line 4, we can use Lemma 3 with penalties $\pi_\beta$. That means, for any vertex $v \in V$, we have $y_v \le \pi_\beta(v)$, and if $v$ is a dead vertex, we have $y_v = \pi_\beta(v)$. Since in Line 2 we set $\pi_\beta(v) = \pi(v) / \beta$, we can conclude the lemma.
Now, for a given instance $I$, we derive lower bounds on the optimal solution using the terms defined earlier. We bound the optimal solution with an approach similar to the one used in [3].
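Lemma 11. We can bound the cost of the optimal solution in terms of the cost of its tree as follows:
$$cost_{OPT} \ge c(T_{OPT}) + \beta r_C + \beta r_D.$$
Proof. According to the definition of cost in PCST, we can determine the cost of the optimal solution by separately calculating the weight of its tree and the penalties it pays. Additionally, based on Definition 7, we have $\overline{V(T_{OPT})} = C \cup D$. Utilizing these two observations together with Lemma 10, we can establish a lower bound for $cost_{OPT}$ as follows:
$$cost_{OPT} = c(T_{OPT}) + \pi(\overline{V(T_{OPT})}) = c(T_{OPT}) + \sum_{v \in C} \pi(v) + \sum_{v \in D} \pi(v) \ge c(T_{OPT}) + \sum_{v \in C} \beta y_v + \sum_{v \in D} \beta y_v = c(T_{OPT}) + \beta r_C + \beta r_D.$$
Based on Lemma 11, we can easily conclude the following corollary, which bounds the weight of the optimal solution's tree using the cost of the optimal solution.
Corollary 12. We can bound the cost of the optimal solution's tree as follows:
$$c(T_{OPT}) \le cost_{OPT} - \beta r_C - \beta r_D.$$
Now we use Lemma 11 to expand the bound of the optimal solution.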
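Lemma 13. We can establish a lower bound for the optimal solution as follows:
$$cost_{OPT} \ge r_A + b_1 + 2b_2 + \beta r_C + \beta r_D.$$
Proof. First, we demonstrate that $r_A + b_1 + 2b_2$ is a lower bound for $c(T_{OPT})$. To achieve this, for any set $S$, we define $d_{OPT}(S)$ as the number of edges of $T_{OPT}$ that are colored by $S$. Given that each portion of an edge is colored at most once, and each set $S \subseteq V$ colors $d_{OPT}(S) \cdot y_S$ of the optimal solution, we can derive a lower bound for $c(T_{OPT})$ based on the proportion of the colored edges in $T_{OPT}$:
$$c(T_{OPT}) \ge \sum_{S \subseteq V} d_{OPT}(S) \, y_S = \sum_{S \subseteq V} d_{OPT}(S) \sum_{v \in S} y_{Sv} = \sum_{v \in V} \sum_{S \ni v} d_{OPT}(S) \, y_{Sv} \quad \text{(change the order of summations)}$$
$$\ge \sum_{v \in A} \sum_{S \ni v} d_{OPT}(S) \, y_{Sv} + \sum_{v \in B} \sum_{S \ni v} d_{OPT}(S) \, y_{Sv}.$$
Furthermore, for any vertex $v$ in $A$ or $B$, based on Definition 7, there exists a path from $v$ to root in $T_{OPT}$. Also, for every set $S \subseteq V$ where $y_{Sv} > 0$, we know $root \notin S$, as otherwise all coloring of set $S$ would be assigned to root. Using these two observations, we can infer that at least one edge of $T_{OPT}$ is colored by $S$, resulting in $d_{OPT}(S) \ge 1$.
For vertices in $A$, we have:
$$\sum_{v \in A} \sum_{S \ni v} d_{OPT}(S) \, y_{Sv} \ge \sum_{v \in A} \sum_{S \ni v} y_{Sv} = \sum_{v \in A} y_v = r_A.$$
For vertices in $B$, we have $d_{OPT}(S) = 1$ on single-edge sets and $d_{OPT}(S) \ge 2$ on multi-edge sets, so:
$$\sum_{v \in B} \sum_{S \ni v} d_{OPT}(S) \, y_{Sv} \ge b_1 + 2b_2.$$
Combining all together, we obtain:
$$c(T_{OPT}) \ge r_A + b_1 + 2b_2.$$
By using this bound along with Lemma 11, we can bound $cost_{OPT}$:
$$cost_{OPT} \ge c(T_{OPT}) + \beta r_C + \beta r_D \ge r_A + b_1 + 2b_2 + \beta r_C + \beta r_D.$$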
Next, we bound the GW solution.
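Lemma 14. The following bound holds for the cost of the solution returned by PCSTGW($I_\beta$) for instance $I$:
$$cost_{GW} \le 2r_A + 2b_1 + 2b_2 + 2r_C + 2r_D.$$
Proof. According to Line 5 of Algorithm 2, we have $cost_{GW} = c(T_{GW}) + \pi(\overline{V(T_{GW})})$. To start, based on Definition 7 the live non-root vertices are $L \setminus \{root\} = A \cup C$, and based on Definition 8 the dead vertices connected to root are $B' \cup D'$. Then, we can combine them to obtain
$$V(T_{GW}) \setminus \{root\} = A \cup C \cup B' \cup D'.$$
Applying this observation to Lemma 2 results in a bound for $c(T_{GW})$:
$$c(T_{GW}) \le 2 \sum_{v \in V(T_{GW}) \setminus \{root\}} y_v = 2(r_A + r_C + r_{B'} + r_{D'}). \quad \text{(Definitions 7 and 8)}$$
Additionally, in GW, we pay penalties for the vertices that are not connected to the root, all of which are dead according to Lemma 4. Consequently, we can deduce that:
$$\pi(\overline{V(T_{GW})}) = \pi(B'' \cup D'') = \sum_{v \in B'' \cup D''} \beta y_v = \beta (r_{B''} + r_{D''}) \le 2(r_{B''} + r_{D''}), \quad \text{(Lemma 10)}$$
where the last step uses the assumption $\beta \le 2$, which we make throughout the algorithm. In conclusion, we can establish an upper bound for $cost_{GW}$:
$$cost_{GW} \le 2(r_A + r_C + r_{B'} + r_{D'}) + 2(r_{B''} + r_{D''}) = 2r_A + 2r_B + 2r_C + 2r_D = 2r_A + 2b_1 + 2b_2 + 2r_C + 2r_D.$$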
We restate this upper bound in terms of the variable α and the cost of the optimal solution cost OPT using Lemma 13.
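Lemma 15. The following bound holds for the cost of the solution returned by PCSTGW($I_\beta$) for instance $I$:
$$cost_{GW} \le \alpha \cdot cost_{OPT} + (2 - \alpha) r_A + (2 - \alpha) b_1 + (2 - 2\alpha) b_2 + (2 - \alpha\beta) r_C + (2 - \alpha\beta) r_D.$$
Proof. We can directly apply Lemma 13 to the bound obtained in Lemma 14.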
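For readers who want the intermediate step spelled out, the following is a sketch of the rewriting. It assumes that the bound of Lemma 14 reads $cost_{GW} \le 2r_A + 2b_1 + 2b_2 + 2r_C + 2r_D$ and that the lower bound of Lemma 13 reads $cost_{OPT} \ge r_A + b_1 + 2b_2 + \beta r_C + \beta r_D$; both forms are reconstructed from the surrounding definitions rather than quoted.

```latex
\begin{align*}
cost_{GW}
  &\le 2r_A + 2b_1 + 2b_2 + 2r_C + 2r_D \\
  &= \alpha \,(r_A + b_1 + 2b_2 + \beta r_C + \beta r_D) \\
  &\quad + (2-\alpha)\, r_A + (2-\alpha)\, b_1 + (2-2\alpha)\, b_2
         + (2-\alpha\beta)\, r_C + (2-\alpha\beta)\, r_D \\
  &\le \alpha \cdot cost_{OPT}
     + (2-\alpha)\, r_A + (2-\alpha)\, b_1 + (2-2\alpha)\, b_2
     + (2-\alpha\beta)\, r_C + (2-\alpha\beta)\, r_D .
\end{align*}
```

The equality only regroups terms, and the final inequality replaces the bracketed sum by $cost_{OPT}$, which is valid because $\alpha \ge 0$.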
Next, we bound the cost of the ST solution. For a set $S$, let $T_{OPT'_S}$ denote the minimum cost Steiner tree on this set. In the following lemma, we relate the cost of the ST solution to the cost of $T_{OPT'_L}$.
Lemma 16. For instance $I$, we can bound the cost of the solution returned by ST as follows:
$$cost_{ST} \le p \cdot c(T_{OPT'_L}) + \beta r_B + \beta r_D.$$
Proof. Since in $T_{ST}$ we are connecting every vertex in $L$ to root using a Steiner tree algorithm with an approximation factor of $p$, the cost of the tree $T_{ST}$ can be bounded by
$$c(T_{ST}) \le p \cdot c(T_{OPT'_L}).$$
Moreover, as all vertices in $L$ are connected to root, the vertices for which we need to pay penalties in this solution form a subset of $K$, i.e., $\overline{V(T_{ST})} \subseteq K$. Furthermore, by Definition 7 we have $B \cup D = K$. Now, we can bound the penalty paid by the ST solution using Lemma 10:
$$\pi(\overline{V(T_{ST})}) \le \pi(K) = \pi(B \cup D) = \sum_{v \in B \cup D} \beta y_v = \beta r_B + \beta r_D.$$
Finally, we use these bounds to complete the proof:
$$cost_{ST} = c(T_{ST}) + \pi(\overline{V(T_{ST})}) \le p \cdot c(T_{OPT'_L}) + \beta r_B + \beta r_D.$$
We now provide an upper bound for the cost of $T_{OPT'_L}$ based on the cost of $T_{OPT}$ to obtain our main upper bound for ST.

Lemma 17. For the minimum cost Steiner tree $T_{OPT'_L}$ on $L$, we have
$$c(T_{OPT'_L}) \le c(T_{OPT}) + 2r_C + 2r_D.$$
Proof. We construct a new instance $I'_\beta = (G', root, \pi_\beta)$, where $G'$ is obtained from $G$ by adding a set $E_0$ of edges of weight 0 from root to every vertex in $U = A \cup B = V(T_{OPT}) - \{root\}$, and run the GW algorithm on it. Let $T'_{GW}$ be the resulting tree and $y'_v$ be the coloring duration for the vertices in this process, assuming we assign the colors in the same way as we did when running the GW algorithm on $I_\beta$. By Lemma 6, $y'_v \le y_v$ for all vertices in $C \cup D$. In addition, we have $y'_v = 0$ for all vertices in $U = A \cup B$. Then, using Lemma 2 we can bound the cost of $T'_{GW}$:
$$c(T'_{GW}) \le 2 \sum_{v \in C \cup D} y'_v \le 2r_C + 2r_D.$$
Let $K'$ be the set of dead vertices returned by the GW algorithm on $I'_\beta$. Based on Lemma 6, we have $K' \subseteq K$. Therefore, as the vertices in $A \cup C \cup \{root\} = L$ are not part of $K$, they cannot be part of $K'$ either and must be live vertices in this run. Lemma 4 means that these vertices are connected by $T'_{GW}$. If we remove any edges in $E_0$ from $T'_{GW}$ and instead add $T_{OPT}$, which is a spanning tree on $A \cup B \cup \{root\}$, all the vertices in $V(T'_{GW})$ remain connected. So, we get a connected subgraph of $G$ that connects $L$. The cost of this subgraph is at most
$$c(T_{OPT}) + c(T'_{GW}) \le c(T_{OPT}) + 2r_C + 2r_D.$$
As this subgraph connects $L$, its cost gives us an upper bound on the cost of the minimum Steiner tree on these vertices. So we have $c(T_{OPT'_L}) \le c(T_{OPT}) + 2r_C + 2r_D$.
We combine the last two lemmas to introduce an upper bound for the ST solution.We again state this upper bound in terms of cost OPT and α.Here, we rely on the fact that α ≥ p to add a non-negative value to an initial upper bound based on Lemmas 16 and 17.
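Lemma 18. For instance $I$, we can bound the cost of the solution returned by ST as follows:
$$cost_{ST} \le \alpha \cdot cost_{OPT} + (p - \alpha) r_A + (\beta + p - \alpha) b_1 + (\beta + 2p - 2\alpha) b_2 + (2p - \alpha\beta) r_C + (2p + \beta - \alpha\beta) r_D.$$
Proof. By Lemmas 16 and 17 and Corollary 12, together with $r_B = b_1 + b_2$,
$$cost_{ST} \le p \, (c(T_{OPT}) + 2r_C + 2r_D) + \beta (b_1 + b_2 + r_D) \le p \cdot cost_{OPT} + \beta b_1 + \beta b_2 + (2p - p\beta) r_C + (2p + \beta - p\beta) r_D.$$
Since $\alpha \ge p$, Lemma 13 gives $(\alpha - p)(cost_{OPT} - r_A - b_1 - 2b_2 - \beta r_C - \beta r_D) \ge 0$, and adding this non-negative quantity to the right-hand side yields the claimed bound.
Now, assume that we want to show that the algorithm achieves an approximation factor of $\alpha$. To prove this by induction, we need to show two things. First, we need to show that in the base case, where the dead set $K$ returned by the GW algorithm has penalty 0 and we do not make a recursive call, our solution is an $\alpha$-approximation. Second, we have to demonstrate the induction step. This means that we have to show that if our recursive call on instance $R$ returns an $\alpha$-approximation for this instance, the final returned solution is also an $\alpha$-approximation. If these two steps are accomplished, then by induction on the number of vertices with non-zero penalties (which decreases with every recursive call), we can prove that our algorithm achieves an $\alpha$-approximation.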
So far, we do not know the value of $\alpha$, so we cannot prove the induction steps directly. Instead, we show that if $\alpha$ satisfies certain constraints, then both the base case and the step of the induction can be proven for that value of $\alpha$, and therefore our algorithm gives us an $\alpha$-approximation. These constraints are obtained by thinking of $\alpha$ as a variable and then trying to prove the induction base and the induction step for $\alpha$.
Minimizing α in this system of constraints will give us an upper bound on the approximation factor of our algorithm.
In the following, we first assume that the recursive call on $R$ is an $\alpha$-approximation, and bound the iterative solution using this assumption. Then, in Section 3.2 we combine the bounds for the different solutions to find a system of constraints that restrict $\alpha$. We also consider the constraints that arise from the base case being an $\alpha$-approximation, which turn out to form a subset of the former constraints. Finally, we find the minimum value of $\alpha$ that can satisfy these constraints to obtain our approximation guarantee.
We start with the next lemma, which bounds the cost of the iterative solution's output, assuming that the recursive call returns an α approximate solution for instance R. Here, OPT R denotes the optimal solution for the PCST instance R.
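Lemma 19. For instance $I$, the cost of the iterative solution, denoted as $cost_{IT}$, can be bounded as follows:
$$cost_{IT} \le \alpha \cdot cost_{OPT_R} + \beta r_B + \beta r_D,$$
assuming that the recursive call on instance $R$ returns an $\alpha$-approximate solution.
Proof. Based on our assumption, IPCST($R$) returns a solution that is an $\alpha$-approximation of the optimal solution of instance $R$, which we indicate by $OPT_R$. This gives us the following bound:
$$c(T_{IT}) + \pi'(\overline{V(T_{IT})}) \le \alpha \cdot cost_{OPT_R}.$$
However, as $cost_{IT} = c(T_{IT}) + \pi(\overline{V(T_{IT})})$, we need to establish the relationship between $\pi(\overline{V(T_{IT})})$ and $\pi'(\overline{V(T_{IT})})$. The only difference between these functions lies in setting the penalty for vertices in $K = B \cup D$ to zero in $\pi'$, as indicated in Line 11. Thus, using Lemma 10, we can conclude that
$$\pi(\overline{V(T_{IT})}) \le \pi'(\overline{V(T_{IT})}) + \pi(K) = \pi'(\overline{V(T_{IT})}) + \beta r_B + \beta r_D.$$
By combining these inequalities, we get
$$cost_{IT} \le \alpha \cdot cost_{OPT_R} + \beta r_B + \beta r_D.$$
Lemma 20. For an instance $I$, we can remove a set of edges with a total length of at least $b_1$ from $T_{OPT}$ in such a way that the vertices in $A$ remain connected to root.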
Proof. Consider a moment of coloring with the color of a vertex $v \in B$ in a single-edge set $S \subseteq V$. Given that we are coloring with $v$ at this moment, the vertex is still a live vertex. However, since $v$ is in $B$, it becomes dead at some moment of the algorithm. Since all the vertices in $S$ remain in the same component until the end of the algorithm, at the moment $v$ becomes dead, all vertices in $S$ also become dead. That means every vertex in $S$ is either in $B$ or $D$, i.e., $S \subseteq B \cup D = K$.
Since $S$ is a single-edge set, there is only one edge of $T_{OPT}$ that cuts this set. Let us assume that this edge is $e$, i.e., $\delta(S) \cap T_{OPT} = \{e\}$. Removing edge $e$ from $T_{OPT}$ only disconnects vertices in $S$ from root, since $S$ is a single-edge set and paths in $T_{OPT}$ from root to vertices outside of $S$ do not pass through $e$.
If we remove all such edges from $T_{OPT}$, the total cost of the removed edges is at least $b_1$. This is due to the fact that the coloring on these edges from single-edge sets assigned to the vertices in $B$ is equal to $b_1$, and the coloring on each edge is at most its weight. Note that each single-edge set colors exactly one edge of the optimal solution at each moment. So, we can remove edges with a total length of at least $b_1$ from $T_{OPT}$ without disconnecting vertices in $A$ from root.
Lemma 21. For an instance $I$, we can bound the cost of the optimal solution for instance $R$ by
$$cost_{OPT_R} \le cost_{OPT} - \beta r_D - b_1,$$
where $R$ is created at Line 12 of IPCST($I$).
Proof. To prove this lemma, we start by showing that there is a solution for instance $R$ that costs at most $cost_{OPT} - \beta r_D - b_1$. Since $OPT_R$ is the optimal solution of instance $R$, its cost does not exceed the cost of the solution we are constructing. This will complete the proof of the lemma. To construct the mentioned solution, we take the optimal solution of instance $I$, which we indicate by OPT, and remove extra edges from its tree $T_{OPT}$. Additionally, we do not need to pay penalties for vertices whose penalty is set to zero in $\pi'$ at Line 11 for instance $R$.
Let us start with the tree $T_{OPT}$. Using Lemma 20, we can remove a set of edges from $T_{OPT}$ with a total length of at least $b_1$ without disconnecting vertices in set $A$ from root.
Moreover, the optimal solution pays penalties for vertices in set $C \cup D$. However, instance $R$ has been constructed by assigning zero to the penalty of vertices in set $K$, which includes the vertices in set $D$. Therefore, the penalty that we pay for vertices in $D$ in the optimal solution is not required to be paid in $OPT_R$. This deducts $\pi(D)$ from the cost of the optimal solution, which is equal to $\beta r_D$ according to Lemma 10. This completes the proof of this lemma.
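Lemma 22. For instance $I$, the output of the iterative solution can be bounded as follows:
$$cost_{IT} \le \alpha \cdot cost_{OPT} + (\beta - \alpha) b_1 + \beta b_2 + (\beta - \alpha\beta) r_D,$$
assuming that the recursive call on instance $R$ returns an $\alpha$-approximate solution.
Proof. We utilize Lemma 21 to modify the terms of the bound in Lemma 19, using $r_B = b_1 + b_2$:
$$cost_{IT} \le \alpha \cdot cost_{OPT_R} + \beta r_B + \beta r_D \le \alpha (cost_{OPT} - \beta r_D - b_1) + \beta b_1 + \beta b_2 + \beta r_D = \alpha \cdot cost_{OPT} + (\beta - \alpha) b_1 + \beta b_2 + (\beta - \alpha\beta) r_D.$$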

Finding The Approximation Factor
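Now that we have bounded $cost_{GW}$, $cost_{ST}$, and $cost_{IT}$, we can determine an appropriate value for $\alpha$ such that, during each call of IPCST on instance $I$, the minimum of $cost_{GW}$, $cost_{ST}$, and $cost_{IT}$ is at most $\alpha \cdot cost_{OPT}$.
To achieve this, we assign weights to each solution in a way that the weighted average of these three bounds is at most $\alpha \cdot cost_{OPT}$. This completes our proof and demonstrates that the minimum among them is at most $\alpha \cdot cost_{OPT}$, since any weighted average of a set of values is greater than or equal to their minimum.
Denoting $w_{GW}$, $w_{ST}$, and $w_{IT}$ as the weights of solutions GW, ST, and IT respectively, let $cost_{WAG}$ represent their weighted average cost. As we are taking an average, we assume $w_{GW} + w_{ST} + w_{IT} = 1$ to simplify the calculation. We also have $w_{GW}, w_{ST}, w_{IT} \ge 0$. By Lemmas 15, 18, and 22, the bound for the weighted average is then given by
$$\begin{aligned}
cost_{WAG} &= w_{GW} \, cost_{GW} + w_{ST} \, cost_{ST} + w_{IT} \, cost_{IT} \\
&\le \alpha \cdot cost_{OPT} \\
&\quad + \big(w_{GW}(2 - \alpha) + w_{ST}(p - \alpha)\big) r_A \\
&\quad + \big(w_{GW}(2 - \alpha) + w_{ST}(\beta + p - \alpha) + w_{IT}(\beta - \alpha)\big) b_1 \\
&\quad + \big(w_{GW}(2 - 2\alpha) + w_{ST}(\beta + 2p - 2\alpha) + w_{IT} \beta\big) b_2 \\
&\quad + \big(w_{GW}(2 - \alpha\beta) + w_{ST}(2p - \alpha\beta)\big) r_C \\
&\quad + \big(w_{GW}(2 - \alpha\beta) + w_{ST}(2p + \beta - \alpha\beta) + w_{IT}(\beta - \alpha\beta)\big) r_D.
\end{aligned}$$
Since the weights sum to 1, the first term in the expression is $\alpha \cdot cost_{OPT}$.
To ensure $cost_{WAG} \le \alpha \cdot cost_{OPT}$, we aim to make the rest of the expression non-positive. Since $r_A$, $b_1$, $b_2$, $r_C$, and $r_D$ are non-negative values, it suffices to make their coefficients non-positive by assigning suitable values to $\alpha$, $\beta$, and the weights $w_{GW}$, $w_{ST}$, and $w_{IT}$. This leads to finding values that satisfy the following inequalities, with each inequality corresponding to one of the coefficients:
$$\begin{aligned}
w_{GW}(2 - \alpha) + w_{ST}(p - \alpha) &\le 0 && (r_A) \\
w_{GW}(2 - \alpha) + w_{ST}(\beta + p - \alpha) + w_{IT}(\beta - \alpha) &\le 0 && (b_1) \\
w_{GW}(2 - 2\alpha) + w_{ST}(\beta + 2p - 2\alpha) + w_{IT} \beta &\le 0 && (b_2) \\
w_{GW}(2 - \alpha\beta) + w_{ST}(2p - \alpha\beta) &\le 0 && (r_C) \\
w_{GW}(2 - \alpha\beta) + w_{ST}(2p + \beta - \alpha\beta) + w_{IT}(\beta - \alpha\beta) &\le 0 && (r_D)
\end{aligned}$$
We can also use a weighted average to ensure that our solution in the induction base has cost at most $\alpha \cdot cost_{OPT}$. In this case, the IT solution cannot be employed, as it represents the final step of recursion. So, we must have $w_{IT} = 0$. Additionally, it is essential to note that in this step, $\pi(K) = \pi(B \cup D) = 0$, resulting in $b_1 = b_2 = r_D = 0$. Thus, only the inequalities for the coefficients of $r_A$ and $r_C$ remain relevant, which already do not contain $w_{IT}$:
$$w_{GW}(2 - \alpha) + w_{ST}(p - \alpha) \le 0, \qquad w_{GW}(2 - \alpha\beta) + w_{ST}(2p - \alpha\beta) \le 0.$$
We can see that if a solution for the system of constraints used for the induction step is found, setting $w_{IT} = 0$ and scaling $w_{GW}$ and $w_{ST}$ by a factor of $\frac{1}{1 - w_{IT}}$ gives us a solution for these two new constraints with $w_{GW} + w_{ST} = 1$ and $w_{IT} = 0$. So, whatever values of $\alpha$ and $\beta$ we find by solving the initial system of inequalities give us a valid solution and an approximation guarantee of $\alpha$.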
Considering the best-known approximation factor for the Steiner tree problem, which is $p = \ln(4) + \epsilon$ [12], we determine that choosing the values $\alpha = 1.7994$, $\beta = 1.252$, $w_{GW} = 0.385$, $w_{ST} = 0.187$, and $w_{IT} = 0.428$ satisfies all the inequalities for a small enough value of $\epsilon$. This provides a valid proof for both the induction base and the induction step, leading to the conclusion of the following theorem.
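The stated values can be checked numerically with a short script. This is a sketch under an assumption: the per-solution coefficients of $(r_A, b_1, b_2, r_C, r_D)$ used below are the ones implied by the bounds on $cost_{GW}$, $cost_{ST}$, and $cost_{IT}$ as reconstructed from the surrounding lemmas, and $\epsilon$ is taken as 0. The function names are illustrative.

```python
import math


def coefficients(alpha, beta, p):
    """Coefficients of (r_A, b_1, b_2, r_C, r_D) in cost_X - alpha * cost_OPT.

    Assumed forms, reconstructed from the bounds on the GW, ST and IT solutions.
    """
    ab = alpha * beta
    return {
        "GW": (2 - alpha, 2 - alpha, 2 - 2 * alpha, 2 - ab, 2 - ab),
        "ST": (p - alpha, beta + p - alpha, beta + 2 * p - 2 * alpha,
               2 * p - ab, 2 * p + beta - ab),
        "IT": (0.0, beta - alpha, beta, 0.0, beta - ab),
    }


def weighted_coefficients(alpha, beta, p, weights):
    """Weighted average of the three coefficient rows, one value per variable."""
    rows = coefficients(alpha, beta, p)
    return tuple(sum(weights[name] * rows[name][i] for name in rows)
                 for i in range(5))


weights = {"GW": 0.385, "ST": 0.187, "IT": 0.428}
slack = weighted_coefficients(1.7994, 1.252, math.log(4), weights)
for name, value in zip(("r_A", "b_1", "b_2", "r_C", "r_D"), slack):
    print(f"{name}: {value:+.6f}")  # every value should be <= 0
```

All five weighted coefficients come out non-positive. The one for $r_A$ is the tightest, at roughly $-2 \cdot 10^{-5}$, which suggests that $\epsilon$ has to stay below about $10^{-4}$ for these particular weights to remain valid.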
Theorem 23. The minimum cost among GW, ST, and IT is a 1.7994-approximate solution for the Prize-Collecting Steiner Tree instance $I$. Therefore, IPCST is a 1.7994-approximation for PCST.
Finally, we note that our algorithm runs in polynomial time.
Theorem 24. The procedure IPCST($I$) runs in polynomial time.
Proof. The procedure IPCST($I$) calls PCSTGW($I_\beta$), which runs in polynomial time, and a polynomial-time algorithm SteinerTree for the Steiner tree problem. Then it recursively calls itself on a new instance that has more vertices with a penalty of 0. The construction of this instance involves a simple loop over the vertices and is done in polynomial time. Since the number of vertices is $|V|$, and each time the number of vertices with non-zero penalty decreases by at least one, the recursion depth is at most $|V|$. So, in IPCST we have a polynomial number of recursive steps, and each step takes a polynomial amount of time. Therefore, the total running time of the algorithm is polynomial in the size of the input.

Necessities in Our Algorithm
In this section, we demonstrate the necessity of utilizing all three solutions in IPCST and selecting the minimum among them. Table 2 is completed based on the constraints $1 < p < \alpha < 1.8$, derived from the NP-hardness of solving Steiner tree exactly, the fact that Steiner tree is a special case of PCST, and the goal of achieving an approximation factor better than 1.8. Additionally, we select $\beta$ such that $2/\alpha \le \beta \le \alpha$, because if $2/\alpha > \beta$, both coefficients of $r_C$ are positive. Also, if $\beta > \alpha$, all coefficients of $b_1$ become positive. Table 2 shows the sign of the coefficient of each variable in every algorithm. We refer to this table to explain why all three algorithms are essential. We need to find a combination of these algorithms such that the weighted average of these coefficients is at most zero for every variable. Since each row associated with an algorithm has at least one positive value, achieving this balance is not possible if we use only one of the algorithms. Moreover, omitting IT results in a positive coefficient for $b_1$, making the iterative approach necessary. Similarly, using GW and IT together leads to a positive coefficient for $r_A$, emphasizing the need for ST to offset it. Lastly, if we drop GW, the coefficient of $b_2$ constrains our approximation factor, as its coefficient in the IT algorithm is positive, and in the ST algorithm it is $2p + \beta - 2\alpha$. Given that the best-known approximation factor for the Steiner tree is $\ln(4) + \epsilon$ [12], replacing $p$ with $\ln(4) + \epsilon$ results in a positive value for the coefficient of $b_2$ in the ST algorithm. Therefore, the GW algorithm is necessary to decrease the coefficient of $b_2$.

Figure 4: A star graph with $n + 1$ vertices. We construct a PCST instance on this graph with vertex $r$ as the root, the central vertex $c$ having penalty 0, and all other vertices having penalty $2(1 + \frac{1}{n-1})$.
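The necessity argument can be illustrated numerically at the parameter values $\alpha = 1.7994$ and $\beta = 1.252$. The sketch below is not a proof over all $\beta$; it only checks, for these fixed values and with $\epsilon$ taken as 0, that no two of the three solutions admit a convex combination in which every coefficient is non-positive. The coefficient rows are assumed forms reconstructed from the bounds on the three solutions, and `pair_feasible` is an illustrative helper.

```python
import math
from itertools import combinations

ALPHA, BETA, P = 1.7994, 1.252, math.log(4)  # epsilon taken as 0

# Coefficients of (r_A, b_1, b_2, r_C, r_D) in cost_X - alpha * cost_OPT
# (assumed forms, reconstructed from the bounds on GW, ST and IT).
ROWS = {
    "GW": (2 - ALPHA, 2 - ALPHA, 2 - 2 * ALPHA,
           2 - ALPHA * BETA, 2 - ALPHA * BETA),
    "ST": (P - ALPHA, BETA + P - ALPHA, BETA + 2 * P - 2 * ALPHA,
           2 * P - ALPHA * BETA, 2 * P + BETA - ALPHA * BETA),
    "IT": (0.0, BETA - ALPHA, BETA, 0.0, BETA - ALPHA * BETA),
}


def pair_feasible(row_x, row_y, tol=1e-12):
    """Is there w in [0, 1] with w*x_i + (1 - w)*y_i <= 0 for every i?"""
    lo, hi = 0.0, 1.0
    for x, y in zip(row_x, row_y):
        slope = x - y  # the constraint reads y + w * slope <= 0
        if abs(slope) <= tol:
            if y > tol:
                return False
        elif slope > 0:
            hi = min(hi, -y / slope)
        else:
            lo = max(lo, -y / slope)
    return lo <= hi + tol


infeasible_pairs = [pair for pair in combinations(ROWS, 2)
                    if not pair_feasible(ROWS[pair[0]], ROWS[pair[1]])]
print(infeasible_pairs)
```

Because $w = 0$ and $w = 1$ are included, the pairwise check also covers using a single solution on its own.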

Bad example for β > 2
Let $\beta = 2(1 + \epsilon)$ for some $\epsilon > 0$. We consider a star graph $G$ with $n + 1$ vertices as shown in Figure 4, where one vertex is a central vertex and all other vertices are connected to this vertex with edges of length 1, for some value of $n$ such that $\frac{1}{n-1} < \epsilon$. We construct an instance of PCST on this graph where one of the non-central vertices is the root, the central vertex has penalty 0, and every other vertex has penalty $2(1 + \frac{1}{n-1})$. When we run the GW algorithm on this instance, the center vertex dies instantly as it has 0 coloring potential. Additionally, as $\frac{1}{n-1} < \epsilon$, each non-root leaf has coloring potential $\frac{2(1 + \frac{1}{n-1})}{2(1 + \epsilon)} < 1$ and therefore dies before reaching the central vertex. So, the GW solution pays penalty $(n - 1) \cdot 2(1 + \frac{1}{n-1}) = 2n$. This is twice the cost of the optimal solution, which can be obtained by taking all $n$ edges of length 1. The other solutions we consider also have the same cost as the GW solution, as they aim to connect only the root and pay the penalties for all the dead vertices. So, using any $\beta > 2$ leads to an approximation factor of at least 2.
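The arithmetic of this example can be reproduced exactly for concrete values. The sketch below uses rational arithmetic; the function name and the choice $n = 6$, $\epsilon = 1/4$ are illustrative assumptions. It does not simulate the GW algorithm: following the argument above, it takes for granted that a leaf whose coloring potential is below 1 dies before reaching the center, so that the GW solution pays every leaf penalty.

```python
from fractions import Fraction


def star_example(n, eps):
    """Costs for the star instance with n unit edges and beta = 2 * (1 + eps)."""
    beta = 2 * (1 + eps)
    leaf_penalty = 2 * (1 + Fraction(1, n - 1))
    # Coloring potential of a non-root leaf once penalties are divided by beta.
    potential = leaf_penalty / beta
    # GW keeps only the root (assumed, as argued in the text) and pays for
    # all n - 1 non-root leaves; the center has penalty 0.
    gw_cost = (n - 1) * leaf_penalty
    # Buying all n unit-length edges connects every vertex and pays nothing.
    opt_cost = n
    return potential, gw_cost, opt_cost


potential, gw_cost, opt_cost = star_example(6, Fraction(1, 4))
print(potential, gw_cost, opt_cost)  # prints: 24/25 12 6
```

Here $\frac{1}{n-1} = \frac{1}{5} < \frac{1}{4} = \epsilon$, the potential $24/25$ is below the edge length 1, and the ratio of the two costs is exactly 2.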

Figure 1: Illustration of dead sets in the final tree of the GW algorithm. The dead sets colored in blue cut multiple edges of $F$, and removing them would disconnect other vertices, so they are not removed. On the other hand, the dead sets colored in red can be safely removed without affecting other vertices.

Figure 2: An illustration of how the components in $C$ and $C'$ can be. The components in $C'$ are shown in red circles, and the components in $C$ are shown in blue ones. Each red component that does not include root is also a blue component.

Definition 8 (Connected and unconnected dead vertices). For an instance $I$, based on Definition 7, the sets $B$ and $D$ represent dead vertices in the output of PCSTGW($I_\beta$). We further divide set $B$ into $B'$ and $B''$, and set $D$ into $D'$ and $D''$, based on whether they are connected to the root at the end of the PCSTGW($I_\beta$) procedure. Let $B'$ and $D'$ be the subsets of $B$ and $D$, respectively, representing the vertices connected to the root. Similarly, $B''$ and $D''$ are the subsets of $B$ and $D$, respectively, indicating the vertices not connected to the root at the end of the procedure.

Figure 3: Illustration of a single-edge set vs. a multi-edge set in $T_{OPT}$. The red set is a single-edge set, but the blue one is a multi-edge set.

Table 2: Sign of coefficients of each solution (columns: $r_A$, $b_1$, $b_2$, $r_C$, $r_D$; one row per solution, starting with GW).

Table 1: This table illustrates the categories of vertices.
                                  Live vertices    Dead vertices
Connected to root in OPT                A                B
Not connected to root in OPT            C                D