SFTC: Machine Unlearning via Selective Fine-tuning and Targeted Confusion

As the importance of data privacy escalates in the modern digital era, machine learning service operators face challenges posed by stringent privacy regulations, such as the GDPR. To cope with these challenges, the concept of machine unlearning emerges as a key solution that meets data removal requirements while maintaining trust and transparency, thereby reducing the risk of data breaches. In this work, we present a Selective Fine-tuning and Targeted Confusion (SFTC) algorithm for machine unlearning. SFTC simultaneously performs fine-tuning on the remaining data and selectively confuses the original model by following the distribution of a biased random generator, effectively leading the forget samples' output space to be indistinguishable from that of the original test samples. Our algorithm is evaluated on three diverse datasets for image classification and its unlearning performance is compared against six state-of-the-art unlearning algorithms. The results show that SFTC preserves a model's original accuracy while effectively inducing forgetting on the requested data samples.


INTRODUCTION
Machine learning models, notably those like GPT and DALL-E, have revolutionized everyday tasks in various sectors [17]. Typically, these models are developed by collecting user data in a datacenter, followed by processing through machine learning pipelines [12]. However, privacy regulations like the GDPR [23] require that service providers delete users' data upon request. In addition, from a security perspective, removing the influence of samples from machine learning models reduces model errors and the risk of adversarial attacks [24] such as membership inference [19] and model inversion [5], which compromise service confidentiality.
To meet users' requests and comply with regulations, service providers should erase not only the associated users' data but also modify their deployed models to reflect this deletion [16, 24, 25]. The process of making trained learning models forget in a time-efficient manner is referred to as machine unlearning [2].
The most straightforward approach to unlearning is to retrain the model from scratch without the forget set, a process called exact unlearning [22]. However, this method is impractical due to its significant computational cost and time consumption. Furthermore, as data deletion requests can arrive arbitrarily, retraining for each request is not feasible. Consequently, approximate unlearning [4, 7, 14] emerged as a key alternative that modifies the original model efficiently while maintaining its predictive accuracy.
In this work, we introduce an unlearning algorithm that adjusts the original model trained on the complete dataset. Our algorithm, Selective Fine-tuning and Targeted Confusion (SFTC), utilizes a teacher-student approach and ensures that the process does not exceed 15% of the original training duration. Specifically, it fine-tunes the original model on the retain set (remaining data), while confusing it on the forget set using a biased random output generator.
Our main contributions are summarized as follows:
• We propose SFTC, a novel machine unlearning algorithm that refines the original model on the retain set, while distancing its predictions on the forget set from those of the original model, using a biased random generator.
• We introduce a new forget set benchmark on the FER-2013 dataset, which includes samples from two classes and incorporates in-context information. Specifically, the forget set consists of images from minors, providing a distinct context for evaluating the unlearning process.
• We evaluate SFTC on a diverse set of datasets and compare it against six unlearning algorithms. Our results suggest that SFTC effectively induces forgetting on the requested data.
The remainder of this paper is structured as follows. Section 2 outlines the concept of machine unlearning. Section 3 summarizes the related work. Section 4 introduces the SFTC algorithm. Section 5 presents the experimental results and compares SFTC with state-of-the-art unlearning algorithms. Finally, Section 6 concludes our work and discusses future directions.

PROBLEM DEFINITION
In this section, we formally define the problem of machine unlearning given a trained model, the original dataset and the specified data subset that needs to be forgotten.

Machine Unlearning. Let the requested set of samples that needs to be forgotten be represented as D_f ⊂ D, which corresponds to a subset of the original dataset D. The retain dataset (remaining data), D_r, is obtained by excluding D_f from D, i.e., D_r = D \ D_f. We assume that D_f comprises a random subset drawn from multiple classes. Given the original model f with weights w trained on D, along with the retain set D_r and the forget set D_f, the goal is to apply an unlearning algorithm U(·). The unlearning process modifies the original model's weights w into new weights w_u, resulting in a new model f_u. The goal for f_u is to unlearn D_f, while maintaining high utility (e.g., high accuracy), similar to the original model.
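The partition of D into D_r and D_f can be sketched as follows (a minimal illustration; the function name and index-based representation are our own, not part of the paper's implementation):

```python
# Minimal sketch: partition the original dataset D into the forget set
# D_f (samples requested for deletion) and the retain set D_r = D \ D_f.
def split_forget_retain(dataset_indices, forget_indices):
    forget = set(forget_indices)
    d_f = [i for i in dataset_indices if i in forget]
    d_r = [i for i in dataset_indices if i not in forget]
    return d_r, d_f

# Example: a 10-sample dataset where samples 2 and 7 must be forgotten.
d_r, d_f = split_forget_retain(list(range(10)), [2, 7])
```

An unlearning algorithm U(·) then consumes d_r, d_f and the original weights w to produce w_u.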
Fig. 1 illustrates the process of machine unlearning from a service provider's point of view. Initially, users (data owners) share their data with the provider. After collection, the dataset is employed to train a machine learning model, which can generate useful predictions for customers (model consumers), who can be either data owners or other external users. Following the service deployment, a subset of data owners request the deletion of their associated information. In response, the service provider should not only remove the data from their local databases but also make the previously trained model unlearn these data. This step is crucial to comply with the "right to be forgotten" directive of regulations such as the GDPR, which also fulfills users' desiderata.

RELATED WORK
The concept of machine unlearning, introduced by Cao and Yang [2], focuses on efficient, exact data removal using summation methods based on statistical query learning. While efficient, this unlearning algorithm is limited to simple learners such as Naive Bayes and cannot scale to more complex models like neural networks. Bourtoule et al. developed SISA [1], which partitions the original training dataset into disjoint shards, with each shard having its own isolated sub-model. When an unlearning request arrives, only the sub-models trained on that sample are retrained. However, SISA necessitates initial adjustments in the training phase and may not be applicable in many scenarios due to its partitioning strategy. Ginart et al. [6] proposed a method for approximate unlearning, focused on k-means clustering, through quantization and data partitioning. However, it is effective only in models with a limited number of parameters, limiting its usability for neural networks.
One of the earliest works on unlearning in deep neural networks, NegGrad [8], adjusts the original model parameters by applying gradient ascent on the forget set. Choi [3] enhanced the NegGrad approach by including an additional fine-tuning loss term and introduced two real-world image datasets for evaluating machine unlearning algorithms. Graves et al. [10] proposed amnesiac unlearning, where the forget samples are assigned a random label and fine-tuning is performed on the concatenation of the retain and forget sets after re-labeling. Goel et al. [7] proposed two methods for unlearning, CF-k and EU-k forgetting. The former freezes the first k layers and fine-tunes the remaining layers on the retain set, while the latter randomly re-initializes the remaining layers before fine-tuning.
Our proposed algorithm builds upon Bad-Teaching [4] and is closely related to the SCRUB [14] algorithm. Bad-Teaching uses a two-teacher approach, where the retain and forget samples are relabeled with 0 and 1, respectively. Then, the student model is trained to mimic the behavior of the original model on the retain set and of a completely random model on the forget set. SCRUB combines fine-tuning on the retain set with minimizing the divergence between the student and original models on the retain set, while maximizing it on the forget set. Both Bad-Teaching and SCRUB try to confuse the model on the forget set, either with random predictions or by following a different direction. In this work, we argue that these two methods might affect a larger number of samples than intended, particularly those in the retain or the original test set. To address this issue, we propose a method for targeted confusion on the forget set, using a controlled biased output generator.
Evaluating Machine Unlearning. One of the most controversial aspects of machine unlearning is how to evaluate an unlearned model [7, 14, 21]. An ideal unlearning algorithm produces a model with high-quality predictions, similar to the original model, while ensuring that data are effectively forgotten. While it is straightforward to compare the unlearned model's output with the original model's, proving effective unlearning is more complex. Many studies assess this by comparing the unlearned model to one retrained from scratch on the available data [4, 14]. However, this method is often impractical and fails to address the stochastic variability in machine learning, implying that indistinguishability between an unlearned and a retrain-from-scratch model is not a reliable unlearning indicator [7, 21]. In this work, we evaluate our unlearning algorithm with both approaches to ensure a comprehensive assessment.

SELECTIVE FINE-TUNING AND TARGETED CONFUSION (SFTC)
In this section, we introduce SFTC, a refined unlearning algorithm based on [4]. SFTC fine-tunes the original model on the retain set D_r, follows the output distribution of the original model f (trained on the entire dataset D) for D_r, and selectively diverges its predictions from f on the forget set D_f by following a biased random distribution generator g_c.
In SFTC, the original model f, with weights w, serves as the teacher model for D_r, and a random generator model g_c acts as the teacher for D_f. Both models process an input sample x and produce logits, which are then transformed into a probability distribution via softmax activation. Our objective is to train a student model f_u, initialized with w, and yield weights w_u such that f_u selectively forgets D_f while retaining knowledge from D_r.
We begin our approach by augmenting the retain and forget sets with a pseudo-label assignment z ∈ {0, 1}. Specifically, samples from D_r are assigned z = 0 and those from D_f are assigned z = 1. This results in the pseudo-label-augmented sets D′_r and D′_f. These sets are then combined into a unified unlearning dataset D_u = D′_r ∪ D′_f. The idea of assigning pseudo-labels was influenced by the Bad-Teaching unlearning approach [4], where the authors replaced the actual labels with pseudo-labels. The unlearning dataset D_u is subsequently shuffled and partitioned into batches for training.
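The pseudo-label construction can be sketched in a few lines (a minimal NumPy illustration; the function name and array layout are our own):

```python
import numpy as np

def build_unlearning_dataset(x_retain, y_retain, x_forget, y_forget, seed=0):
    """Assign pseudo-label z=0 to retain samples and z=1 to forget samples,
    then merge and shuffle into the unified unlearning dataset D_u."""
    x = np.concatenate([x_retain, x_forget])
    y = np.concatenate([y_retain, y_forget])
    z = np.concatenate([np.zeros(len(x_retain), dtype=int),
                        np.ones(len(x_forget), dtype=int)])
    perm = np.random.default_rng(seed).permutation(len(x))
    return x[perm], y[perm], z[perm]
```

The pseudo-label z later decides, per sample, which loss term and which teacher applies.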
Selective Fine-Tuning. To fulfill the predictive-utility preservation requirement, the SFTC algorithm first performs a fine-tuning operation on D_r. During training, the algorithm selects the samples with pseudo-label z = 0 and minimizes the cross-entropy loss, defined as:

L_CE = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{K} y_ij log(ŷ_ij),    (1)

where N is the number of samples, K is the number of classes, y_ij is 1 if the ground truth class of the i-th sample is j, and ŷ_ij is the predicted probability of the i-th sample belonging to class j.
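The selective part of this loss only touches batch entries with z = 0; a NumPy sketch (function name and the small numerical stabilizer are our own):

```python
import numpy as np

def selective_ce_loss(probs, labels, z, num_classes):
    """Cross-entropy averaged over retain samples (pseudo-label z == 0)
    only; forget samples (z == 1) are excluded from this term.
    probs: (N, K) softmax outputs, labels: (N,) ground-truth class ids."""
    mask = (z == 0)
    if not mask.any():
        return 0.0
    one_hot = np.eye(num_classes)[labels[mask]]
    return float(-(one_hot * np.log(probs[mask] + 1e-12)).sum(axis=1).mean())
```

In practice the same masking would be applied to a framework loss (e.g., indexing the batch before calling the cross-entropy function).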
Targeted Confusion. Besides fine-tuning, SFTC uses the original model f to guide the student model f_u towards a similar output distribution on D_r, and the random generator model g_c to guide it towards a differing output distribution on D_f. Similar to the Bad-Teaching approach [4], we optimize the Kullback-Leibler (KL) divergence:

L_KL(x, z) = (1 - z) · KL(σ(f(x)) ∥ σ(f_u(x))) + z · KL(σ(g_c(x)) ∥ σ(f_u(x))),    (2)

where z is the assigned pseudo-label, x is an input sample, and σ(·) denotes the output of a model for sample x after applying softmax. Note that the model outputs can be scaled by a temperature parameter τ before the softmax. By default, τ = 1.
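One way to implement this two-teacher KL objective is sketched below (a NumPy illustration; the helper names and the exact placement of the temperature scaling are assumptions):

```python
import numpy as np

def softmax(logits, tau=1.0):
    """Temperature-scaled softmax along the last axis."""
    e = np.exp((logits - logits.max(axis=-1, keepdims=True)) / tau)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    """Row-wise KL(p || q) for distributions along the last axis."""
    return (p * np.log((p + 1e-12) / (q + 1e-12))).sum(axis=-1)

def confusion_kl_loss(student_logits, teacher_logits, generator_probs, z, tau=1.0):
    """Per-sample KL target: retain samples (z=0) follow the original
    teacher, forget samples (z=1) follow the biased random generator."""
    p_student = softmax(student_logits, tau)
    p_teacher = softmax(teacher_logits, tau)
    target = np.where(z[:, None] == 0, p_teacher, generator_probs)
    return float(kl(target, p_student).mean())
```

When the student matches its teacher exactly, the corresponding KL term vanishes, which is what drives f_u towards f on D_r and towards g_c on D_f.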
Optimizing the loss in Eq. 2 allows the student model to follow both teachers with respect to their output distributions on D_r and D_f, respectively. However, if we set a completely random model g_c as in [4] and follow this random generator on the forget set, f_u may become confused on a larger fraction of samples. For instance, suppose that a specific sample needs to be forgotten, and its features are very similar to those of a sample belonging to the retain set. If we enforce a random prediction on the forget sample, we will influence the model into making errors on the similar retain samples. Consequently, the model becomes confused on a larger fraction of samples than intended, which further leads to inconsistencies.
To address this issue and effectively confuse the student model, we propose a targeted approach using a biased random output generator. The generator tailors predictions for the forget set, which can vary from being completely random to being biased towards the correct class. The degree of bias is governed by a scalar c, allowing control over the confusion during unlearning.
More precisely, the generator creates a random output distribution for each input sample, where the distribution's size is determined by the number of classes in the original dataset. To generate the output, we sample from the normal distribution, so initially the outputs are completely random. We then employ the scalar c to infuse a specific degree of confusion, adjusting the initial randomness in a targeted manner. When c = 1, the outputs remain entirely random, similar to the Bad-Teaching approach [4]. In contrast, with c = 0, the output is carefully adjusted to align with the correct label of each sample. This adjustment involves adding a random number to the index of the correct class in the output distribution. By default, SFTC uses c = 0; intuitively, the model retains the correct labels but its confidence on the forget set samples is reduced, leading to the desired targeted confusion.
Our method is flexible, allowing any level of confusion between 0 and 1, where higher values of c result in greater confusion. This concept is similar to [10], where the target labels in the forget set are assigned randomly. To achieve this, we select the batch indices corresponding to forget samples, i.e., the samples with pseudo-label z = 1. Then, we obtain the number of samples whose associated label will change by multiplying the number of forget samples in the batch with the scalar c, and select that many samples uniformly at random. The selected samples are assigned a new label from [0, K − 1] and the generator is biased towards these random labels.
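The generator and the confusion fraction c can be sketched as follows (a minimal NumPy interpretation of the steps above; the magnitude of the `bias` boost added to the target logit is an assumption, not a value from the paper):

```python
import numpy as np

def biased_random_outputs(labels, num_classes, c=0.0, bias=10.0, seed=0):
    """Biased random output generator g_c (illustrative sketch).
    Starts from fully random normal logits; a fraction c of samples is
    biased towards a uniformly random label, and the remaining 1 - c
    towards the correct class (c=0: all correct, c=1: fully confused).
    `bias` is the assumed boost added to the target class logit."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    logits = rng.standard_normal((n, num_classes))   # fully random start
    targets = np.array(labels).copy()
    n_rand = int(round(c * n))                       # samples to redirect
    idx = rng.choice(n, size=n_rand, replace=False)
    targets[idx] = rng.integers(0, num_classes, size=n_rand)
    logits[np.arange(n), targets] += bias            # boost target class
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)          # softmax per row
```

With c = 0 the argmax of every generated distribution matches the correct label, but with reduced (randomized) confidence, which is the targeted-confusion behavior described above.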
Putting it all together, SFTC optimizes both losses (Eq. 1 and 2) to effectively induce forgetting on D_f, governed by the confusion fraction c, while retaining high accuracy on D_r:

L_SFTC = L_CE + L_KL.    (3)

EXPERIMENTS
In this section, we outline the experimental setup and assess the performance of different unlearning algorithms.¹

Datasets
We evaluate SFTC using three image datasets. Specifically, we consider the CIFAR-10 dataset [13], which consists of ten balanced classes with 5,000 images each. The forget set represents 10% of each class (500 images each) and is provided by Google.² The second dataset is MUFAC [3], comprising facial images for age group prediction. The dataset is imbalanced and the forget set mirrors the original imbalance. Finally, we present a new forget set benchmark for evaluating unlearning on the FER-2013 dataset [9], which consists of facial expressions across seven imbalanced classes. For the forget set split, we consider a scenario aligned with a (conceptual) new legislative requirement, where images of minors tagged with fear or sadness should be removed from learning models. In this scenario, the forget set is limited to a subset of only two classes.

Experimental Setup
To assess unlearning performance, we first train the ResNet-18 [11] and EfficientNet-B0 [20] models on the original datasets. The former architecture has been thoroughly assessed in the machine unlearning literature [4, 7, 8, 14], while the latter is a more recent, lightweight architecture. For model training, we use the Adam optimizer with an initial learning rate of 10⁻³ and a cosine annealing scheduler for 30 epochs with a batch size of 64. All experiments were conducted five times with different initialization seeds on an NVIDIA RTX 3060 GPU-equipped workstation running Ubuntu 20.04 and PyTorch 2.0.1.

Unlearning Algorithms
We compare our proposed SFTC unlearning algorithm against the following baselines and state-of-the-art approaches. Fine-Tuning (FT) is the simplest baseline, where we begin from the original model and fine-tune it on D_r for a limited number of epochs. NegGrad+ (NG+) [3, 8] combines fine-tuning on D_r with maximizing the error on D_f. In CF-k and EU-k Forgetting [7], the first k layers are frozen and only the last layers are fine-tuned, where CF-k begins from the original weights and EU-k randomly re-initializes the remaining layers. Bad-Teaching (BD) [4] optimizes the KL loss between the student and the original model on D_r and the KL loss between the student and a randomly initialized model on D_f, similar to Eq. 2. SCRUB [14] performs fine-tuning on D_r (Eq. 1), minimizing the KL loss between the student and the original model on D_r while maximizing it on D_f. Retrain (RT) represents the ideal case, where the model is trained from scratch on D_r.

¹The code is available at https://github.com/vperifan/SFTC-Unlearn.
²https://github.com/unlearning-challenge/starting-kit

Evaluation Metrics
Unlearning Accuracy. To evaluate unlearning accuracy, we assess the predictive performance of the unlearned model f_u on both the forget set D_f and the test set D_t against a retrain-from-scratch oracle f_rt. The accuracy error between f_u and f_rt is calculated using the Symmetric Absolute Percentage Error (SAPE) [15]:

SAPE(a, b) = |a − b| / (|a| + |b|).    (4)

Specifically, we compute the following metrics:

SAPE(Acc(f_u, D_f), Acc(f_rt, D_f)),    (5)
SAPE(Acc(f_u, D_t), Acc(f_rt, D_t)).    (6)

However, since obtaining f_rt in real-world scenarios is often impractical, we also compare the accuracy of f_u on D_t relative to the original model f:

SAPE(Acc(f_u, D_t), Acc(f, D_t)).    (7)

In all of the above accuracy error metrics, lower values indicate more successful unlearning. Specifically, Eq. 5 captures unlearning effectiveness, Eq. 6 unlearning certifiability (similarity in performance between f_u and f_rt) and Eq. 7 evaluates the post-unlearning robustness of f_u compared to f in terms of predictive accuracy.
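SAPE reduces to a one-liner over two accuracy values (a sketch; the guard for the degenerate all-zero case is our own convention):

```python
def sape(a, b):
    """Symmetric Absolute Percentage Error between two accuracy values."""
    if a == 0 and b == 0:
        return 0.0
    return abs(a - b) / (abs(a) + abs(b))

# Example: unlearned vs retrain-from-scratch accuracy on the forget set.
err = sape(0.88, 0.90)
```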
Distinguishability from the Original Model. Based on related literature [4, 8, 16, 24], the unlearning algorithm should produce a model f_u that is similar to a retrain-from-scratch oracle f_rt. Nevertheless, as already stated, having access to f_rt is impractical. Hence, we measure how similarly f_u behaves on D_f compared to the original model f, based on the Jensen-Shannon (JS) divergence:

JS(p_{D_f} ∥ q_{D_f}) = ½ KL(p_{D_f} ∥ m) + ½ KL(q_{D_f} ∥ m),    (8)

where m = ½ p_{D_f} + ½ q_{D_f} is the mean distribution and p_{D_f}, q_{D_f} are the output probability distributions of f_u and f over D_f, respectively. We choose JS over KL divergence since JS provides a symmetric and smoothed measure of the difference between two probability distributions. Intuitively, post-unlearning, the model should treat the samples of D_f as unseen, similar to an independent test set. In this context, the outputs of f_u and f should be distinguishable. A higher value of JS indicates a greater deviation from the original, suggesting effective unlearning.
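The JS divergence over per-sample output distributions can be computed as follows (a NumPy sketch; averaging the per-sample divergences is our assumed aggregation):

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Mean Jensen-Shannon divergence between two sets of output
    distributions; p and q have shape (N, K) with rows summing to 1."""
    m = 0.5 * (p + q)
    kl_pm = (p * np.log((p + eps) / (m + eps))).sum(axis=1)
    kl_qm = (q * np.log((q + eps) / (m + eps))).sum(axis=1)
    return float((0.5 * kl_pm + 0.5 * kl_qm).mean())
```

With the natural logarithm, the result is symmetric in p and q and bounded above by ln 2, which makes scores comparable across models.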
Verifiability and Privacy. To assess the verifiability/privacy aspect of machine unlearning, we perform a MIA [19] against f_u. We construct a balanced dataset by sampling an equal number of instances from both D_r and D_t (with equivalent distributions) to train a MIA predictor. Specifically, we utilize the CatBoost classifier [18] as the attacker model, using as input features the logit vectors produced by f_u. CatBoost has been empirically shown to significantly outperform other models like Logistic Regression and Support Vector Machines with respect to MIA. The attacker is then applied on D_f to determine how many samples are correctly identified as non-training members:

MIA = |{x ∈ D_f : x predicted as non-member}| / |D_f|.    (9)

Higher values of this metric indicate higher privacy preservation for the forget samples and unlearning verification [24], i.e., the model's behavior on D_f is similar to that on unseen samples.
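The MIA evaluation pipeline can be sketched as follows. Note this is an illustration only: we use scikit-learn's GradientBoostingClassifier as a stand-in for the CatBoost attacker, and the function name and balancing strategy are our own assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def mia_efficacy(retain_logits, test_logits, forget_logits, seed=0):
    """Train a membership attacker on balanced retain (members, label 1)
    vs test (non-members, label 0) logit vectors, then report the
    fraction of forget-set samples predicted as NON-members.
    GradientBoostingClassifier stands in for the CatBoost attacker."""
    rng = np.random.default_rng(seed)
    n = min(len(retain_logits), len(test_logits))  # balance both classes
    x = np.concatenate([
        retain_logits[rng.choice(len(retain_logits), n, replace=False)],
        test_logits[rng.choice(len(test_logits), n, replace=False)],
    ])
    y = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    attacker = GradientBoostingClassifier(random_state=seed).fit(x, y)
    return float((attacker.predict(forget_logits) == 0).mean())
```

After successful unlearning, the forget-set logits should resemble test-set logits, so the attacker labels most of them non-members and the score approaches 1.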

Results
To provide comprehensive results, we conducted a grid search across learning rates in {5 × 10⁻³, 4 × 10⁻³, . . . , 10⁻⁵} to identify the most effective value for each algorithm, using Eq. 5, 6 and 7 as indicators of unlearning performance. All unlearning algorithms begin from the same original model for every dataset. We maintain the default unlearning hyper-parameters, i.e., the KL temperature is set to one, the confusion fraction for SFTC to zero and the k parameter for both CF-k and EU-k to five. For each algorithm, we establish a range of 1 to 4 epochs for the unlearning process, to ensure that no algorithm exceeds 15% of the time required for complete retraining using the Adam optimizer and a batch size of 64. We keep the unlearned models at the epoch that achieved the best overall results across the five different trials. Finally, we report the average scores obtained from the most effective setting for each algorithm.

Table 1 presents the comparative analysis for each unlearning algorithm across the three considered datasets and two model architectures. The highest performing unlearning algorithm for each metric is denoted in bold and the second best with an underline. The original model's (ORI) metrics are included as a reference.
The evaluation of unlearning algorithms' efficacy requires considering all metrics as a whole, i.e., we expect a low accuracy error, high JS divergence and high MIA efficacy. For instance, if the original model remains unchanged, it exhibits no accuracy loss. However, this would also lead to no distinguishability, as well as poor unlearning certifiability with respect to MIA. SFTC emerges as the most effective unlearning algorithm, achieving the highest quality in 12 individual cases and the second best in 14. BD is the next most successful, performing best in 7 individual cases and second best in 9. The similarity in performance between SFTC and BD is expected, since they optimize the same KL term. However, SFTC's integrated mechanism of targeted confusion and selective fine-tuning further enhances unlearning effectiveness.
On the ResNet-18 architecture, SFTC consistently ranks as the top or second best algorithm across all datasets, indicating robust unlearning, with BD as the second most effective. Other algorithms, like SCRUB and EU-5, show effectiveness in specific cases, such as low accuracy loss (MUFAC, SCRUB) and high distinguishability (FER, EU-5). Nevertheless, SFTC and BD emerge as the top-performing algorithms when all metrics are considered collectively.
For the EfficientNet model, while SFTC and BD maintain their high-quality unlearning performance, the results show variability across datasets. For instance, in FER and MUFAC, the FT baseline achieves high quality, having the least accuracy loss with respect to the retrain-from-scratch oracle and high MIA efficacy in FER. Yet, SFTC and BD remain the most consistently effective algorithms.
Most algorithms demonstrate high utility in terms of MIA, substantially surpassing the original model, which fails to offer any unlearning certifiability. In many cases, they also exceed the utility of the retrain-from-scratch oracle, suggesting that unlearning algorithms can also mitigate issues like overfitting and model biases.
In Fig. 4, we present the convergence of the considered unlearning algorithms using a consistent trial with the same random initialization on the ResNet-18 model. For this experiment, we terminate each unlearning algorithm when the accuracy on D_f and D_t closely approaches the corresponding retrain-from-scratch accuracies.
For the FT baseline, across all datasets an initial decrease in accuracy is evident in the first epoch, followed by subsequent tuning towards D_r. Similar patterns are observed with the CF-5 and EU-5 approaches, where the initial epoch noticeably diverges the model from its original state. The NG+ algorithm demonstrates a consistent trend across all datasets, with a reduction in accuracy on all sets. This suggests that NG+ induces global forgetting, rather than forgetting limited to D_f alone.
SCRUB, on the CIFAR dataset, mirrors the NG+ pattern, reducing accuracy across all sets. In MUFAC, SCRUB starts with a drop on all sets, subsequently raising accuracy on D_r, with test accuracy remaining consistent. On the other hand, accuracy on D_f displays variability across epochs, starting with a drop, rising close to D_r and then dropping near the retrain-from-scratch target accuracy. In FER, SCRUB lowers D_f accuracy across epochs, while the accuracy on D_r presents stability. Meanwhile, accuracy on D_t demonstrates a consistent decline, with an uptick at epoch 4.
For the BD and SFTC algorithms, we observe a similar pattern on CIFAR, with both models lowering the D_f accuracy by approximately 10% and the D_t accuracy by 1%, aligning with the target model's respective accuracies. In MUFAC and FER, BD lowers the predictive performance across all sets, with a higher impact on D_f compared to the retrain-from-scratch model. In contrast, SFTC begins by reducing accuracy in MUFAC below the target, but by epoch 4 it surpasses BD in terms of resemblance to the target accuracy. In FER, SFTC's behavior is akin to BD's, albeit with D_f and D_t accuracies closer to the target. These observations suggest that following a completely random model for the forget set (BD) negatively impacts a broader range of samples, whereas adopting a random model biased towards the correct labels (SFTC) facilitates more precise forgetting in alignment with the target model. The effectiveness of the biased random model is further clarified in the subsequent sensitivity analysis.

Sensitivity Analysis
In this section, we conduct a sensitivity analysis to assess the impact of three hyper-parameters on the training dynamics of SFTC: the learning rate, the KL divergence temperature (τ) and the confusion fraction (c). Fig. 5 presents the results regarding unlearning accuracy on D_f and D_t, as well as the MIA efficacy for D_f (as defined in Eq. 9). We employ the ResNet-18 model, setting the number of epochs to two for CIFAR (Fig. 5a) and FER (Fig. 5c) and three for MUFAC (Fig. 5b).
Learning Rate. We begin by applying different learning rates in the range [8 × 10⁻⁵, 9 × 10⁻⁵, . . . , 5 × 10⁻³], fixing τ and c to 1 and 0, respectively (i.e., no temperature scaling and output biased towards the correct label for D_f). For CIFAR, lower learning rates (8 × 10⁻⁵ to 5 × 10⁻⁴) are insufficient to induce forgetting, since they only marginally reduce the accuracy on D_f. This pattern also appears in MIA terms, where low MIA scores indicate that samples from D_f are predicted as members. Conversely, learning rates between 6 × 10⁻⁴ and 10⁻³ result in a desirable balance, lowering the accuracy on D_f while maintaining high accuracy on D_t. Similarly for MIA, there is an upward trend, indicating higher unlearning effectiveness. Learning rates above 2 × 10⁻³ cause a drop in both D_f and D_t accuracies, indicating global forgetting not tailored towards the forget set.
In MUFAC, similar to CIFAR, there is a decline in the forget set accuracy as the learning rate increases. Nevertheless, there is no clear optimal learning rate range when considering the balance between accuracy and MIA efficacy. Learning rates between 2 × 10⁻⁴ and 8 × 10⁻⁴ achieve high MIA (>95%) but lower accuracy on D_t compared to the retrain-from-scratch oracle. This indicates a trade-off between the (unknown) target accuracy and unlearning accuracy. An optimal setting for MUFAC, considering a real-world scenario where the retrain-from-scratch oracle is unavailable, lies around 9 × 10⁻⁴, achieving balance in D_f and D_t accuracies (0.5465 and 0.6393, respectively) as well as high MIA (0.93). However, this setting results in a high similarity error when compared to the retrain-from-scratch oracle. On the other hand, selecting the f_u model most aligned with the retrain-from-scratch accuracy (learning rate 4 × 10⁻³) comes at the expense of MIA. Thus, a crucial open question is whether a retrain-from-scratch model should be used as a reference across diverse datasets for evaluating unlearning.
For FER, similar to MUFAC, lower learning rates result in higher MIA efficacy. We attribute this behavior to the imbalanced nature of the dataset, in contrast to the balanced CIFAR. The accuracy on D_t remains stable across learning rates, demonstrating SFTC's robustness. This also holds for D_f, where unlearning is induced across the range of learning rates. The closest alignment with the retrain-from-scratch oracle is achieved with higher learning rates (e.g., 5 × 10⁻³), at the cost of reduced MIA efficacy, similar to MUFAC. These findings suggest the need for further investigation into machine unlearning evaluation criteria that do not rely on a retrain-from-scratch oracle.

KL Temperature. To assess the impact of the KL temperature, we conduct experiments using τ ∈ [0.5, 5] with a step of 0.5, keeping the learning rate at 7 × 10⁻⁴ for CIFAR, 4 × 10⁻³ for MUFAC and 5 × 10⁻³ for FER. These values were optimal with respect to Eq. 5, 6 and 7 regarding the model's accuracy, without considering MIA efficacy. In all datasets, the accuracy on D_t does not present high variation and remains stable across different temperature values. In CIFAR, higher τ values lead to increased accuracy on D_f, while in MUFAC and FER a reverse trend is evident. Interestingly, a temperature of 1 consistently results in high MIA efficacy, suggesting that SFTC's default τ = 1 is robust and effective for inducing unlearning, without needing precise temperature adjustments.
Confusion Fraction. Recall that under SFTC, the model follows a biased output generation when c < 1 and a completely random output when c = 1 (similar to BD [4]). Our intuition was that by following a completely random output as in BD, or by maximizing a loss term as in SCRUB, the model can be affected on a larger fraction of samples, not only those in D_f, leading to decreased unlearning performance. Across all datasets, as c increases, the accuracy on D_f decreases, while the corresponding D_t accuracy remains stable with a slight decline. This highlights the potential benefits of using a biased output generator. Another interesting observation is that as MIA increases, the forget set accuracy decreases, indicating a trade-off between accuracy and MIA efficacy. This behavior is expected, since the model loses more information regarding D_f as its accuracy decreases, thereby increasing MIA efficacy. However, the optimal unlearned model lies between these two aspects, suggesting that incorporating such information during training could lead to improved unlearning algorithms.

CONCLUSION
In this work, we presented SFTC, a novel algorithm that refines an original model by fine-tuning it on the retain set while selectively confusing it through a biased random generator on the forget set. Our approach is evaluated on three diverse datasets using two deep neural network architectures for image classification. Our results demonstrate that SFTC effectively induces forgetting and serves as one of the most promising unlearning algorithms compared to similar methods in terms of unlearning effectiveness, certifiability and verification. In addition, we present a realistic forget set for the FER-2013 dataset, tailored to include contextual information. Our findings highlight the variability in the performance of unlearning algorithms across different dataset types (balanced vs. imbalanced) and illustrate a trade-off between maintaining high accuracy and preserving privacy.
In the future, we aim to explore the effectiveness of SFTC on additional datasets, including tabular, language and graph-based data. To demonstrate the generalization and scalability of machine unlearning, it is crucial to encompass diverse tasks, such as regression and recommendation. Another critical aspect is the definition of novel unlearning metrics that do not rely on a retrain-from-scratch oracle, which cannot be obtained in real-world scenarios. Lastly, another dimension is to evaluate machine unlearning under differentially-private models, to provide insights into the dynamics of unlearning algorithms within environments that prioritize high privacy levels.

Fig. 2 presents a sample of the facial images belonging to the FER-2013 forget set. Fig. 3 illustrates the distribution of samples per class across the training, validation and test sets for each dataset, as well as the distribution per class in the forget sets.
Figure 4 panel captions: Convergence on CIFAR-10 (target test/forget accuracies 0.89 and 0.90); Convergence on MUFAC (0.59 and 0.49); Convergence on FER (0.63 and 0.21).

Figure 4: Unlearning Algorithms Convergence. The blue line corresponds to the retain set, the green line to the test set and the red line to the forget set.
Figure 5 panel captions: Sensitivity analysis on CIFAR (a), MUFAC (b) and FER (c).